
I am working on a group project and I want to remove a file from all history: the content, the file name, everything! I don't want any trace of it left in the Git repo. I have been trying to do this using bfg, but I can still find the file on the GitHub page using its "Browse the repository at this point in the history" feature.

The Git repo is the directory .../electricity_profiles, and the file I want to remove was in electricity_profiles/data (I've tried bfg --delete-files .~lock.smart_meter_data_overlap.csv#). I have since removed it from the current commit, but it is still present a few commits back, at commit 5c50c67d1be4e869bc75fb7d3916b9fc814b8106.

How can I remove all evidence that this file ever existed, even on GitHub, so that other people won't see it when they pull the repo?

I have looked at:

but haven't figured it out yet.

Work done so far (seems to work):

git clone --mirror https://github.com/oliversheridanmethven/electricity_profiles.git
bfg --delete-files .~lock.smart_meter_data_overlap.csv# electricity_profiles.git

Console output:

Using repo : /home/user/Documents/InFoMM/case_studies/trial/electricity_profiles.git

Found 20 objects to protect
Found 2 commit-pointing refs : HEAD, refs/heads/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit 1b1eef47 (protected by 'HEAD')

Cleaning
--------

Found 22 commits
Cleaning commits:       100% (22/22)
Cleaning commits completed in 141 ms.

Updating 1 Ref
--------------

    Ref                 Before     After   
    ---------------------------------------
    refs/heads/master | 1b1eef47 | 9701a5b7

Updating references:    100% (1/1)
...Ref update completed in 26 ms.

Commit Tree-Dirt History
------------------------

    Earliest        Latest
    |                    |
    ......D..D..m.m.mmmmmm

    D = dirty commits (file tree fixed)
    m = modified commits (commit message or parents changed)
    . = clean commits (no changes to file tree)

                            Before     After   
    -------------------------------------------
    First modified commit | 5c50c67d | ff47bcdf
    Last dirty commit     | 9671f6ad | f6d36763

Deleted files
-------------

    Filename                               Git id         
    ------------------------------------------------------
    .~lock.smart_meter_data_overlap.csv# | 7cf2b24f (92 B)


In total, 14 object ids were changed. Full details are logged here:

    /home/user/Documents/InFoMM/case_studies/trial/electricity_profiles.git.bfg-report/2017-01-18/11-48-37

BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

Finishing off the process:

cd electricity_profiles.git
git push --mirror https://github.com/oliversheridanmethven/electricity_profiles.git

Looking at the GitHub repo, it seems to have worked.
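To double-check locally rather than through the web page, one option is to list every object reachable from any ref and search for the file name. The sketch below is only an illustration: `file_in_history` is a hypothetical helper, and the throwaway repository it builds stands in for the real one. It shows that merely deleting a file in a later commit leaves it in history, which is the state BFG is meant to fix.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: exits 0 if any object reachable from any ref
# has a path containing the given file name, non-zero otherwise.
file_in_history() {
    local repo="$1" name="$2" objects
    # rev-list --objects prints "<id> <path>" for every reachable tree and blob.
    objects="$(git -C "$repo" rev-list --all --objects)"
    grep -qF -- "$name" <<<"$objects"
}

# Throwaway repository standing in for the real one (an assumption for the demo).
demo="$(mktemp -d)"
git init -q "$demo"
git -C "$demo" config user.email "demo@example.com"
git -C "$demo" config user.name "Demo"

mkdir "$demo/data"
echo "lock" > "$demo/data/.~lock.smart_meter_data_overlap.csv#"
git -C "$demo" add -f -- "data/.~lock.smart_meter_data_overlap.csv#"
git -C "$demo" commit -q -m "add lock file by mistake"

# Deleting the file in a later commit does not remove it from earlier commits.
git -C "$demo" rm -q -- "data/.~lock.smart_meter_data_overlap.csv#"
git -C "$demo" commit -q -m "remove lock file"

if file_in_history "$demo" ".~lock.smart_meter_data_overlap.csv#"; then
    result="still in history"
else
    result="not in history"
fi
echo "$result"
```

Running the same helper against the cleaned mirror clone (electricity_profiles.git above) should report the file as absent if the rewrite succeeded.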


1 Answer


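I'm the author of the BFG - I've retitled your question to "Why can I still see files in GitHub history after using the BFG to clean them?" because that probably represents your problem better.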

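Your problem description doesn't quite spell this out, but I'm guessing that the report from the BFG run did say the files had been removed. (The BFG reports an error if it finds no targets to delete, and you don't mention seeing one, so my guess is that the BFG did find your files and removed them from history.)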

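First, you need to make sure you followed all the steps at https://rtyley.github.io/bfg-repo-cleaner/#usage, in particular that:

  • you are cleaning a mirror clone of the repo
  • you pushed the cleaned mirror repo back to GitHub.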

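If you did follow all of those steps correctly, why can you still see the files in GitHub history after using the BFG to clean them? One possible explanation is that GitHub has not yet run garbage collection on that repo. GitHub only runs GC periodically, so old commits remain visible for a while afterwards.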
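The same effect can be reproduced in a local repository, which may make the explanation easier to see. The sketch below is an analogy only: it uses a throwaway repo with hypothetical file names, and it says nothing about when GitHub actually runs its own GC. A commit that has been rewritten away stays retrievable by its hash until the reflog is expired and `git gc --prune=now` is run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; file names are placeholders for the demo.
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"
git config user.email "demo@example.com"
git config user.name "Demo"

echo "keep" > keep.txt
echo "secret" > secret.txt
git add keep.txt secret.txt
git commit -q -m "add files"
old="$(git rev-parse HEAD)"

# Rewrite history so the branch tip no longer contains secret.txt.
git rm -q --cached secret.txt
git commit -q --amend -m "add files"

# Hypothetical helper: reports whether a commit object still exists in the repo.
commit_state() {
    if git cat-file -e "$1^{commit}" 2>/dev/null; then
        echo "present"
    else
        echo "gone"
    fi
}

# No branch points at the old commit any more, yet it is still stored.
before="$(commit_state "$old")"

# The cleanup BFG recommends: drop reflog entries, then prune unreachable objects.
git reflog expire --expire=now --all
git gc -q --prune=now

after="$(commit_state "$old")"
echo "before gc: $before, after gc: $after"
```

On a hosting service you cannot run that cleanup yourself, so a direct link to the old commit hash can keep working until the server performs its own collection.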

Answered 2017-01-18T09:48:24.487