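I have a repository containing many files that are no longer in the working directory: files that were added and removed over the months and years of the repository's life.
I'd like to produce a file containing a list of all those files that are stored in the commit history but are no longer needed, including their locations, i.e.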
/web/scripts/index.php
/sql/tables.sql
...
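Then I'd like a command that runs through that file and completely removes the files it references from the commit history, like git rm --cached applied to a list of files.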
Alias David Underhill's script, then run (with caution):
$ git delete `git log --all --pretty=format: --name-only --diff-filter=D`
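David Underhill's script uses git filter-branch to rewrite the repository's history, removing all history for the given file paths. Every rewritten commit gets a new hash, which is why caution is advised.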
The script in its entirety (source):
#!/bin/bash
set -o errexit
# Author: David Underhill
# Script to permanently delete files/folders from your git repository. To use
# it, cd to your repository's root and then run the script with a list of paths
# you want to delete, e.g., git-delete-history path1 path2
if [ $# -eq 0 ]; then
    exit 0
fi
# make sure we're at the root of git repo
if [ ! -d .git ]; then
    echo "Error: must run this script from the root of a git repository"
    exit 1
fi
# remove all paths passed as arguments from the history of the repo
files=$@
git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch $files" HEAD
# remove the temporary history git-filter-branch otherwise leaves behind for a long time
rm -rf .git/refs/original/ && git reflog expire --all && git gc --aggressive --prune
Save this script somewhere on your hard drive (for example /path/to/deletion_script.sh) and make sure it is executable (chmod +x /path/to/deletion_script.sh).
Then alias the command:
$ git config --global alias.delete '!/path/to/deletion_script.sh'
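A note on the last line of the script: as far as I can tell, git reflog expire --all and git gc --prune without an explicit date only act on entries older than Git's default grace periods, so the rewritten-away objects may stay on disk for a while. Below is a minimal sketch of a stricter cleanup. The function name purge_old_history is hypothetical and not part of David Underhill's script, and it assumes you are certain you no longer need the pre-rewrite history.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): remove the backup
# refs that filter-branch leaves behind, then prune unreachable objects now.
purge_old_history() {
    local ref
    # filter-branch keeps the pre-rewrite refs under refs/original/
    git for-each-ref --format='%(refname)' refs/original/ |
        while IFS= read -r ref; do
            git update-ref -d "$ref"
        done
    # expire all reflog entries immediately rather than after the default period
    git reflog expire --expire=now --all
    # repack and drop unreachable objects regardless of their age
    git gc --quiet --aggressive --prune=now
}
```

Run purge_old_history from the repository root after the rewrite has finished. Other clones still hold the old objects until they are re-cloned or rewritten as well.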
To get a sorted list of all deleted files:
$ git log --all --pretty=format: --name-only --diff-filter=D | sort -u
With the list of deleted files, it is just a matter of hooking up git delete to process every file in the list:
$ git delete `git log --all --pretty=format: --name-only --diff-filter=D`
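The question asks for the list to live in a file first, so that it can be reviewed before anything is rewritten. A minimal sketch of that workflow follows, assuming the git delete alias is already configured. The function names and the default file name deleted_files.txt are made up for illustration. Paths containing whitespace are still not handled safely, because the underlying script joins its arguments into a single string.

```shell
#!/bin/bash
# Hypothetical helper: write every path that was deleted at some point in
# history to a list file, one per line, sorted, without blank lines.
write_deleted_list() {
    local list=${1:-deleted_files.txt}   # default file name is an assumption
    git log --all --pretty=format: --name-only --diff-filter=D |
        sort -u |
        sed '/^$/d' > "$list"
}

# Hypothetical helper: read the (possibly hand-edited) list file and pass its
# entries to the "git delete" alias in a single call.
delete_from_list() {
    local list=${1:-deleted_files.txt}
    local paths=() line
    while IFS= read -r line; do
        if [ -n "$line" ]; then
            paths+=("$line")
        fi
    done < "$list"
    if [ "${#paths[@]}" -eq 0 ]; then
        echo "No paths listed in $list" >&2
        return 0
    fi
    git delete "${paths[@]}"
}
```

Typical use would be to run write_deleted_list, trim deleted_files.txt by hand, and then run delete_from_list.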
Create a dummy repository with additions, renames, and deletions:
mkdir test_repo
cd test_repo/
git init
echo "Dummy content" >> stays.txt
git add stays.txt && git commit -m "First file, will stay"
echo "Rename content" >> will_rename.txt
git add will_rename.txt && git commit -m "Going to rename"
echo "Delete this file" >> will_delete.txt
git add will_delete.txt && git commit -m "Delete this file"
git mv will_rename.txt renamed.txt && git commit -m "File renamed"
git rm will_delete.txt && git commit -m "File deleted"
Check the history:
$ git whatchanged --oneline
d768c58 File deleted
:100644 000000 7a4187c... 0000000... D will_delete.txt
96aadf0 File renamed
:000000 100644 0000000... 94a12c7... A renamed.txt
:100644 000000 94a12c7... 0000000... D will_rename.txt
3ba05fa Delete this file
:000000 100644 0000000... 7a4187c... A will_delete.txt
c88850a Going to rename
:000000 100644 0000000... 94a12c7... A will_rename.txt
6db6015 First file, will stay
:000000 100644 0000000... f3ae800... A stays.txt
Delete the old files:
$ git delete `git log --all --pretty=format: --name-only --diff-filter=D`
Rewrite 8c2009db5ac05b27cd065482da94dec717f5ef4a (8/9)rm 'will_delete.txt'
Rewrite e1348d588597f2f6dd63cade081e0fbdf8692c74 (9/9)
Ref 'refs/heads/master' was rewritten
Counting objects: 27, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (22/22), done.
Writing objects: 100% (27/27), done.
Total 27 (delta 12), reused 10 (delta 0)
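Now examine the repository. Note that the deletion has been removed from the history, and the rename looks as though the file was originally added under that name.
$ git whatchanged --oneline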
c800020 File renamed
:000000 100644 0000000... 94a12c7... A renamed.txt
0a729d7 First file, will stay
:000000 100644 0000000... f3ae800... A stays.txt
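To double-check the result, one option is to ask Git whether any commit on any ref still touches a given path. The helper below is a hypothetical sketch (the name paths_still_in_history does not come from the answer). It prints each path that still has history and prints nothing when all of them are gone.

```shell
#!/bin/bash
# Hypothetical helper: print every given path that is still touched by at
# least one commit reachable from any ref. No output means nothing is left.
paths_still_in_history() {
    local path
    for path in "$@"; do
        # -n 1 stops at the first matching commit; "--" marks what follows as a path
        if [ -n "$(git log --all --format=%h -n 1 -- "$path")" ]; then
            echo "$path"
        fi
    done
}
```

For the test repository, paths_still_in_history will_delete.txt will_rename.txt should print nothing once the rewrite and cleanup have run.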
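Adding to @David's answer: if you want to be extra careful and make sure you don't delete any file that was later re-added in the history, use the following block of commands instead of git delete $(git log --all --pretty=format: --name-only --diff-filter=D) (consider adding it to your .bashrc):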
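# files that exist in the current checkout
current=($(git ls-files))
# every path deleted at some point in history, deduplicated
tracked=($(git log --all --pretty=format: --name-only --diff-filter=D | sort -u))
deleted=()
resurrected=()
for file in "${tracked[@]}"; do
    # a deleted path that is present again was re-added later, so keep it
    if [[ " ${current[*]} " == *" $file "* ]]; then
        resurrected+=("$file")
    else
        deleted+=("$file")
    fi
done
echo "Deleted: ${deleted[*]}"
echo "Resurrected: ${resurrected[*]}"
# only the paths that are really gone are removed from history
git delete "${deleted[@]}"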