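The answers above fail for some use cases where you need to exclude moved text (e.g., if I move a function in my code or a paragraph in my LaTeX further down the document, I don't want to count all of that as changes!).
For that, you can also count the number of duplicated lines, and exclude them from your query if there are too many duplicates.
For example, building on the other answers, I can do: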
git diff $sha~1..$sha|grep -e"^+[^+]" -e"^-[^-]"|sed -e's/.//'|sort|uniq -d|wc -w|xargs
to count the number of duplicated words in the diff, where sha is your commit.
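To see what that pipeline does without needing a repository, here is a minimal sketch that runs the same filters over a hand-written diff. The function name count_duplicated_words and the sample diff are made up for illustration; only the pipeline itself comes from the command above.

```shell
# Hypothetical helper: reads unified diff text on stdin and prints the number
# of words on lines that occur more than once among the added/removed lines.
count_duplicated_words() {
  grep -e '^+[^+]' -e '^-[^-]' |  # keep +/- lines, skip the +++/--- file headers
    sed -e 's/.//' |              # drop the leading + or - marker
    sort |
    uniq -d |                     # print each repeated line once
    wc -w |
    xargs                         # trim any whitespace padding around the count
}

# Hand-written diff (an assumption, not real git output): one line is moved
# (removed at the top, re-added lower down) and one line is genuinely new.
sample_diff='--- a/notes.txt
+++ b/notes.txt
@@ -1,3 +1,4 @@
-the quick brown fox
 unchanged context line
+a brand new sentence
+the quick brown fox'

printf '%s\n' "$sample_diff" | count_duplicated_words   # prints 4
```

The moved line shows up once as a removal and once as an addition, so after the markers are stripped it is the only repeated line, and its four words are what get counted.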
You can do this for all commits within the last day (since 6am) with:
for sha in $(git rev-list --since="6am" master | sed -e '$ d'); do
echo $(git diff --word-diff=porcelain $sha~1..$sha|grep -e"^+[^+]"|wc -w|xargs),\
$(git diff --word-diff=porcelain $sha~1..$sha|grep -e"^-[^-]"|wc -w|xargs),\
$(git diff $sha~1..$sha|grep -e"^+[^+]" -e"^-[^-]"|sed -e's/.//'|sort|uniq -d|wc -w|xargs)
done
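This prints: added, deleted, duplicated
(I use the line diff for the duplicates, because it excludes the times where git diff tries to be too clever and assumes that you actually just changed text rather than moved it. It also discounts instances where a single word is counted as a duplicate.)
Or, if you want to be more sophisticated about it, you can exclude a commit entirely if more than 80% of it is duplication, and sum up the rest: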
total=0
for sha in $(git rev-list --since="6am" master | sed -e '$ d'); do
added=$(git diff --word-diff=porcelain $sha~1..$sha|grep -e"^+[^+]"|wc -w|xargs)
deleted=$(git diff --word-diff=porcelain $sha~1..$sha|grep -e"^-[^-]"|wc -w|xargs)
duplicated=$(git diff $sha~1..$sha|grep -e"^+[^+]" -e"^-[^-]"|sed -e's/.//'|sort|uniq -d|wc -w|xargs)
if [ "$added" -eq "0" ]; then
changed=$deleted
total=$((total+deleted))
echo "added:" $added, "deleted:" $deleted, "duplicated:"\
$duplicated, "changes counted:" $changed
elif [ "$(echo "$duplicated/$added > 0.8" | bc -l)" -eq "1" ]; then
echo "added:" $added, "deleted:" $deleted, "duplicated:"\
$duplicated, "changes counted:" 0
else
changed=$((added+deleted))
total=$((total+changed))
echo "added:" $added, "deleted:" $deleted, "duplicated:"\
$duplicated, "changes counted:" $changed
fi
done
echo "Total changed:" $total
I have this script here: https://github.com/MilesCranmer/git-stats.
This prints out:
➜ bifrost_paper git:(master) ✗ count_changed_words "6am"
added: 38, deleted: 76, duplicated: 3, changes counted: 114
added: 14, deleted: 19, duplicated: 0, changes counted: 33
added: 1113, deleted: 1112, duplicated: 1106, changes counted: 0
added: 1265, deleted: 1275, duplicated: 1225, changes counted: 0
added: 4207, deleted: 4208, duplicated: 4391, changes counted: 0
Total changed: 147
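The commits where I am just moving things around are obvious, so I don't count those changes. It counts up everything else and tells me the total number of words changed.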
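As a sanity check on the 80% rule, the sketch below replays the five rows of the sample output through the same decision. The helper is_mostly_moved is a hypothetical name, and it swaps the bc comparison for integer arithmetic (duplicated/added > 0.8 is the same test as 10*duplicated > 8*added when added is positive), so treat it as one possible way to write the check rather than what the linked script does.

```shell
# Hypothetical helper: succeeds (exit status 0) when duplicated/added > 0.8.
# $1 = added words, $2 = duplicated words. Uses integers only, so no bc.
is_mostly_moved() {
  [ "$1" -gt 0 ] && [ $((10 * $2)) -gt $((8 * $1)) ]
}

# Rows of "added deleted duplicated", copied from the sample output above.
total=0
while read -r added deleted duplicated; do
  if ! is_mostly_moved "$added" "$duplicated"; then
    # Not dominated by moved text: count both additions and deletions.
    total=$((total + added + deleted))
  fi
done <<'EOF'
38 76 3
14 19 0
1113 1112 1106
1265 1275 1225
4207 4208 4391
EOF

echo "Total changed: $total"   # prints "Total changed: 147"
```

Only the first two rows fall under the threshold, so the total is 114 + 33 = 147, matching the output above. A commit with no added words is never treated as moved here, so its deletions are still counted, as in the first branch of the script.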