完整来源:https ://gist.github.com/dimitko/5541709 。您可以下载并直接运行它而没有任何麻烦(只要确保拥有awesome_print
gem;它以更易于阅读的格式向您显示对象层次结构)。
我稍微丰富了你的测试输入,以确保算法不会犯愚蠢的错误。
鉴于此输入:
input = <<TEXT
2 1 app/assets/javascripts/bar.js
16 25 app/assets/javascripts/baz.js.coffee
1 1 app/assets/javascripts/foo.js.coffee
4 9 app/controllers/bar_controller.rb
3 2 app/controllers/baz_controller.rb
11 0 app/controllers/foo_controller.rb
3 2 db/schema.rb
41 1 lib/foobar.rb
12 7 lib/tasks/cache.rake
5 13 lib/tasks/import.rake
TEXT
而这个预期的结果:
[{:name=>"app", :add=>37, :del=>38, :children=>[{:name=>"assets", :add=>19, :del=>27, :children=>[{:name=>"javascripts", :add=>19, :del=>27, :children=>[{:name=>"bar.js", :add=>2, :del=>1}, {:name=>"baz.js.coffee", :add=>16, :del=>25}, {:name=>"foo.js.coffee", :add=>1, :del=>1}]}]}, {:name=>"controllers", :add=>18, :del=>11, :children=>[{:name=>"bar_controller.rb", :add=>4, :del=>9}, {:name=>"baz_controller.rb", :add=>3, :del=>2}, {:name=>"foo_controller.rb", :add=>11, :del=>0}]}]}, {:add=>3, :del=>2, :name=>"db", :children=>[{:name=>"schema.rb", :add=>3, :del=>2}]}, {:add=>58, :del=>21, :name=>"lib", :children=>[{:name=>"foobar.rb", :add=>41, :del=>1}, {:name=>"tasks", :add=>17, :del=>20, :children=>[{:name=>"cache.rake", :add=>12, :del=>7}, {:name=>"import.rake", :add=>5, :del=>13}]}]}]
而这段代码:
def git_diffnum_parse_paths(list, depth, out)
to = 1
base = list.first[:name][depth]
while list[to] and list[to][:name][depth] == base do
to += 1
end
if list.first[:name][depth+1]
out << {name: base, add: 0, del: 0, children: []}
# Common directory found for the first N records; recurse deeper.
git_diffnum_parse_paths(list[0..to-1], depth + 1, out.last[:children])
add = del = 0
out.last[:children].each do |x| add += x[:add].to_i; del += x[:del].to_i; end
out.last[:add] = add
out.last[:del] = del
else
# It's a file, we can't go any deeper.
out << {name: list.first[:name].last, add: list.first[:add].to_i, del: list.first[:del].to_i}
end
if list[to]
# Recurse in to try find common directories for the deeper records.
git_diffnum_parse_paths(list[to..-1], depth, out)
end
nil
end
def to_git_diffnum_tree(txt)
items = []
txt.split("\n").each do |line|
m = line.match(/(\d+)\s+(\d+)\s+(.+)/).to_a[1..3]
items << {add: m[0], del: m[1], name: m[2]}
end
items.sort! { |a,b|
a[:name] <=> b[:name]
}
items.each do |item|
item[:name] = item[:name].split("/")
end
out = []
git_diffnum_parse_paths(items, 0, out)
out
end
而这段代码,它正在使用它:
require 'awesome_print'
out = to_git_diffnum_tree(input)
puts; ap out; puts
puts; puts "Expected result:"; puts expected.inspect
puts; puts "Actual result: "; puts out.inspect
puts; puts "Are expected and actual results identical: #{expected == out}"
它似乎产生了你想要的东西。
笔记:
- 我正在按目录/文件名对已解析条目的数组进行排序。这样做是为了避免遍历整个列表来搜索公共目录;相反,该算法可以扫描列表直到第一个不匹配。
- 我远非认为这是最佳解决方案,但这是我在空闲时间想出的。
- 我
puts
在要点中留下了一些 [未] 注释的陈述,以防你想大致了解算法是如何工作的。
- 如果您想对其进行更可靠的测试,请尝试以下操作:
git diff --numstat `git rev-list --max-parents=0 HEAD | 头 -n 1`头
这将为您提供自初始提交以来的添加和删除数量(前提是您的 Git 版本 >=1.7.4.2),这是一个更大的输入,您可以在其中对算法进行更严格的测试。
希望我有所帮助。