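I want to remove duplicate lines from a file, but only the duplicates that match a particular regular expression; all other duplicates should stay in the file. This is what I have so far: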

unique_lines = File.readlines("Ops.Web.csproj").uniq do |line|    
  line[/^.*\sInclude=\".*\"\s\/\>$/]
end

File.open("Ops.Web.csproj", "w+") do |file|
  unique_lines.each do |line|
    file.puts line
  end
end

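This de-duplicates the lines correctly, but it only writes the lines that match the regex back to the file. I need every other line in the file written back unchanged. I know I'm missing something small here. Any ideas?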

1 Answer

Try this:

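lines = File.readlines("input.txt")
seen = {}

# The block form of File.open closes the file automatically
File.open("output.txt", "w") do |out|
  lines.each do |line|
    # Only lines matching the pattern are de-duplicated
    if line =~ /Include/
      next if seen[line]
      seen[line] = true
    end
    # Non-matching lines, and the first copy of a matching line, are written out
    out.puts line
  end
end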

Demo:

➜  12980122  cat input.txt
a
b
c
Include a
Include b
Include a
Include a
d
e
Include b
f
➜  12980122  ruby exec.rb
➜  12980122  cat output.txt
a
b
c
Include a
Include b
d
e
f
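
As for why the `uniq` version in the question drops lines: `uniq` with a block compares the block's return values, and `line[regex]` returns `nil` for every line that does not match. All non-matching lines therefore share the same key, so `uniq` treats them as duplicates of each other and keeps only the first one. If you would rather stay with `uniq`, one possible sketch is to give each non-matching line a key of its own, such as its index. The helper name `uniq_matching` and the sample data below are made up for illustration:

```ruby
# Hypothetical helper (not part of the code above): removes repeated lines
# only when they match `pattern`; all other lines are kept, duplicates included.
def uniq_matching(lines, pattern)
  lines.each_with_index
       .to_a
       # Matching lines are keyed by their content, so repeats collapse.
       # Non-matching lines are keyed by their index, which is always unique.
       .uniq { |line, index| line.match?(pattern) ? line : index }
       .map(&:first)
end

sample = ["a", "Include a", "a", "Include a", "Include b"]
result = uniq_matching(sample, /Include/)
puts result
```

Here the second `a` survives because it does not match the pattern, while the second `Include a` is dropped. `String#match?` needs Ruby 2.4 or later; on older versions `line =~ pattern` does the same job.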
Answered 2012-10-19T18:28:19.167