I have a very large text file whose content looks like this:

@INBOOK{Ackermann1999-b, 
  author = {Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, 
        K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. 
        and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and 
        Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, 
        K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. 
        and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and 
        Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, 
        K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. 
        and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and 
        Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann}, 
  year = {1980}, 
  timestamp = {1995-12-02} 
}      

I want to remove the duplicate lines, except for lines that contain a brace, { or }. The result should look like this:

@INBOOK{Ackermann1999-b, 
  author = {Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, 
        Ackermann, K.-F. and Ackermann, K.-F. and Ackermann, K.-F. and Ackermann}, 
  year = {1980}, 
  timestamp = {1995-12-02} 
} 

Thanks to Vinay Sajip, I came across this Python script:

lines_seen = set() # holds lines already seen 
outfile = open("literatur_clean.txt", "w") 
for line in open("literatur_dupl.txt", "r"): 
    if line not in lines_seen: # not a duplicate 
        outfile.write(line) 
        lines_seen.add(line) 
outfile.close() 

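But it also removes the lines with a closing brace } and the lines with identical author data in later entries. So I need a condition that exempts lines containing braces.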

Can someone show me how to add this condition?

Thanks in advance,

1 Answer

if '{' in line or '}' in line or line not in lines_seen: # always keep brace lines; otherwise skip duplicates
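
For context, here is a minimal sketch of how the whole script might look with that condition in place. It assumes the intent is to always write lines that contain { or } and to drop only repeated occurrences of all other lines. The names dedupe_keep_braces and clean_file are hypothetical helpers introduced for illustration; the file names are the ones from the question.

```python
def dedupe_keep_braces(lines):
    """Yield lines in order, dropping repeats of brace-free lines only.

    Lines containing '{' or '}' are always kept, even if they repeat.
    """
    lines_seen = set()  # brace-free lines already written
    for line in lines:
        if "{" in line or "}" in line:
            # Structural BibTeX lines: never treated as duplicates.
            yield line
        elif line not in lines_seen:
            # First occurrence of a brace-free line.
            lines_seen.add(line)
            yield line


def clean_file(src_path, dst_path):
    """Stream src_path to dst_path through dedupe_keep_braces."""
    # Iterating over the file object reads one line at a time,
    # so a very large input is not loaded into memory at once.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        dst.writelines(dedupe_keep_braces(src))
```

Calling clean_file("literatur_dupl.txt", "literatur_clean.txt") would then process the file from the question. Note that, as in the original script, duplicates are tracked across the whole file rather than per entry, and lines are compared exactly, so two lines that differ only in whitespace count as different.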
answered 2012-10-10T08:59:26.343