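How can we parse data from a TSV file by column index? After reading the file, we need to compare the column 0 value of row 1 with the column 0 value of row 2; if they match, we take the column 1 value of row 1 and append to it the column 1 values of all matching rows.
For example, given the file SystemType.tsv: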
Actrius 1990s drama films
Actrius Catalan language films
Actrius Spanish films
Actrius Barcelona in fiction
Actrius Films directed by Ventura Pons
Actrius 1996 films
An_American_in_Paris Compositions by George Gershwin
An_American_in_Paris Symphonic poems
An_American_in_Paris Grammy Hall of Fame Award recipients
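Column 0 of row 1 contains "Actrius", so we need to compare it against column 0 of every row and join the column 1 values of the matching rows with commas, as shown below.
Output: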
Actrius 1990s drama films,Catalan language films,Spanish films,Barcelona in fiction,Films directed by Ventura Pons,1996 films
An_American_in_Paris Compositions by George Gershwin,Symphonic poems,Grammy Hall of Fame Award recipients
I tried the following, but it does not work for me.
def finalextract():
    lines_seen = set()
    outfile = open("Output.txt","w+")
    infile = open("SystemType.tsv","r+")
    for line in infile:
        # line[0] is the first character of the line, not column 0,
        # and a set cannot be indexed, so lines_seen[0] raises TypeError
        if line[0] == lines_seen[0]:
            string = line[1]+','+lines_seen[1]
            outfile.write(string)
            lines_seen.add(string)
    infile.close()
    outfile.close()
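
One possible way to get the output described above is to collect the column 1 values in a dictionary keyed by column 0, then write one line per key. The following is only a minimal sketch under some assumptions: the file really is tab-separated (the sample above is displayed with spaces), every useful row has at least two columns, and the output should also separate key and values with a tab. The names group_rows and final_extract are illustrative and do not come from the attempt above.

```python
import csv
from collections import defaultdict


def group_rows(rows):
    """Group column-1 values by their column-0 key, in first-seen order."""
    groups = defaultdict(list)
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed lines (assumption: safe to ignore)
        groups[row[0]].append(row[1])
    return groups


def final_extract(in_path="SystemType.tsv", out_path="Output.txt"):
    """Read a tab-separated file and write one comma-joined line per key."""
    with open(in_path, newline="", encoding="utf-8") as infile:
        # QUOTE_NONE keeps any quote characters in the data untouched.
        reader = csv.reader(infile, delimiter="\t", quoting=csv.QUOTE_NONE)
        groups = group_rows(reader)
    with open(out_path, "w", encoding="utf-8") as outfile:
        for key, values in groups.items():
            # Assumption: key and joined values are separated by a tab.
            outfile.write(key + "\t" + ",".join(values) + "\n")
```

Calling final_extract() with its defaults reads SystemType.tsv and writes Output.txt. Because the values are collected in a dictionary, rows with the same key do not need to be adjacent in the input; the keys are written in the order they first appear.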