I have a CSV file with 3 columns, like this:
a,b,c
1,1,2
1,3,5
1,5,7
.
.
2,3,4
2,1,5
2,4,7
I would like the output to look like this:
a,b,c
1,5,7
1,3,5
1,1,2
.
.
2,4,7
2,3,4
2,1,5
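That is, for each distinct value in column a, I want to keep only the top 20 rows (the rows with the 20 highest 'b' values), sorted by 'b' in descending order. Please excuse my poor explanation. This is what I have tried so far, but it does not give me the desired output: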
import csv
import heapq
from itertools import islice
csvout = open("output.csv", "w")
writer = csv.writer(csvout, delimiter=',', quotechar='"', lineterminator='\n', quoting=csv.QUOTE_MINIMAL)
freqs = {}
with open('input.csv') as fin:
    csvin = csv.reader(fin)
    rows_with_mut = ([float(row[1])] + row for row in islice(csvin, 1, None) if row[2])
    for row in rows_with_mut:
        cnt = freqs.setdefault(row[0], [[]] * 20)
        heapq.heappushpop(cnt, row)
for assay_id, vals in freqs.iteritems():
    output = [row[1:] for row in sorted(filter(None, vals), reverse=True)]
    writer.writerows(output)
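
For reference, here is a minimal sketch of the behaviour I am after, written for Python 3 with only the standard library. It is one possible approach, not necessarily the best one. The function names `top_rows_per_group` and `write_top_rows` are made up for illustration. It assumes that the first line of the file is the header, that column b always parses as a number, and that the "." lines in the sample above only stand for omitted rows and do not appear in the real file.

```python
import csv
import heapq
from collections import defaultdict


def top_rows_per_group(rows, n=20):
    """Hypothetical helper: for each value of column a, keep the n rows
    with the largest b, sorted by b in descending order.

    Groups are returned in the order their a value first appears.
    """
    # a value -> min-heap of (b as float, arrival index, original row)
    heaps = defaultdict(list)
    for index, row in enumerate(rows):
        entry = (float(row[1]), index, row)
        heap = heaps[row[0]]  # group by column a, not by b
        if len(heap) < n:
            heapq.heappush(heap, entry)
        else:
            # Heap is full: add the new entry and discard the smallest b.
            heapq.heappushpop(heap, entry)

    result = []
    for heap in heaps.values():
        result.extend(row for _, _, row in sorted(heap, reverse=True))
    return result


def write_top_rows(in_path, out_path, n=20):
    """Hypothetical wrapper: read in_path, write the filtered rows to out_path."""
    with open(in_path, newline="") as fin, open(out_path, "w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout, lineterminator="\n")
        writer.writerow(next(reader))  # copy the header line (assumed present)
        # Skip blank lines so float() never sees an empty row.
        writer.writerows(top_rows_per_group((r for r in reader if r), n))
```

Calling `write_top_rows("input.csv", "output.csv")` on the six sample rows shown above should produce them in the order listed in the desired output. Each group gets its own heap, so no more than n rows per a value are held in memory. If the file is small, collecting all rows per group and using `heapq.nlargest` or `sorted` would probably be simpler.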