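I have a CSV in which column 6 holds a count of the number of students in that class. I also have a separate piece of code that removes some students from a class if they appear on a different script. How would I recalculate the number of students in each class? See the sample data below: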
Jan-20,Data,Class xpv,4,11yo+,4,more data....
Jan-20,Data,Class xpv,4,11yo+,4,more data....
Jan-20,Data,Class xpv,4,11yo+,4,more data....
Jan-20,Data,Class xpv,4,11yo+,4,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-30,Data,Class tn2,4,10yo+,12,more data....
Jan-50,Data,Class 22zn,2,10yo+,6,more data....
Jan-50,Data,Class 22zn,2,10yo+,6,more data....
Jan-50,Data,Class 22zn,2,10yo+,6,more data....
Jan-50,Data,Class 22zn,2,10yo+,6,more data....
Jan-50,Data,Class 22zn,2,10yo+,6,more data....
Jan-50,Data,Class 22zn,2,10yo+,6,more data....
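The column that identifies which rows get deleted is over in the "more data" part. After any row is deleted, how can I write code that counts the number of students left in that class, basically counting column 2 and replacing the value in column 6? (The class names are all unique.)
I hope this makes sense. Any help would be greatly appreciated! Kind regards, AEA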
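A minimal sketch of the recount described above, under a few assumptions: the rule in `keep_row` (a last field equal to `"drop"`) is hypothetical and only stands in for whatever the "more data" column really contains, the sample rows are shortened, and the class name is taken from zero-based index 2 and the count from zero-based index 5. It filters first, then counts the surviving rows per class and writes that count back.

```python
import csv
import io
from collections import Counter

# Shortened stand-in for the sample data; the last field is a made-up flag.
SAMPLE = """\
Jan-20,Data,Class xpv,4,11yo+,4,keep
Jan-20,Data,Class xpv,4,11yo+,4,drop
Jan-20,Data,Class xpv,4,11yo+,4,keep
Jan-20,Data,Class xpv,4,11yo+,4,keep
Jan-30,Data,Class tn2,4,10yo+,12,drop
Jan-30,Data,Class tn2,4,10yo+,12,keep
"""

def keep_row(row):
    # Hypothetical rule: keep a row unless its last field says "drop".
    return row[-1] != "drop"

def recount(rows, class_col=2, count_col=5):
    # Drop the unwanted rows first, copying so the input is not modified.
    kept = [list(row) for row in rows if keep_row(row)]
    # Count how many rows survive for each class name.
    totals = Counter(row[class_col] for row in kept)
    # Overwrite the stored count with the recomputed one.
    for row in kept:
        row[count_col] = str(totals[row[class_col]])
    return kept

rows = list(csv.reader(io.StringIO(SAMPLE)))
result = recount(rows)
```

Because the counts come from a `Counter` rather than from consecutive runs, this version does not need the rows of a class to be adjacent in the file.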
Edit: I saved the above data as AEAtest.csv.
I tried running the following code:
import csv
import itertools
from operator import itemgetter
import random

def some_condition(line):
    return random.random() < 0.5  # delete lines randomly with 50% probability

def filter_data(data):
    for classname, group in itertools.groupby(data, itemgetter(2)):
        filtered_group = [line for line in group if some_condition(line)]
        new_sum = len(filtered_group)
        for line in filtered_group:
            line[5] = new_sum
            yield line

with open('C:\AEAtest.csv') as f_in, open('C:\AEAtest_MOD.csv', 'w') as f_out:
    reader = csv.reader(f_in)
    writer = csv.writer(f_out)
    writer.writerows(filter_data(reader))
The output is as follows:
Jan-20,Data,Class xpv,4,11yo+,2,more data....
Jan-20,Data,Class xpv,4,11yo+,2,more data....
Jan-30,Data,Class tn2,4,10yo+,7,more data....
Jan-30,Data,Class tn2,4,10yo+,7,more data....
Jan-30,Data,Class tn2,4,10yo+,7,more data....
Jan-30,Data,Class tn2,4,10yo+,7,more data....
Jan-30,Data,Class tn2,4,10yo+,7,more data....
Jan-30,Data,Class tn2,4,10yo+,7,more data....
Jan-30,Data,Class tn2,4,10yo+,7,more data....
Jan-50,Data,Class 22zn,2,10yo+,3,more data....
Jan-50,Data,Class 22zn,2,10yo+,3,more data....
Jan-50,Data,Class 22zn,2,10yo+,3,more data....
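I am wondering why the extra rows now appear. Interestingly, the last line of text above is on line 23 of the output file, followed by two more blank lines.
Any help fixing this error? Kind regards, AEA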
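One likely cause, assuming this is Python 3 running on Windows: `csv.writer` ends each row with `\r\n` by default, and a file opened in plain text mode translates the `\n` again, so every row ends in `\r\r\n` and shows up with a blank line after it. The `csv` documentation recommends opening the file with `newline=''`. The sketch below writes two of the output rows to a temporary path (a stand-in for `C:\AEAtest_MOD.csv`) and reads the raw bytes back to check the line endings.

```python
import csv
import os
import tempfile

rows = [
    ["Jan-20", "Data", "Class xpv", "4", "11yo+", "2", "more data...."],
    ["Jan-30", "Data", "Class tn2", "4", "10yo+", "7", "more data...."],
]

# Temporary file used here only for illustration.
out_path = os.path.join(tempfile.mkdtemp(), "AEAtest_MOD.csv")

# newline='' stops the text layer from translating line endings,
# leaving the csv module in charge of writing "\r\n" once per row.
with open(out_path, "w", newline="") as f_out:
    csv.writer(f_out).writerows(rows)

# Read the bytes back to inspect the actual line endings.
with open(out_path, "rb") as f:
    raw = f.read()
```

In Python 2 the equivalent change would be opening the output file in binary mode (`'wb'`) instead.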