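I am importing a dataset and trying to output some text analysis. However, I can only get it to output the last column of data. Where should I put csv.writer so that every row is written?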

from __future__ import division
import csv
import re
from string import punctuation

faithwords = ['church', 'faith', 'faith']

with open('dataset.csv', 'rb') as csvfile:
    data = csv.reader(csvfile, delimiter=",")

    for row in data:

        faithcounter = 0 

        row3 = row[3]
        row3 = row3.lower().replace('  ', ' ')
        row4 = row[4]
        row4 = row4.lower().replace('  ', ' ')

        for p in list(punctuation):
            row3 = row3.replace(p, '')
            row4 = row4.replace(p, '')

        essay1= re.split(' ', row3)
        essay2= re.split(' ', row4)

        essay1len = len(essay1)
        essay2len = len(essay2)

        num_of_rows = len(row)

        for word in essay1:
            if word in faithwords:
                faithcounter = faithcounter + 1  

        for word in essay2:
            if word in faithwords:
                faithcounter = faithcounter + 1            

        totallen = (essay2len + essay1len)

        row.append(essay1len)
        row.append(essay2len)
        row.append(totallen)
        row.append(faithcounter)        
        row.append(faithcounter / totallen)

        output = zip(row)

writer = csv.writer(open('csvoutput.csv', 'wb'))
writer.writerows(output)

2 Answers

I suggest removing output = zip(row) and replacing it with writer.writerow(row).

Also remove writer.writerows(output), and move writer = csv.writer(open('csvoutput.csv', 'wb')) above the loop.

answered 2013-10-23T17:56:52.837

Your problem is on this line:

output=zip(row)

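I'm not sure why you are calling zip here, but I do know that you overwrite output on every iteration of the loop, so only the last row is left by the time you write it.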
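To make the overwriting concrete, here is a small sketch written for Python 3 (the question's code is Python 2, where zip returns a list directly). The sample rows are made up for illustration.

```python
rows = [["a", "b"], ["c", "d"]]  # made-up sample data

output = None
for row in rows:
    # zip() over a single list wraps each field in a 1-tuple,
    # and this assignment throws away the previous row's result.
    output = list(zip(row))

print(output)  # [('c',), ('d',)]
```

Passing that value to writerows would write each field of the last row on its own line, which is probably why the file looks like a single column.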

I suggest creating the csv writer before the loop. Then, as the last statement inside the loop, do:

writer.writerow(row)
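
Put together, the restructured script might look roughly like the sketch below. It is written for Python 3 rather than the Python 2 of the question, and the names annotate, clean_words, and FAITH_WORDS are hypothetical helpers introduced here. It also splits on whitespace with str.split() instead of re.split(' ', ...), so word counts can differ slightly from the original, and it assumes a ratio of 0 is acceptable when both essays are empty.

```python
import csv
from string import punctuation

FAITH_WORDS = {"church", "faith"}  # hypothetical name for the question's word list


def clean_words(text):
    """Lowercase the text, strip punctuation, and split on whitespace."""
    text = text.lower()
    for p in punctuation:
        text = text.replace(p, "")
    return text.split()


def annotate(in_path, out_path):
    # newline="" is what the csv module documentation recommends in Python 3.
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)  # created once, before the loop
        for row in reader:
            essay1 = clean_words(row[3])
            essay2 = clean_words(row[4])
            total = len(essay1) + len(essay2)
            faith = sum(word in FAITH_WORDS for word in essay1 + essay2)
            # Assumption: report 0 instead of failing when both essays are empty.
            ratio = faith / total if total else 0
            row += [len(essay1), len(essay2), total, faith, ratio]
            writer.writerow(row)  # last statement in the loop: one output row per input row
```

Calling annotate('dataset.csv', 'csvoutput.csv') would then write every input row, each followed by the five computed columns.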
answered 2013-10-23T17:54:56.250