I have some data in CSV format, and I need to multiply column 3 by column 6 and append the result to the end of each row. My data looks like this:

 TITLE,TITLE,T,T,T,T
 data,data,5,data,data,98.7,data
 data,data,2,data,data,97,data
 data,data,5,data,data,98,data
 data,data,4,data,data,8.7,data
 data,data,5,data,data,9.7,data
 data,data,12.5,data,data,198.7,data

I'm really new to coding, but my attempt is as follows:

import csv
import datetime
import copy
from collections import defaultdict

class_col = 2
data_col = 5

with open('minitest.csv', 'r') as f:
    data = [line.strip().split(',') for line in f]

for row in data:
    class_col*data_col

with open('minitest_edit.csv', 'w') as nf:
    nf.write('\n'.join(','.join(row) for row in data))

print "done"

I'm not getting any errors. Any suggestions? Thanks, OS

4 Answers

Use csv.reader and csv.writer:

import csv

# Python 2: the csv module expects files opened in binary mode ('rb'/'wb')
with open('minitest.csv', 'rb') as f:
    reader = csv.reader(f)
    data = [next(reader)]  # title row
    for row in reader:
        data.append(row + [float(row[2]) * float(row[5])])

with open('minitest.csv', 'wb') as nf:
    writer = csv.writer(nf)
    writer.writerows(data)

This produces:

TITLE,TITLE,T,T,T,T
data,data,5,data,data,98.7,data,493.5
data,data,2,data,data,97,data,194.0
data,data,5,data,data,98,data,490.0
data,data,4,data,data,8.7,data,34.8
data,data,5,data,data,9.7,data,48.5
data,data,12.5,data,data,198.7,data,2483.75
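
On Python 3, the same approach needs text-mode files opened with newline='' rather than 'rb'/'wb'. The following is a minimal sketch under that assumption. The function name append_product, its parameters, and the separate output path are illustrative and not part of the original answer:

```python
import csv


def append_product(in_path, out_path, col_a=2, col_b=5):
    """Copy in_path to out_path, appending row[col_a] * row[col_b] to each data row."""
    with open(in_path, newline='') as f:
        reader = csv.reader(f)
        rows = [next(reader)]  # header row is copied unchanged
        for row in reader:
            # Columns are zero-indexed: 2 and 5 are the 3rd and 6th columns.
            rows.append(row + [float(row[col_a]) * float(row[col_b])])

    with open(out_path, 'w', newline='') as nf:
        csv.writer(nf).writerows(rows)
```

Calling append_product('minitest.csv', 'minitest_edit.csv') would leave the input file untouched and write the extended rows to the second file.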
answered 2013-10-01T10:49:24.950

You want something like the following, which multiplies the two columns and appends the product to each original row:

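new_data = [data[0]]  # keep the header row as it is
for row in data[1:]:
    # str() so the ','.join(row) used when writing still works
    new_data.append(row + [str(float(row[class_col]) * float(row[data_col]))])
# then write new_data instead of data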
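
For context, the loop in the question evaluates class_col*data_col, which multiplies the two column indexes (2 * 5) and discards the result, so the rows never change. Below is a minimal sketch of the whole read, multiply, and join cycle in the question's own split/join style. The function name append_product_to_lines is hypothetical, and the first line is assumed to be a header:

```python
class_col = 2
data_col = 5


def append_product_to_lines(lines):
    """Return CSV lines with column 3 * column 6 appended to each data line."""
    data = [line.strip().split(',') for line in lines]
    new_data = [data[0]]  # header row, left unchanged
    for row in data[1:]:
        product = float(row[class_col]) * float(row[data_col])
        new_data.append(row + [str(product)])  # str() so ','.join works
    return [','.join(row) for row in new_data]


sample = [
    'TITLE,TITLE,T,T,T,T\n',
    'data,data,2,data,data,97,data\n',
]
print('\n'.join(append_product_to_lines(sample)))
# last line printed: data,data,2,data,data,97,data,194.0
```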
answered 2013-10-01T10:44:42.800

Here is a more memory-efficient approach that does not read the whole file into memory, which matters for larger data sets:

import csv
import tempfile
import shutil

input_file = 'minitest.csv'

with open(input_file, 'rb') as f, \
     tempfile.NamedTemporaryFile(delete=False) as out_f:

    # To avoid reading everything into memory, we have to write every
    # processed row to disk immediately; for that, we need a temporary file
    # because we can't read and write a single file at the same time:
    reader = csv.reader(f)
    writer = csv.writer(out_f)

    # header row
    writer.writerow(next(reader))

    # note that this uses a generator not a list, so that writerows will lazily
    # evaluate each row as it writes them to disk
    writer.writerows(row + [float(row[2]) * float(row[5])] for row in reader)

# once everything's done, overwrite the original file with the new contents.
shutil.move(out_f.name, input_file)

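PS: It would actually be simpler if you didn't write back to the same file. You can always move the result manually once processing is done, which keeps the processing code simpler.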
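
The simpler variant mentioned in the PS might look like the sketch below, assuming Python 3 and an output path that differs from the input path. The function name stream_product and its parameters are illustrative only:

```python
import csv


def stream_product(in_path, out_path, col_a=2, col_b=5):
    """Write in_path to out_path row by row, appending the product column.

    in_path and out_path must be different files.
    """
    with open(in_path, newline='') as f, \
         open(out_path, 'w', newline='') as out_f:
        reader = csv.reader(f)
        writer = csv.writer(out_f)

        writer.writerow(next(reader))  # header row, copied unchanged

        # Generator expression: rows are produced one at a time as
        # writerows consumes them, so the file is never fully in memory.
        writer.writerows(
            row + [float(row[col_a]) * float(row[col_b])] for row in reader
        )
```

With no temporary file involved, there is nothing to clean up if processing fails partway, and the original file stays intact until you decide to replace it.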

answered 2013-10-01T11:31:25.403

If you're lucky enough to be programming on Linux or Unix, you can use awk:

awk -F "\"*,\"*" '{print $0 "," $3*$6 }' test.csv > result.csv

You can call it from Python with the subprocess module: http://docs.python.org/2/library/subprocess.html
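
A rough sketch of that call, assuming Python 3 and an awk binary on the PATH. The helper names build_awk_command and run_awk are made up for illustration. Passing the arguments as a list avoids shell quoting, so the field separator appears without the backslash escapes used on the command line. Note that awk applies the program to the header line as well, where the non-numeric fields evaluate to 0:

```python
import shutil
import subprocess

AWK_PROGRAM = '{print $0 "," $3*$6 }'
FIELD_SEP = '"*,"*'  # same separator as -F "\"*,\"*" after shell unquoting


def build_awk_command(input_path):
    """Return the awk invocation as an argument list (no shell involved)."""
    return ['awk', '-F', FIELD_SEP, AWK_PROGRAM, input_path]


def run_awk(input_path, output_path):
    """Run awk on input_path and redirect its output to output_path."""
    if shutil.which('awk') is None:
        raise RuntimeError('awk not found on PATH')
    with open(output_path, 'w') as out:
        # check=True raises CalledProcessError if awk exits with an error.
        subprocess.run(build_awk_command(input_path), stdout=out, check=True)
```

run_awk('test.csv', 'result.csv') would then correspond to the one-liner above.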

answered 2013-10-03T11:11:32.947