python - Importing multiple files into a single .csv file in Python

Question

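I have 125 data files, each containing two columns and 21 rows of data, and I would like to import them into a single .csv file (as 125 pairs of columns and only 21 rows). This is what my data files look like:

[image: sample data file]

I am fairly new to Python, but I have come up with the following code: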

import glob
Results = glob.glob('./*.data')
fout='c:/Results/res.csv'
fout=open ("res.csv", 'w')
for file in Results:
    g = open( file, "r" )
    fout.write(g.read())
    g.close()
fout.close()

The problem with the above code is that all the data ends up in just two columns with 125*21 rows.

Any help is very much appreciated!

score 1 · Accepted Answer

This should work:

import glob

files = [open(f) for f in glob.glob('./*.data')]  # make a list of open files
fout = open("res.csv", 'w')

for row in range(21):
    # strip() removes the trailing newline; join() avoids a trailing comma
    fout.write(','.join(f.readline().strip() for f in files))
    fout.write('\n')

fout.close()
for f in files:
    f.close()  # close the input files as well

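Note that this approach may fail if you try it with a large number of files, because they are all open at the same time; I believe the default limit in Python is 256.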
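
One way to stay clear of any open-file limit is to read each file completely and close it before opening the next, then put the lines side by side in memory. The following is only a minimal sketch, assuming all files have the same number of lines; `merge_side_by_side` is a made-up helper name, not part of the answer above.

```python
def merge_side_by_side(paths, out_path):
    """Write line n of every input file next to each other on output row n."""
    columns = []
    for path in paths:
        # Only one input file is open at any time.
        with open(path) as f:
            columns.append([line.strip() for line in f])

    with open(out_path, 'w') as fout:
        # zip(*columns) groups the n-th line of every file together;
        # it stops at the shortest file.
        for parts in zip(*columns):
            fout.write(','.join(parts) + '\n')
```

It could be called as `merge_side_by_side(sorted(glob.glob('./*.data')), 'res.csv')` after `import glob`. With 125 files of 21 lines each, keeping everything in memory is not a concern.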

score 1 · Accepted Answer

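You might want to try the Python csv module (http://docs.python.org/library/csv.html), which provides very useful methods for reading and writing CSV files. Since you said you want only 21 rows with 250 columns of data, I would suggest creating 21 Python lists as your rows and then appending data to each row as you loop through your files.

Something like: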

import csv

# One empty list per output row.
rows = []
for i in range(0, 21):
    row = []
    rows.append(row)

# I'm not sure how your input files are structured or delimited, but for
# each one, as you have it open and iterate through its rows, you would
# append the values in each row to the end of the corresponding list
# contained within the rows list.

# Then, write each row to the new csv
# (Python 3: text mode with newline='' instead of 'wb'):
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=',')
    for row in rows:
        writer.writerow(row)
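
The appending step that the comments only describe might look roughly like the sketch below. It assumes the two columns in each file are separated by whitespace, which the question does not actually state, and `collect_rows` is a hypothetical helper name.

```python
def collect_rows(paths, n_rows=21):
    """Return n_rows lists; list i holds the values from line i of every file."""
    rows = [[] for _ in range(n_rows)]
    for path in paths:
        with open(path) as f:
            for i, line in enumerate(f):
                if i >= n_rows:
                    break  # ignore anything past the expected number of lines
                # split() with no argument splits on any run of whitespace
                # (assumed delimiter)
                rows[i].extend(line.split())
    return rows
```

The returned list can be handed to `csv.writer(...).writerows(rows)` in place of the empty `rows` built above.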

score 1 · Accepted Answer

(Sorry, I cannot add comments yet.)

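[Edited later: the following statement is wrong!!!] "The row-building loop by davesnitty can be replaced with rows = [[]] * 21." This is wrong because, although it creates a list of empty lists, every element of the outer list refers to one and the same empty list.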
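
The sharing is easy to see in a short experiment; a list comprehension is the usual way to get independent empty lists instead:

```python
shared = [[]] * 3                      # three references to one and the same list
shared[0].append('x')
print(shared)                          # [['x'], ['x'], ['x']]

independent = [[] for _ in range(3)]   # three separate lists
independent[0].append('x')
print(independent)                     # [['x'], [], []]
```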

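+1 from me for using the standard csv module. But files should always be closed, especially when you open that many of them. Also, there is a bug: the rows read from the files are never actually handled, even though the result is written out here. The solution itself is missing. Basically, each row read from a file should be appended to the sublist that corresponds to its line number. The line number should be obtained via enumerate(reader), where reader is csv.reader(fin, ...).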

[Added later] Try the following code, adjusting the paths for your purpose:

import csv
import glob
import os

datapath = './data'
resultpath = './result'
if not os.path.isdir(resultpath):
    os.makedirs(resultpath)

# Start with no rows; they are added as needed. The code does not check
# how many rows are in each file.
rows = []

# Read data from the files into the rows above.
# sorted() keeps the column order stable between runs.
for fname in sorted(glob.glob(os.path.join(datapath, '*.data'))):
    # Python 3: the csv module wants text mode with newline=''
    with open(fname, newline='') as f:
        reader = csv.reader(f)
        for n, row in enumerate(reader):
            if len(rows) < n+1:
                rows.append([])  # add another row
            rows[n].extend(row)  # append the elements from the file

# Write the data from memory to the result file.
fname = os.path.join(resultpath, 'result.csv')
with open(fname, 'w', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)
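
Note that csv.reader(f) uses the default comma delimiter. If the data files turn out to be separated by single spaces instead (an assumption; the question does not show the delimiter), the reader can be told so through its delimiter argument. A small sketch using an in-memory stand-in for one data file:

```python
import csv
import io

# Stand-in for one data file; the space-separated layout is an assumption.
sample = io.StringIO("0.1 10\n0.2 20\n")

reader = csv.reader(sample, delimiter=' ')
rows = [row for row in reader]
print(rows)   # [['0.1', '10'], ['0.2', '20']]
```

For tab-separated files, delimiter='\t' would be the corresponding setting.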
