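I am trying to merge a large number of CSV files. My initial function aims to:
- Look inside a directory and count the files in it (assuming all of them are .csv)
- Open the first CSV and append each row to a list
- Clip off the first three rows (they hold some useless column-header information I don't want)
- Store these results in a list I have called "archive"
- Open the next CSV file and repeat (clip the rows and append them to "archive")
- When we run out of CSV files, write the complete "archive" to a file in a separate folder.
For example, suppose I start with three CSV files that look like this.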
CSV 1
[]
[['Title'],['Date'],['etc']]
[]
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
CSV 2
[]
[['Title'],['Date'],['etc']]
[]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
CSV 3
[]
[['Title'],['Date'],['etc']]
[]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]
At the end I would come away with something like this...
CSV output
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]
So... I started by writing this:
import os
import csv

path = './Path/further/into/file/structure'
directory_list = os.listdir(path)
directory_list.sort()
archive = []

for file_name in directory_list:
    temp_storage = []
    path_to = path + '/' + file_name
    file_data = open(path_to, 'r')
    file_CSV = csv.reader(file_data)
    for row in file_CSV:
        temp_storage.append(row)
    for row in temp_storage[3:-1]:
        archive.append(row)

archive_file = open("./Path/elsewhere/in/file/structure/archive.csv", 'wb')
wr = csv.writer(archive_file)
for row in range(len(archive)):
    lastrow = row
    wr.writerow(archive[row])
    print row
This seems to work... except that when I check my output file, it appears to have stopped writing at a strange point near the end.
For example:
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/0
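This is really strange, and I can't work out what has gone wrong. It seems to be writing fine, but then decides to stop partway through a list entry. Tracing it back, I am sure this has something to do with my last "for loop", but I am not very familiar with the csv methods. I have read through the documentation, but I am still stumped.
Can anyone point out where I have gone wrong and how I can fix it, and perhaps whether there is a better way of going about all this?
Many thanks - Huw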
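
One plausible cause, offered as a sketch rather than a confirmed diagnosis: the script above never closes archive_file, so the last buffered chunk of output may not have been flushed to disk when the file is inspected. Opening the files in with blocks closes them automatically. The version below is written for Python 3 (the code above uses Python 2 syntax), and the function name merge_csv_files and its parameters are hypothetical names chosen for illustration. It also assumes every entry in the source directory is a CSV file, and it slices with [skip_rows:] on the assumption that the last row of each file should be kept, whereas the [3:-1] slice above drops it.

```python
import csv
import os


def merge_csv_files(source_dir, archive_path, skip_rows=3):
    """Concatenate every CSV in source_dir into archive_path.

    The first skip_rows rows of each file are dropped.
    Returns the number of rows written.
    """
    archive = []
    for file_name in sorted(os.listdir(source_dir)):
        path_to = os.path.join(source_dir, file_name)
        # newline='' is what the csv module documentation recommends
        # for files passed to csv.reader / csv.writer in Python 3.
        with open(path_to, 'r', newline='') as file_data:
            rows = list(csv.reader(file_data))
        # Assumption: keep the final row of each file ([3:-1] would drop it).
        archive.extend(rows[skip_rows:])

    # Leaving the with block closes the file, which flushes any
    # buffered rows to disk before the function returns.
    with open(archive_path, 'w', newline='') as archive_file:
        csv.writer(archive_file).writerows(archive)
    return len(archive)
```

A call such as merge_csv_files('./Path/further/into/file/structure', './Path/elsewhere/in/file/structure/archive.csv') would then mirror the paths used in the question. In the original Python 2 script, adding archive_file.close() after the final loop should have a similar effect.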