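I'm trying to merge a large number of CSV files. My initial function is meant to:

  • Look inside a directory and count the files in it (assuming all of them are .csv)
  • Open the first CSV and append each row to a list
  • Clip off the first three rows (they hold some useless column-header information I don't want)
  • Store the result in a list I've called "archive"
  • Open the next CSV file and repeat (clip the rows and append them to "archive")
  • When we run out of CSV files, write the complete "archive" to a file in a separate folder.

For example, suppose I start with three CSV files that look like this.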

CSV 1

[]
[['Title'],['Date'],['etc']]
[]
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]

CSV 2

[]
[['Title'],['Date'],['etc']]
[]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]

CSV 3

[]
[['Title'],['Date'],['etc']]
[]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]

At the end I'd hope to get something like this:

CSV output

[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]

So... I started writing this:

import os
import csv

path = './Path/further/into/file/structure'
directory_list = os.listdir(path)
directory_list.sort()

archive = []

for file_name in directory_list:
    temp_storage = []
    path_to = path + '/' + file_name
    file_data = open(path_to, 'r')
    file_CSV = csv.reader(file_data)
    for row in file_CSV:
        temp_storage.append(row)
    for row in temp_storage[3:-1]:
        archive.append(row)

archive_file = open("./Path/elsewhere/in/file/structure/archive.csv", 'wb')
wr = csv.writer(archive_file)
for row in range(len(archive)):
    lastrow = row
    wr.writerow(archive[row])
print row

This seems to work... except that when I check my output file, it appears to have stopped writing at a strange point near the end.

For example:

[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/0

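This is really odd and I can't work out what has gone wrong. It seems to write fine, then stops partway through a list entry. Tracing it back, I'm fairly sure it has something to do with the last "for loop" I wrote, but I'm not very familiar with the csv methods. I've read through the documentation and I'm still stumped.

Can anyone point out where I've gone wrong and how I might fix it? Perhaps there's a better way of going about all this!

Many thanks - Huw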

1 Answer

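Close the file handle before the script ends. Closing the file handle also flushes any strings that are still waiting to be written. If you never flush and the script ends, some of the output may never be written.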
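
To see the buffering for yourself, here is a minimal sketch, assuming Python 3 (the question uses Python 2, where the same idea applies). The helper name buffered_write_demo and the temporary file are made up for illustration. It writes a few rows, measures the file on disk before and after close(), and returns both sizes.

```python
import csv
import os
import tempfile


def buffered_write_demo(rows):
    """Return (size on disk before close, size on disk after close)."""
    # Hypothetical scratch file, only used for this demonstration.
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)

    archive_file = open(path, "w", newline="")
    try:
        csv.writer(archive_file).writerows(rows)
        # Rows may still be sitting in Python's write buffer at this point.
        size_before_close = os.path.getsize(path)
    finally:
        # close() flushes whatever is left in the buffer to disk.
        archive_file.close()

    size_after_close = os.path.getsize(path)
    os.remove(path)
    return size_before_close, size_after_close


if __name__ == "__main__":
    print(buffered_write_demo([["Spam", "01/01/2013", "Spam is the spammiest spam"]]))
```

For a handful of short rows the first number is typically 0, because nothing has been flushed yet, while the second is the full size of the data. A file cut off mid-row, as in the question, is what this looks like when the handle is never closed.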

The with open(...) as f syntax is useful here because it closes the file for you as soon as Python leaves the with-suite. If you use with, you can never forget to close a file again.

with open("./Path/elsewhere/in/file/structure/archive.csv", 'wb') as archive_file:
    wr = csv.writer(archive_file)
    for row in archive:
        wr.writerow(row)
    print row
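
The same idea can be applied to the whole merge. What follows is only a sketch, assuming Python 3 (where csv files are opened in text mode with newline="" instead of 'wb') and assuming the archive lives in a different folder from the source files, as in the question. The function name merge_csv_files and the skip_rows parameter are hypothetical. It also uses rows[skip_rows:] rather than the question's [3:-1], on the assumption that the last row of each file should be kept, as the expected output suggests.

```python
import csv
import os


def merge_csv_files(src_dir, dest_path, skip_rows=3):
    """Append the data rows of every .csv file in src_dir to dest_path.

    Returns the number of rows written.
    """
    written = 0
    # newline="" is what the Python 3 csv docs recommend for csv files.
    with open(dest_path, "w", newline="") as archive_file:
        writer = csv.writer(archive_file)
        for file_name in sorted(os.listdir(src_dir)):
            if not file_name.lower().endswith(".csv"):
                continue  # ignore anything that is not a CSV file
            source_path = os.path.join(src_dir, file_name)
            with open(source_path, newline="") as source:
                rows = list(csv.reader(source))
            # Drop the header rows; a slice of [3:-1] would also drop
            # the last data row of each file.
            for row in rows[skip_rows:]:
                writer.writerow(row)
                written += 1
    # Both the source files and the archive are closed (and flushed) here.
    return written
```

Because every file is opened in a with block, there is no handle left open when the script ends, so the tail of the archive cannot get stuck in a buffer.
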
answered 2013-04-02T19:12:19.677