python - Writing with Python's built-in .csv module

Question

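[Please note that this is a different question from the already-answered "How to replace a column using Python's built-in .csv writer module?"]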

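I need to do a find and replace (specific to one column of URLs) in a huge Excel .csv file. Since I'm at the beginning stages of teaching myself a scripting language, I figured I'd try to implement the solution in Python.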

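I'm having trouble writing back to the .csv file after changing the content of an entry. I've read the official csv module documentation on how to use the writer, but it has no example that covers this case. Specifically, I'm trying to do the read, replace, and write operations in a single loop. However, it seems I can't use the same "row" reference both as the for loop variable and as the argument to writer.writerow(). So, once I've made the change inside the for loop, how should I write it back to the file?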

Edit: I implemented the suggestions from S. Lott and Jimmy, with the same result.

Edit #2: Following S. Lott's suggestion, I added "rb" and "wb" to the open() calls.

import csv

#filename = 'C:/Documents and Settings/username/My Documents/PALTemplateData.xls'

csvfile = open("PALTemplateData.csv","rb")
csvout = open("PALTemplateDataOUT.csv","wb")
reader = csv.reader(csvfile)
writer = csv.writer(csvout)

changed = 0;

for row in reader:
    row[-1] = row[-1].replace('/?', '?')
    writer.writerow(row)                  #this is the line that's causing issues
    changed=changed+1

print('Total URLs changed:', changed)

Edit: For reference, this is the new full traceback from the interpreter:

Traceback (most recent call last):
  File "C:\Documents and Settings\g41092\My Documents\palScript.py", line 13, in <module>
    for row in reader:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

score 10 · Accepted Answer

You cannot read from and write to the same file.

source = open("PALTemplateData.csv","rb")
reader = csv.reader(source , dialect)

target = open("AnotherFile.csv","wb")
writer = csv.writer(target , dialect)

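The normal approach for all file manipulation is to create a modified copy of the original file. Don't try to update files in place. It's just a bad plan.

Edit

In the lines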

source = open("PALTemplateData.csv","rb")

target = open("AnotherFile.csv","wb")

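"rb" and "wb" are absolutely required in Python 2.x. Every time you leave them out, the file is opened in the wrong mode.

You must use "rb" to read a .CSV file. In Python 2.x there is no choice. In Python 3.x you can leave the mode off, but use an explicit "r" to make it clear.

You must use "wb" to write a .CSV file. In Python 2.x there is no choice. In Python 3.x you must use "w".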

Edit

It appears you are using Python 3. You need to drop the "b" from "rb" and "wb".

Read this: http://docs.python.org/3.0/library/functions.html#open
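For Python 3, the advice above can be sketched roughly as follows. This is a minimal sketch, not the asker's final script: the function name fix_urls and its parameters are hypothetical, and it assumes the URL sits in the last column, as in the question. It also passes newline="" to open(), which is what the Python 3 csv documentation recommends so the module can handle line endings itself.

```python
import csv


def fix_urls(src_path, dst_path):
    """Copy src_path to dst_path, replacing '/?' with '?' in the last column.

    Hypothetical helper for illustration. Returns the number of rows whose
    last column actually changed.
    """
    changed = 0
    # Text mode ("r"/"w"), not binary, is what csv expects in Python 3.
    with open(src_path, "r", newline="") as source, \
         open(dst_path, "w", newline="") as target:
        reader = csv.reader(source)
        writer = csv.writer(target)
        for row in reader:
            if row:  # blank lines come back as empty lists
                fixed = row[-1].replace("/?", "?")
                if fixed != row[-1]:
                    changed += 1
                row[-1] = fixed
            writer.writerow(row)
    return changed
```

Calling it as fix_urls("PALTemplateData.csv", "PALTemplateDataOUT.csv") would mirror the file names in the question. Unlike the original loop, the counter here only counts rows that were really modified.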

score 4 · Accepted Answer

Opening csv files as binary is just wrong. CSV files are normal text files, so you need to open them with

source = open("PALTemplateData.csv","r")
target = open("AnotherFile.csv","w")

The error

_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

comes up because you opened them in binary mode.
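To see the difference without touching any files, here is a small sketch using in-memory streams. The sample data and variable names are made up for illustration; io.BytesIO stands in for a file opened with "rb" and io.StringIO for one opened with "r".

```python
import csv
import io

# Made-up sample data, standing in for the real CSV file.
data = "name,url\r\nexample,http://example.com/?id=1\r\n"

# Binary stream: iterating yields bytes, which csv.reader rejects in Python 3.
try:
    list(csv.reader(io.BytesIO(data.encode("utf-8"))))
    binary_error = None
except csv.Error as exc:
    binary_error = str(exc)

# Text stream: iterating yields str, which is what csv.reader expects.
rows = list(csv.reader(io.StringIO(data, newline="")))

print(binary_error)
print(rows)
```

The exact wording of the error message in parentheses may differ between Python versions, but the "iterator should return strings, not bytes" part is the same one shown in the traceback.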

When I was opening excel csv files with python, I used something like this:

import csv

try:    # fall back to an empty list if the file cannot be opened
    # filepath is the path to your csv file
    f = csv.reader(open(filepath, "r", encoding="cp1250"), delimiter=";", quotechar='"')
except IOError:
    f = []

for record in f:
    pass  # do something with record

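It works rather fast (I was opening two csv files of about 10MB each, though I did this with python 2.6, not the 3.0 version).

There are few working modules for handling excel csv files in python; pyExcelerator is one of them.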

score 2 · Accepted Answer

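The problem is that you're trying to write to the same file you're reading from. Write to a different file, then delete the original and rename the new file into its place.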
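One way this could look in Python 3 is sketched below. The helper name rewrite_in_place and its transform parameter are hypothetical. Note that it uses os.replace, which overwrites the original in a single step, instead of a separate delete followed by a rename; that is a substitution on my part, not something the answer spells out.

```python
import csv
import os
import tempfile


def rewrite_in_place(path, transform):
    """Rewrite the CSV at `path`, applying `transform` to each row.

    Hypothetical helper: rows are written to a temporary file in the same
    directory, which then replaces the original, so the original is never
    opened for writing.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    try:
        with os.fdopen(fd, "w", newline="") as target, \
             open(path, "r", newline="") as source:
            writer = csv.writer(target)
            for row in csv.reader(source):
                writer.writerow(transform(row))
        os.replace(tmp_path, path)  # swap the new file in for the old one
    except BaseException:
        os.remove(tmp_path)  # clean up the temporary file on any failure
        raise
```

The transform argument is any function that takes a row (a list of strings) and returns the row to write, for example one that applies row[-1].replace('/?', '?') as in the question.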
