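I am trying to load a .csv file stored as UTF-8 text and write it back out, pipe-delimited, in cp1252 (ANSI) encoding. The code below works in Python 3.6, but I need it to work in Python 2.6. However, the built-in 'open' function in Python 2.6 does not accept an encoding keyword.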

import csv

# Define which files to read
filenames = ["FILE1", "FILE2"]
infilenames = [filename + ".csv" for filename in filenames]
outfilenames = [filename + "_out_.csv" for filename in filenames]

# Read the files as UTF-8 and write them as cp1252
for infilename, outfilename in zip(infilenames, outfilenames):
    # newline="" lets the csv module handle line endings itself
    infile = open(infilename, "rt", encoding="utf8", newline="")
    reader = csv.reader(infile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)

    outfile = open(outfilename, "wt", encoding="cp1252", newline="")
    writer = csv.writer(outfile, delimiter='|', quotechar='"', quoting=csv.QUOTE_NONE, escapechar='\\')
    for row in reader:
        writer.writerow(row)

    # Close both files before moving on to the next pair
    infile.close()
    outfile.close()

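I have tried a couple of solutions:

  • Not specifying an encoding. This fails on some Unicode characters.
  • Using the io library (io.open instead of open). This fails with "TypeError: can't write str to text stream".

Does anyone know the correct solution for Python 2.x?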

1 Answer

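There may be some redundant code here, but I got it working by doing the following:

  • First, I used .decode and .encode to convert the text to cp1252.
  • Then I read the CSV from the cp1252-encoded file and wrote it to a new CSV.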
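
The decode/encode step can be illustrated on a few bytes. This is only a minimal sketch, not the code from this answer: the helper name utf8_to_cp1252 is hypothetical, and it assumes the input is valid UTF-8. It shows how the error handler passed to .encode decides what happens to characters that cp1252 cannot represent.

```python
def utf8_to_cp1252(data, errors="strict"):
    """Re-encode a UTF-8 byte string as cp1252 (hypothetical helper).

    The arguments to decode/encode are passed positionally because
    Python 2.6 does not accept them as keyword arguments.
    """
    text = data.decode("utf-8")           # bytes -> unicode text
    return text.encode("cp1252", errors)  # unicode text -> cp1252 bytes


# "cafe" with an e-acute: the accented letter exists in cp1252.
cafe = utf8_to_cp1252(b"caf\xc3\xa9")

# U+2192 (rightwards arrow) has no cp1252 equivalent.
arrow = b"a\xe2\x86\x92b"
dropped = utf8_to_cp1252(arrow, "ignore")   # the character is silently removed
marked = utf8_to_cp1252(arrow, "replace")   # the character becomes "?"
```

With the default "strict" handler the arrow example raises UnicodeEncodeError instead, which may be preferable when silent data loss is not acceptable.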

...

import csv

# Define which files to read
filenames = ["FILE1", "FILE2"]

infilenames = [filename + ".csv" for filename in filenames]
outfilenames = [filename + "_out_.csv" for filename in filenames]
midfilenames = [filename + "_mid_.csv" for filename in filenames]

# Iterate over each file
for infilename, outfilename, midfilename in zip(infilenames, outfilenames, midfilenames):

    # Read the UTF-8 bytes, then re-encode them as cp1252.
    # "ignore" silently drops characters that cp1252 cannot represent.
    infile = open(infilename, "rb")
    infilet = infile.read()
    infile.close()
    infilet = infilet.decode("utf-8")
    infilet = infilet.encode("cp1252", "ignore")

    # Write the cp1252-encoded intermediate file
    midfile = open(midfilename, "wb")
    midfile.write(infilet)
    midfile.close()

    # Read the CSV back from the cp1252-encoded file
    # (the Python 2 csv module expects files opened in binary mode)
    midfile = open(midfilename, "rb")
    reader = csv.reader(midfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)

    # Define the pipe-delimited output
    outfile = open(outfilename, "wb")
    writer = csv.writer(outfile, delimiter='|', quotechar='"', quoting=csv.QUOTE_NONE, escapechar='\\')

    # Write the rows to the new CSV file
    for row in reader:
        writer.writerow(row)

    # Concatenate so Python 2 prints a string rather than a tuple
    print("written file " + outfilename)
    midfile.close()
    outfile.close()
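
The intermediate file can probably be avoided. In Python 2 the csv module reads and writes byte strings, and UTF-8 leaves the ASCII delimiter and quote bytes untouched, so each cell can be parsed as UTF-8 bytes and re-encoded just before it is written. The following is a sketch under that assumption rather than a drop-in replacement: convert_file, open_input, open_output and the errors parameter are hypothetical names, and the Python 3 branch is there only so that the same file runs on both versions.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def open_input(path):
    # Python 2: csv wants byte strings, so read in binary mode.
    if PY2:
        return open(path, "rb")
    # Python 3: csv wants text, so let the file object decode UTF-8.
    return io.open(path, "r", encoding="utf-8", newline="")


def open_output(path, errors):
    if PY2:
        return open(path, "wb")
    return io.open(path, "w", encoding="cp1252", errors=errors, newline="")


def convert_file(in_path, out_path, errors="replace"):
    """Copy a comma-separated UTF-8 file to a pipe-separated cp1252 file."""
    with open_input(in_path) as infile:
        with open_output(out_path, errors) as outfile:
            reader = csv.reader(infile, delimiter=",", quotechar='"')
            writer = csv.writer(outfile, delimiter="|",
                                quoting=csv.QUOTE_NONE, escapechar="\\")
            for row in reader:
                if PY2:
                    # Cells are UTF-8 byte strings; re-encode each one.
                    row = [cell.decode("utf-8").encode("cp1252", errors)
                           for cell in row]
                writer.writerow(row)
```

A call such as convert_file("FILE1.csv", "FILE1_out_.csv") would then replace the body of the loop above. The default of "replace" is a choice made for this sketch so that unencodable characters show up as "?" instead of disappearing; passing "ignore" would match the behaviour of the code above.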
answered 2017-08-10T11:39:05.503