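I'm trying to parse a CSV file that contains both English and Hindi characters, and I'm using UTF-16. It works fine, but as soon as it hits a Hindi character, it fails. I'm at a loss here.
Here is the code: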
import csv
import codecs
csvReader = csv.reader(codecs.open('/home/kuberkaul/Downloads/csv.csv', 'rb', 'utf-16'))
for row in csvReader:
    print row
The error I get is:
> Traceback (most recent call last):
>   File "csvreader.py", line 8, in <module>
>     for row in csvReader:
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-18: ordinal not in range(128)
> kuberkaul@ubuntu:~/Desktop$
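How can I fix this?
Edit 1:
I tried the suggested solution and used the Unicode CSV reader, and now it gives this error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
The code is: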
import csv
import codecs, io

def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    # csv.py doesn't do Unicode; encode temporarily as UTF-8:
    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]

def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')

filename = '/home/kuberkaul/Downloads/csv.csv'
reader = unicode_csv_reader(codecs.open(filename))
print reader
for rows in reader:
    print rows
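
For reference, a byte 0xff at position 0 is consistent with a UTF-16 little-endian byte order mark (FF FE), which would suggest that the `codecs.open(filename)` call in the edited code read the file without the 'utf-16' codec. Below is a minimal sketch of that observation and of a decoded read. It assumes Python 3 (where the csv module accepts Unicode text directly, unlike the Python 2 code above); the temporary file, the sample rows, and the helper names are hypothetical and only there to make the example self-contained.

```python
import codecs
import csv
import io
import os
import tempfile

# Hypothetical sample rows mixing English and Hindi text (not from the real file).
SAMPLE_ROWS = [["name", "greeting"], ["Kuber", "नमस्ते"]]


def write_utf16_csv(path, rows):
    # Writing with the 'utf-16' codec puts a byte order mark (BOM) at the start.
    with io.open(path, "w", encoding="utf-16", newline="") as handle:
        csv.writer(handle).writerows(rows)


def starts_with_utf16_bom(path):
    # Inspect the raw bytes: a UTF-16 file normally begins with FF FE or FE FF.
    with io.open(path, "rb") as handle:
        head = handle.read(2)
    return head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def read_utf16_csv(path):
    # Decode while reading, so csv.reader only ever sees text, never raw bytes.
    with io.open(path, "r", encoding="utf-16", newline="") as handle:
        return list(csv.reader(handle))


# Use a throwaway file so the sketch does not depend on any real path.
fd, demo_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    write_utf16_csv(demo_path, SAMPLE_ROWS)
    has_bom = starts_with_utf16_bom(demo_path)
    rows_read = read_utf16_csv(demo_path)
finally:
    os.remove(demo_path)

print(has_bom)
print(rows_read)
```

Whether the real file is actually UTF-16 is itself an assumption here; checking its first two bytes as in `starts_with_utf16_bom` is one way to confirm that before choosing a codec.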