python - Writing unicode strings to Excel 2007

Question

I am using pyodbc to query a database, and I am trying to write the results to an .xlsx file using openpyxl.

Here is my code (Python 2.7):

import pyodbc
from openpyxl import Workbook

cnxn = pyodbc.connect(host = 'xxx',database='yyy',user='zzz',password='ppp')
cursor = cnxn.cursor()

sql = "SELECT TOP 10   [customer clientcode] AS Customer, \
                [customer dchl] AS DChl, \
                [customer name] AS Name, \
                ...
                [name3] AS [name 3] \
        FROM   mydb \
        WHERE [customer dchl] = '03' \
        ORDER BY [customer id] ASC"

#load data
cursor.execute(sql)

#get column names from the cursor description
columns = [column[0] for column in cursor.description]    

#using optimized_write cause it will be about 120k rows of data
wb = Workbook(optimized_write = True, encoding='utf-8')

ws = wb.create_sheet()
ws.title = '03'

#append column names to header
ws.append(columns)

#append rows to the worksheet
for row in cursor:
    ws.append(row)

wb.save(filename = 'test.xlsx')

cnxn.close()

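This works, at least until I hit a customer with a name like "mún". My code does not fail, everything gets written to Excel, and all seems fine. That is, until I actually open the Excel file: Excel reports that the file is corrupted and needs to be repaired. After repairing the file, all the data is gone.

I know the code works for customers with regular (ASCII-only) names; as soon as a name contains an accented character or anything similar, the Excel file is corrupted.

I tried printing a single row (one with a difficult customer name). This is the result:

row is a tuple, and this is one of its elements: 'Mee\xf9s Tilburg'. So either writing the \xf9 character (ù in ISO-8859-1) goes wrong, or MS Excel cannot handle it. I tried various ways of converting the row to unicode (unicode(row,'utf-8'), u''.join(row), and so on), but nothing worked. Either I try something silly that raises an error, or the Excel file is still broken.

Any ideas?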

score 5 · Accepted Answer

In the end I found two solutions:

The first is to convert each row returned by the cursor into a list and decode the elements that need it:

for row in cursor:
    l = list(row)
    l[5] = l[5].decode('ISO-8859-1')
    # do this for all necessary columns
    ws.append(l)
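A more general version of the loop above, as a minimal sketch: instead of decoding fixed column indexes, it decodes every byte string in the row. The helper name decode_row is hypothetical, and ISO-8859-1 is assumed only because that is the encoding used in this answer. The first part shows why the value from the question is not valid UTF-8 to begin with.

```python
# -*- coding: utf-8 -*-
# Sketch only: decode_row is a hypothetical helper, not part of pyodbc or openpyxl.
# Assumption: the driver returns text columns as ISO-8859-1 byte strings.

raw_name = b'Mee\xf9s Tilburg'  # the value printed in the question

# The byte 0xF9 cannot start a UTF-8 sequence, so UTF-8 decoding fails.
try:
    raw_name.decode('utf-8')
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False


def decode_row(row, encoding='ISO-8859-1'):
    """Return the row as a list with every byte string decoded to text."""
    return [value.decode(encoding) if isinstance(value, bytes) else value
            for value in row]


# Numbers and None pass through untouched; only byte strings are decoded.
decoded = decode_row((42, raw_name, None))
```

In the loop from the answer this would replace the per-column lines: ws.append(decode_row(row)).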

I figured this would be hell, because 6 columns needed converting to unicode and there were 120k rows, but it actually all ran quite fast! In the end it became apparent that I could (and should) just cast the data to unicode in the SQL statement (cast(x as nvarchar) AS y), which made the replacements unnecessary. That is the second solution. I did not think of this at first because I assumed the database was already supplying the data as unicode. My bad.
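The cast approach could look roughly like the sketch below, which only builds the query string and does not touch the database. The helper nvarchar_column and the length of 255 are assumptions for illustration; the column names are taken from the question. It also assumes the database is SQL Server, which TOP 10 and nvarchar suggest. There, a CAST to NVARCHAR without a length defaults to 30 characters, so the sketch passes an explicit length to avoid silently truncating longer names.

```python
# Sketch only: nvarchar_column is a hypothetical helper and 255 is an assumed length.

def nvarchar_column(column, alias, length=255):
    """Build one SELECT-list entry that casts a column to NVARCHAR."""
    return "CAST([{0}] AS NVARCHAR({1})) AS [{2}]".format(column, length, alias)


# (source column, output alias) pairs taken from the query in the question
text_columns = [
    ('customer clientcode', 'Customer'),
    ('customer dchl', 'DChl'),
    ('customer name', 'Name'),
]

select_list = ",\n       ".join(
    nvarchar_column(column, alias) for column, alias in text_columns)

sql = ("SELECT TOP 10 {0}\n"
       "FROM   mydb\n"
       "WHERE  [customer dchl] = '03'\n"
       "ORDER BY [customer id] ASC").format(select_list)

print(sql)
```

The resulting string can be passed to cursor.execute(sql) exactly as in the question, and the rows can then be appended without any per-column decoding.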

score -1 · Answer

You can use encode() to convert unicode to a byte string (str in Python 2):

l=[u'asd',u'qw',u'fdf',u'sad',u'sadasd']
l[4]=l[4].encode('utf8')
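For context, a small sketch of the two directions, using the name from the question: encode() goes from text to bytes, decode() goes from bytes back to text. The data in the question arrives as byte strings, so the direction it needs is decode(), with the encoding the bytes were actually written in (ISO-8859-1 is assumed here, as in the accepted answer).

```python
# -*- coding: utf-8 -*-
# Sketch only: shows encode() and decode() on the same name in two encodings.

text = u'Mee\xf9s Tilburg'                  # text (unicode in Python 2)

utf8_bytes = text.encode('utf8')            # text -> bytes, 2 bytes for the accented letter
latin1_bytes = text.encode('ISO-8859-1')    # text -> bytes, 1 byte for the accented letter

# decode() reverses encode() when the same encoding is used
from_utf8 = utf8_bytes.decode('utf8')
from_latin1 = latin1_bytes.decode('ISO-8859-1')

# The same text gives different byte sequences depending on the encoding,
# which is why decoding with the wrong one fails or produces garbage.
same_bytes = (utf8_bytes == latin1_bytes)
```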
