python - Reading numeric Excel data as text using xlrd in Python

Question

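I am trying to read an Excel file using xlrd, and I am wondering whether there is a way to ignore the cell formatting used in the Excel file and just import all of the data as text.

Here is the code I am using so far: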

import xlrd

xls_file = 'xltest.xls'
xls_workbook = xlrd.open_workbook(xls_file)
xls_sheet = xls_workbook.sheet_by_index(0)

# Pre-allocate a 2D list with one string slot per cell
raw_data = [['']*xls_sheet.ncols for _ in range(xls_sheet.nrows)]
raw_str = ''
field_delim = ','
text_delim = '"'

# Copy every cell value into raw_data as a string
for rnum in range(xls_sheet.nrows):
    for cnum in range(xls_sheet.ncols):
        raw_data[rnum][cnum] = str(xls_sheet.cell(rnum,cnum).value)

# Build the CSV text: quote every field, end each row with a newline
for rnum in range(len(raw_data)):
    for cnum in range(len(raw_data[rnum])):
        if (cnum == len(raw_data[rnum]) - 1):
            field_delim = '\n'
        else:
            field_delim = ','
        raw_str += text_delim + raw_data[rnum][cnum] + text_delim + field_delim

with open('FINAL.csv', 'w') as final_csv:
    final_csv.write(raw_str)

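This code works, but some fields, such as zip codes, are imported as numbers, so they end up with a decimal-zero suffix. For example, if the Excel file contains the zip code "79854", it is imported as "79854.0".

I have tried to find a solution in the xlrd specification, but without success.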

score 24 · Accepted Answer

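This happens because integer values in Excel are imported as floats in Python, so sheet.cell(r,c).value returns a float. Try converting the value to an integer, but first make sure the value really is a whole number in Excel: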

cell = sheet.cell(r,c)
cell_value = cell.value
# ctype 2 is a number cell and 3 is a date cell; convert only whole values
if cell.ctype in (2,3) and int(cell_value) == cell_value:
    cell_value = int(cell_value)

All of this is covered in the xlrd specification.
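As a rough sketch of how the check above could be folded into the text conversion, the helper below takes a cell's type code and value and returns a string. The name cell_to_text and the two constants are illustrative only (they mirror the numeric codes 2 and 3 used above rather than importing anything from xlrd), and the function assumes the value is a finite float whenever the type is a number or a date.

```python
# Type codes matching the (2, 3) check above: number cells and date cells.
XL_CELL_NUMBER = 2
XL_CELL_DATE = 3


def cell_to_text(ctype, value):
    """Return a cell value as text, without a trailing '.0' on whole numbers.

    Hypothetical helper: ctype and value are expected to come from
    cell.ctype and cell.value of an xlrd cell.
    """
    if ctype in (XL_CELL_NUMBER, XL_CELL_DATE) and value == int(value):
        # Whole number stored as a float, e.g. 79854.0
        return str(int(value))
    # Text, fractional numbers, and everything else: plain string conversion
    return str(value)
```

In the question's loop, this would be called as cell_to_text(cell.ctype, cell.value) in place of str(cell.value). Note that leading zeros cannot be recovered this way if Excel stored the zip code as a number.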

score 4 · Accepted Answer

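I know this isn't part of the question, but I would get rid of raw_str and write directly to your csv. For large files (10,000 rows) this will save a lot of time.

You can also get rid of raw_data and use just one for loop.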
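One way this suggestion might look is sketched below, using the standard csv module so that quoting is handled by the writer. The function name write_sheet_as_csv is made up for illustration, and the sketch assumes only that the sheet object exposes nrows and row(rnum) the way an xlrd sheet does.

```python
import csv


def write_sheet_as_csv(sheet, out_path):
    """Stream a sheet to a CSV file row by row, with no intermediate buffers.

    Hypothetical helper: `sheet` is assumed to behave like an xlrd sheet,
    i.e. it has an `nrows` attribute and a `row(rnum)` method returning
    cells that carry a `value` attribute.
    """
    # newline="" lets the csv module control line endings itself
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)  # quote every field
        for rnum in range(sheet.nrows):
            # One loop over rows; each row is written as soon as it is read
            writer.writerow([str(cell.value) for cell in sheet.row(rnum)])
```

With the question's variables this would be called as write_sheet_as_csv(xls_sheet, 'FINAL.csv'). It still uses str(cell.value), so the ".0" suffix from the question remains unless the integer check from the other answer is applied to each cell as well.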
