I have a table like the one shown in the image, which contains merged cells. How can I read the Excel sheet in Python and store it in a dictionary?
table_dict = {S1: [a, b, c, d],
              S2: [[a1, a2, a3], [b1, b2, b3], [d1, d2, d3]],
              S3: [[a4, a5, a6], [b4, b5, b6], [c4, c5, c6], [d4, d5, d6]]}
I'm not sure about Excel documents, but you can use the csv module to read a file in CSV format. From the documentation:
>>> import csv
>>> with open('eggs.csv', newline='') as csvfile:
...     spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
...     for row in spamreader:
...         print(', '.join(row))
Spam, Spam, Spam, Spam, Spam, Baked Beans
Spam, Lovely Spam, Wonderful Spam
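Testing with a file exported from LibreOffice (I don't have Excel on this machine), merged cells are split and padded with blank cells, as if they had never been merged in the first place. So you end up with something like: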
[['S1', 'S2', '', '', 'S3', '', ''],
['a', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6'],
['b', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6'],
... etc]
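To make the blank-cell layout concrete, here is a minimal sketch that parses an in-memory CSV string and fills each blank header cell with the nearest name to its left. The sample data and the fill_header helper are hypothetical, written only to mirror the layout above; they are not taken from the original spreadsheet.

```python
import csv
import io

# Hypothetical sample mirroring the exported layout: header cells that
# were merged come back as empty strings.
SAMPLE_CSV = (
    "S1,S2,,,S3,,\n"
    "a,a1,a2,a3,a4,a5,a6\n"
    "b,b1,b2,b3,b4,b5,b6\n"
)


def fill_header(header):
    """Replace each blank header cell with the last non-blank name before it."""
    filled = []
    last = None
    for name in header:
        if name != '':
            last = name
        filled.append(last)
    return filled


rows = list(csv.reader(io.StringIO(SAMPLE_CSV)))
print(rows[0])               # header as read, blanks included
print(fill_header(rows[0]))  # header with blanks filled from the left
```

A blank cell at the very start of the header has no name to inherit, so it stays None in this sketch and would need separate handling.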
Then you just need a script to convert it into the format you want.
import csv
from collections import defaultdict

with open('file.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    # rotate the data so we have a list of columns, not a list of rows
    # note this is not very robust: zip() truncates to the shortest row
    data = list(zip(*reader))

results = defaultdict(list)
last = None
for col in data:
    # pull the column name off the front
    name = col[0]
    cells = col[1:]
    # use the previous column name if blank
    if name == '':
        name = last
    # check for missing column name at start
    if name is None:
        print('invalid data:', col)
        continue
    results[name].append(cells)
    last = name

print(results)

This yields:

defaultdict(<class 'list'>, {
 'S1': [('a', 'b', 'c', 'd')],
 'S2': [('a1', 'b1', 'c1', 'd1'), ('a2', 'b2', 'c2', 'd2'), ('a3', 'b3', 'c3', 'd3')],
 'S3': [('a4', 'b4', 'c4', 'd4'), ('a5', 'b5', 'c5', 'd5'), ('a6', 'b6', 'c6', 'd6')]})
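
The result above is grouped by column, whereas the question asks for a flat list under S1 and one list per row under S2 and S3. A minimal sketch of that last step follows, assuming a group with a single column should be flattened and a group with several columns should be transposed back into rows. The to_table_dict name is hypothetical, and the results literal simply repeats the output shown above.

```python
def to_table_dict(results):
    """Convert {name: [column_tuple, ...]} into the row-oriented shape from the question."""
    table_dict = {}
    for name, columns in results.items():
        if len(columns) == 1:
            # Single column under this header: keep a flat list of its cells
            table_dict[name] = list(columns[0])
        else:
            # Several columns under one merged header: transpose to one list per row
            table_dict[name] = [list(row) for row in zip(*columns)]
    return table_dict


# Same data as the script's output, written out so the sketch runs on its own
results = {
    'S1': [('a', 'b', 'c', 'd')],
    'S2': [('a1', 'b1', 'c1', 'd1'), ('a2', 'b2', 'c2', 'd2'), ('a3', 'b3', 'c3', 'd3')],
    'S3': [('a4', 'b4', 'c4', 'd4'), ('a5', 'b5', 'c5', 'd5'), ('a6', 'b6', 'c6', 'd6')],
}

table_dict = to_table_dict(results)
print(table_dict['S1'])
print(table_dict['S2'])
```

This keeps every row in every group, so rows that are empty in the spreadsheet would appear as lists of empty strings and would have to be filtered out separately if they are not wanted.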