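OK mate, I think the answer is this silly wrapper I wrote for xlrd (or one you write yourself!). The key is that the function reads one row at a time into a list, and Python lists remember the order in which they were filled. The wrapper produces a dictionary that maps each Excel sheet name to the list of rows on that sheet (I'm assuming one table per sheet here; otherwise you'll have to generalize things). Each row is a dictionary whose keys are the column names.
For your case, I'd read in your data and then do something like this (untested):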
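import see_below as sb  # "see_below" stands for the reader module further down

workbook = sb.WorkbookToDict(your_file)  # don't shadow the built-in dict
output = []
this_location = None
for row in workbook[relevant_sheet_name]:
    output_row = dict(row)  # copy, so the original row is left alone
    # The reader returns '' for empty cells, so test for that as well as None.
    if row['Location'] not in (None, ''):
        this_location = row['Location']
    else:
        output_row['Location'] = this_location
    output.append(output_row)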
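If you want to try the fill-down idea without an Excel file in the way, here is a minimal sketch on plain Python data. The helper name `fill_down` and the sample rows are made up for illustration; only the 'Location' column comes from your question, and I'm assuming blank cells show up as either None or ''.

```python
def fill_down(rows, key, blanks=(None, '')):
    """Return copies of rows where blank `key` cells take the last non-blank value above them."""
    filled = []
    last_seen = None  # stays None until the first non-blank cell is found
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row.get(key) in blanks:
            new_row[key] = last_seen
        else:
            last_seen = new_row[key]
        filled.append(new_row)
    return filled


# Made-up sample rows, in the same shape the reader wrapper produces.
rows = [
    {'Location': 'Paris', 'Item': 'a'},
    {'Location': '', 'Item': 'b'},
    {'Location': 'Oslo', 'Item': 'c'},
    {'Location': '', 'Item': 'd'},
]
filled = fill_down(rows, 'Location')
print([r['Location'] for r in filled])  # ['Paris', 'Paris', 'Oslo', 'Oslo']
```

Rows that are blank before any value has been seen keep None, which seems like the least surprising choice.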
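You could probably do something cute with a list comprehension, but I've had too much wine tonight to fool with it :)
Here's the wrapper for the reader: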
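import datetime

import xlrd


# Each converter takes an xlrd cell and returns a plain Python value.
def _isEmpty(_):
    return ''


def _isString(element):
    # Drop non-ASCII characters, then decode so the result is text, not bytes.
    return element.value.encode('ascii', 'ignore').decode('ascii')


def _isFloat(element):
    return float(element.value)


def _isDate(element):
    # Excel stores dates as a day count; 1899-12-30 is day zero here.
    rawDate = float(element.value)
    return datetime.datetime(1899, 12, 30) + datetime.timedelta(days=rawDate)


def _isBool(element):
    return element.value == 1


def _isExcelGarbage(element):
    # Error cells carry an integer error code.
    return int(element.value)


# Map xlrd cell types (cell.ctype) to converters.
_options = {0: _isEmpty,         # XL_CELL_EMPTY
            1: _isString,        # XL_CELL_TEXT
            2: _isFloat,         # XL_CELL_NUMBER
            3: _isDate,          # XL_CELL_DATE
            4: _isBool,          # XL_CELL_BOOLEAN
            5: _isExcelGarbage,  # XL_CELL_ERROR
            6: _isEmpty}         # XL_CELL_BLANK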
def WorkbookToDict(filename):
    '''
    Reads an Excel workbook into a dictionary.
    (xlrd 2.0 and later only open .xls files; older versions also read .xlsx.)
    The keys of the dictionary correspond to sheet names in the Excel workbook.
    The first row of each worksheet is taken to be column names, and every
    other row is read into a separate dictionary whose keys correspond to
    column names. The output maps each sheet name (key) to the list of those
    row dictionaries (value).
    '''
    book = xlrd.open_workbook(filename)
    allSheets = {}
    for s in book.sheets():
        thisSheet = []
        if s.nrows == 0:
            # Empty sheet: there is no heading row to read.
            allSheets[str(s.name)] = thisSheet
            continue
        headings = [_options[x.ctype](x) for x in s.row(0)]
        for i in range(1, s.nrows):  # row 0 holds the headings
            thisRow = s.row(i)
            if len(thisRow) != len(headings):
                raise ValueError("Mismatch between headings and row length in ExcelReader")
            rowDict = {}
            for h, r in zip(headings, thisRow):
                rowDict[h] = _options[r.ctype](r)
            thisSheet.append(rowDict)
        allSheets[str(s.name)] = thisSheet
    return allSheets
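In case the date converter looks like magic: Excel stores a date as a count of days, and in the default 1900 date system counting from 1899-12-30 gives the right calendar date for any serial after the fictitious 29 Feb 1900. Below is a small standalone sketch of that arithmetic. The name `excel_serial_to_datetime` is made up for illustration, and it assumes the workbook uses the 1900 date system; for anything serious, xlrd's own `xldate_as_tuple` together with `book.datemode` is the safer route.

```python
import datetime

# Day zero for the 1900 date system (an assumption; 1904-based workbooks differ).
EXCEL_EPOCH = datetime.datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial day count (fractions are time of day) to a datetime."""
    return EXCEL_EPOCH + datetime.timedelta(days=float(serial))


print(excel_serial_to_datetime(25569))    # 1970-01-01 00:00:00
print(excel_serial_to_datetime(25569.5))  # 1970-01-01 12:00:00
```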
And here's the writer:
import xlwt


def write(workbookDict, colMap, filename, nullEntry=''):
    '''
    workbookDict should be a map of sheet names to a list of dictionaries.
    Each member of the list should be a mapping of column names to contents;
    missing keys are written as nullEntry. colMap should be a dictionary
    whose keys are identical to the sheet names in workbookDict. Each value
    is a list of column names that are assumed to be in order. If a key
    exists in workbookDict that does not exist in colMap, that entry in
    workbookDict will not be written.
    '''
    workbook = xlwt.Workbook()
    for sheet, rows in workbookDict.items():
        if sheet not in colMap:
            continue  # no column list for this sheet, so skip it
        worksheet = workbook.add_sheet(sheet)
        cols = colMap[sheet]
        # Row 0 holds the column headings.
        for j, col in enumerate(cols):
            worksheet.write(0, j, col)
        # Data rows start at row 1.
        for i, row in enumerate(rows, start=1):
            for j, col in enumerate(cols):
                worksheet.write(i, j, row.get(col, nullEntry))
    workbook.save(filename)
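The writer wants a colMap, and typing one out by hand gets old quickly. Here is a rough sketch of a helper that derives it from the data, listing each sheet's column names in the order they are first seen. The name `build_col_map` is hypothetical, and the ordering relies on dictionaries keeping insertion order, which is only guaranteed on Python 3.7 and later.

```python
def build_col_map(workbookDict):
    """Map each sheet name to its column names, in first-seen order."""
    colMap = {}
    for sheet, rows in workbookDict.items():
        cols = []
        for row in rows:
            for name in row:
                if name not in cols:  # keep only the first occurrence
                    cols.append(name)
        colMap[sheet] = cols
    return colMap


# Made-up sample data in the shape the reader wrapper produces.
sample = {
    'Sheet1': [
        {'Location': 'Paris', 'Item': 'a'},
        {'Location': '', 'Item': 'b', 'Qty': 2},
    ],
}
print(build_col_map(sample))  # {'Sheet1': ['Location', 'Item', 'Qty']}
```

If you care about a particular column order, it's still safer to spell the list out yourself.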
Anyway, I really hope this works for you!