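I'm trying to pull .xlsx documents from a folder and compile their data into a single workbook.

I get an IOError even though the files exist.

The program is as follows: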

#-------------- loop that pulls in files from folder--------------
import os

#create directory from which to pull the files
rootdir = r'C:\Users\username\Desktop\Mults'

for subdir, dir, files in os.walk(rootdir):
    for file in files:
        print os.path.join(subdir,file)
#----------------------merge work books-----------------------

import xlrd
import xlsxwriter


wb = xlsxwriter.Workbook('merged.xls')
ws = wb.add_worksheet()
for file in files:
    r = xlrd.open_workbook(file)
    head, tail = os.path.split(file)
    count = 0
    for sheet in r:
        if sheet.number_of_rows()>0:
            count += 1
    for sheet in r:
        if sheet.number_of_rosw()>0:
            if count == 1:
                sheet_name = tail
            else:
                sheet_name = "%s_%s" (tail, sheet.name)
            new_sheet = wb.create_sheet(sheet_name)
            new_sheet.write_reader(sheet)
            new_sheet.close()
wb.close()

It prints the file names and then fails with the following error:

doc1.xlsx
doc2.xlsx
doc3.xlsx
doc4.xlsx

Traceback (most recent call last):
  File "C:\Users\username\Desktop\Work\Python\excel practice\xlsx - loops files - 09204.py", line 23, in <module>
    r = xlrd.open_workbook(file)
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 394, in open_workbook
    f = open(filename, "rb")
IOError: [Errno 2] No such file or directory: 'doc1.xlsx'

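Any suggestions or changes?

Also, any advice on whether I'm heading in the right direction?

I'm new to the Python world, so any advice would be much appreciated!

Thank you!!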

2 Answers

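You are opening the bare file name without its path; you are ignoring the directory component, so the file is looked up relative to the current working directory instead of in the folder that os.walk() visited.

Don't just print the os.path.join() result, actually use it: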

filename = os.path.join(subdir, file) 
r = xlrd.open_workbook(filename)
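
As a minimal sketch of the same idea (written to run on both Python 2.7 and Python 3), the full paths can be collected while walking the tree and reused later, so the merge loop no longer depends on whatever `subdir` and `files` happened to hold after the last os.walk() iteration. The helper name `collect_xlsx_paths` and the `.xlsx` extension filter are assumptions added for illustration; they are not part of the original program.

```python
import os


def collect_xlsx_paths(rootdir):
    """Return the full path of every .xlsx file found under rootdir."""
    paths = []
    for subdir, dirs, files in os.walk(rootdir):
        for name in files:
            # Assumption: only files ending in .xlsx are wanted.
            if name.lower().endswith('.xlsx'):
                # Join the directory component back on; a bare name only
                # resolves relative to the current working directory.
                paths.append(os.path.join(subdir, name))
    return sorted(paths)
```

Each returned path can then be passed straight to xlrd.open_workbook(), for example by looping over collect_xlsx_paths(rootdir) in place of the second `for file in files:` loop.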
Answered 2014-09-29T21:41:55.023

For the first problem...

Instead of:

r = xlrd.open_workbook(file)

use:

r = xlrd.open_workbook(os.path.join(subdir,file))

For the TypeError, instead of:

for sheet in r:
    if sheet.number_of_rows()>0:
        count += 1

use:

for nsheet in r.sheet_names():  # loop over the list of sheet names
    sheet = r.sheet_by_name(nsheet)  # get the sheet object for each name
    if sheet.nrows > 0:  # nrows is an attribute holding the number of rows
        count += 1

Do the same for the second for loop.
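
The naming logic from the question's second loop could be sketched as a small standalone function. Note that the expression `"%s_%s" (tail, sheet.name)` in the question is missing the `%` operator, so it would fail with a TypeError once reached. This is only an illustration under stated assumptions: `FakeSheet` is a hypothetical stand-in exposing the same `name` and `nrows` attributes as an xlrd sheet, and `output_sheet_names` is a made-up helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for an xlrd sheet; only `name` and `nrows` are used.
FakeSheet = namedtuple('FakeSheet', ['name', 'nrows'])


def output_sheet_names(tail, sheets):
    """Return the merged-workbook sheet name for each non-empty sheet."""
    non_empty = [s for s in sheets if s.nrows > 0]
    if len(non_empty) == 1:
        # A single non-empty sheet is named after the file alone.
        return [tail]
    # Otherwise combine file name and sheet name; note the % operator.
    return ['%s_%s' % (tail, s.name) for s in non_empty]
```

With a real workbook, the list returned by r.sheets() could be passed as the `sheets` argument.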

Answered 2014-10-01T12:35:39.113