python - Need some help resolving a DBF "file not found" error in Python when looping through a directory?

Question

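I'd like some help with a Python script that should loop through a directory on a drive. Basically, I want to convert more than 10,000 DBF files to CSV. So far I can do this for a single dbf file using the dbfread and Pandas packages. Running that script more than 10,000 times is obviously not feasible, which is why I want to automate the task with a script that loops through every dbf file in the directory.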

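Here is what I want to do.

Define the directory
Write a for loop that iterates over every file in the directory
Open only files with the ".dbf" extension
Convert to a Pandas DataFrame
Define the name of the output file
Write to CSV and place the file in a new directory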

Here is the code I used to test whether I could convert a single ".dbf" file to CSV.

from dbfread import DBF
import pandas as pd

table = DBF('Name_of_File.dbf')

#I originally kept receiving a unicode decoding error
#So I manually adjusted the attributes below

table.encoding = 'utf-8' # Set encoding to utf-8 instead of 'ascii'

table.char_decode_errors = 'ignore' #ignore any decode errors while reading in the file

frame = pd.DataFrame(iter(table)) #Convert to DataFrame

print(frame) #Check to make sure DataFrame is structured properly

frame.to_csv('Name_of_New_File')

The code above works exactly as expected.

Here is my code for looping through the directory.

import os
from dbfread import DBF
import pandas as pd

directory = 'Path_to_directory'

dest_directory = 'Directory_to_place_new_file'

for file in os.listdir(directory):

    if file.endswith('.DBF'):
        print(f'Reading in {file}...')
        dbf = DBF(file)
        dbf.encoding = 'utf-8'
        dbf.char_decode_errors = 'ignore'
        print('\nConverting to DataFrame...')
        frame = pd.DataFrame(iter(dbf))
        print(frame)
        outfile = frame.os.path.join(frame + '_CSV' + '.csv')
        print('\nWriting to CSV...')
        outfile.to_csv(dest_directory, index = False)
        print('\nConverted to CSV. Moving to next file...')

    else:
        print('File not found.')

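When I run this code I get a DBFNotFound error saying it cannot find the first file in the directory. Looking at my code, I'm not sure why this happens when it works in the first script.

Here is the code from the dbfread package that raises the exception.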

class DBF(object):
    """DBF table."""
    def __init__(self, filename, encoding=None, ignorecase=True,
                 lowernames=False,
                 parserclass=FieldParser,
                 recfactory=collections.OrderedDict,
                 load=False,
                 raw=False,
                 ignore_missing_memofile=False,
                 char_decode_errors='strict'):

        self.encoding = encoding
        self.ignorecase = ignorecase
        self.lowernames = lowernames
        self.parserclass = parserclass
        self.raw = raw
        self.ignore_missing_memofile = ignore_missing_memofile
        self.char_decode_errors = char_decode_errors

        if recfactory is None:
            self.recfactory = lambda items: items
        else:
            self.recfactory = recfactory

        # Name part before .dbf is the table name
        self.name = os.path.basename(filename)
        self.name = os.path.splitext(self.name)[0].lower()
        self._records = None
        self._deleted = None

        if ignorecase:
            self.filename = ifind(filename)
            if not self.filename:
                raise DBFNotFound('could not find file {!r}'.format(filename))  # ERROR IS HERE
        else:
            self.filename = filename

Thanks for any help you can provide.

score 1 · Accepted Answer

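os.listdir returns only the bare file names in the directory, not full paths, so DBF(file) looks for each file relative to the current working directory. You have to join each name to the base path to get the full path: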

for file_name in os.listdir(directory):
    if file_name.endswith('.DBF'):
        file_path = os.path.join(directory, file_name)
        print(f'Reading in {file_name}...')
        dbf = DBF(file_path)
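
For completeness, here is a minimal sketch of how the whole loop might look once the path is fixed. It also builds the output path from the file name rather than from the DataFrame, and calls to_csv on the frame itself, since those two lines in the question would likely fail next. The helper names iter_dbf_jobs and convert_all and the '_CSV.csv' suffix are my own choices for illustration, and the extension check is made case-insensitive on the assumption that the directory may mix '.dbf' and '.DBF'.

```python
import os


def iter_dbf_jobs(directory, dest_directory):
    """Yield (source_path, output_path) for every .dbf file in directory."""
    for file_name in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(file_name)
        # Compare case-insensitively so both 'a.dbf' and 'A.DBF' match.
        if ext.lower() != '.dbf':
            continue
        # os.listdir gives bare names; join with the directory for a usable path.
        source_path = os.path.join(directory, file_name)
        # Hypothetical naming scheme: 'Name.dbf' becomes 'Name_CSV.csv'.
        output_path = os.path.join(dest_directory, stem + '_CSV.csv')
        yield source_path, output_path


def convert_all(directory, dest_directory):
    """Convert every DBF file in directory to a CSV in dest_directory."""
    # Third-party imports are kept local so iter_dbf_jobs works without them.
    from dbfread import DBF
    import pandas as pd

    os.makedirs(dest_directory, exist_ok=True)
    for source_path, output_path in iter_dbf_jobs(directory, dest_directory):
        print(f'Reading in {source_path}...')
        # Same encoding settings as the single-file script in the question.
        table = DBF(source_path, encoding='utf-8', char_decode_errors='ignore')
        frame = pd.DataFrame(iter(table))
        frame.to_csv(output_path, index=False)
        print(f'Wrote {output_path}')
```

You would then call it with the placeholders from the question, e.g. convert_all('Path_to_directory', 'Directory_to_place_new_file'). Files that do not end in .dbf are skipped silently instead of printing 'File not found.', because they were found, they are just not DBF files.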
