I used chardet to detect the encoding of a CSV file, with the following code:

import chardet
# Detection only sees the first 30000 bytes of the file
with open(file, 'rb') as rawdata:
    result = chardet.detect(rawdata.read(30000))

result

It outputs: {'confidence': 1.0, 'encoding': 'ascii', 'language': ''}

But when I read the same file with the code below:

pd.read_csv(file_path, encoding='ascii')

I get this error:

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-26-a7744924e4c8> in <module>()
     11 df = pd.read_csv('https://query.data.world/s/vzbpbsxivczilaqqhue4nnf66un7p2',
     12                  usecols=['product_title', 'product_description'],
---> 13                  encoding="ascii", error_bad_lines=False)
     14 df.shape

4 frames
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   2008         kwds["usecols"] = self.usecols
   2009 
-> 2010         self._reader = parsers.TextReader(src, **kwds)
   2011         self.unnamed_cols = self._reader.unnamed_cols
   2012 

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._get_header()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error()

UnicodeDecodeError: 'ascii' codec can't decode byte 0x89 in position 56041: ordinal not in range(128)
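
One detail worth noting: the failing byte is at position 56041, which is past the 30000 bytes that were passed to chardet, so the sample may simply have contained nothing but ASCII. The snippet below is only a minimal sketch of that situation, not a fix. It uses synthetic bytes in place of the real CSV, and `first_non_ascii` is a hypothetical helper written for illustration, not part of chardet or pandas.

```python
def first_non_ascii(data: bytes):
    """Return the offset of the first byte >= 0x80, or None if all bytes are ASCII."""
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            return offset
    return None


# Assumption: a synthetic stand-in for the CSV, pure ASCII except for
# a single 0x89 byte at offset 56041 (the position named in the error).
data = b"a" * 56041 + b"\x89" + b"a" * 100

sample_hit = first_non_ascii(data[:30000])  # what a 30000-byte sample can see
full_hit = first_non_ascii(data)            # what the whole content contains

print(sample_hit, full_hit)
```

On a real file, the same check could be run on the bytes returned by `rawdata.read()` with no size limit, and those full bytes could also be passed to `chardet.detect` to see whether the detected encoding changes.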

Any help would be appreciated.
