I used chardet to detect the encoding of a CSV file, with the following code:
import chardet

# Detect the encoding from the first 30000 bytes of the file
with open(file, 'rb') as rawdata:
    result = chardet.detect(rawdata.read(30000))
result
It outputs: {'confidence': 1.0, 'encoding': 'ascii', 'language': ''}
But when I read the same file with the code below:
pd.read_csv(file_path, encoding='ascii')
I get this error:
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-26-a7744924e4c8> in <module>()
11 df = pd.read_csv('https://query.data.world/s/vzbpbsxivczilaqqhue4nnf66un7p2',
12 usecols=['product_title', 'product_description'],
---> 13 encoding="ascii", error_bad_lines=False)
14 df.shape
4 frames
/usr/local/lib/python3.7/dist-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
2008 kwds["usecols"] = self.usecols
2009
-> 2010 self._reader = parsers.TextReader(src, **kwds)
2011 self.unnamed_cols = self._reader.unnamed_cols
2012
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._get_header()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()
pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error()
UnicodeDecodeError: 'ascii' codec can't decode byte 0x89 in position 56041: ordinal not in range(128)
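One thing I notice: chardet only saw the first 30000 bytes, while the failing byte is at position 56041, so the detector may never have seen it. Below is a minimal sketch of that situation using synthetic bytes, not the real file. The `data` buffer and the `first_non_ascii` helper are made up for illustration, and the offset is simply copied from the error message.

```python
# Synthetic stand-in for the CSV (an assumption, not the real file):
# pure ASCII up to offset 56041, then the byte 0x89 from the error message.
data = b"a" * 56041 + b"\x89" + b"rest of file\n"


def first_non_ascii(buf: bytes) -> int:
    """Return the offset of the first byte >= 0x80, or -1 if every byte is ASCII."""
    for offset, byte in enumerate(buf):
        if byte >= 0x80:
            return offset
    return -1


# This is what rawdata.read(30000) would hand to the detector.
sample = data[:30000]

sample_offset = first_non_ascii(sample)  # the sample contains only ASCII bytes
full_offset = first_non_ascii(data)      # the full buffer does not

# Decoding the full buffer as ASCII fails at the same offset.
try:
    data.decode("ascii")
    decode_error_offset = None
except UnicodeDecodeError as exc:
    decode_error_offset = exc.start

print(sample_offset, full_offset, decode_error_offset)
```

If this is what is happening here, running the detection on the whole file (calling rawdata.read() without a size) would presumably give a different result, though I have not confirmed which encoding the file actually uses.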
Any help would be appreciated.