I am trying to extract table data from a PDF using Camelot. When I pass the `table_regions` parameter, I get the error "too many values to unpack (expected 4)".

tables = camelot.read_pdf('BOA1.pdf',flavor="stream",pages="3",table_regions=['1,1,1,1'])

This results in:

    ValueError                                Traceback (most recent call last)
<ipython-input-154-681440b4cbbd> in <module>()
----> 1 tables = camelot.read_pdf('BOA1.pdf',flavor="stream",pages="3-20",table_regions=['1,1,1,1'])

~\Anaconda3\lib\site-packages\camelot\io.py in read_pdf(filepath, pages, password, flavor, suppress_stdout, layout_kwargs, **kwargs)
    104         kwargs = remove_extra(kwargs, flavor=flavor)
    105         tables = p.parse(flavor=flavor, suppress_stdout=suppress_stdout,
--> 106                          layout_kwargs=layout_kwargs, **kwargs)
    107         return tables

~\Anaconda3\lib\site-packages\camelot\handlers.py in parse(self, flavor, suppress_stdout, layout_kwargs, **kwargs)
    160             for p in pages:
    161                 t = parser.extract_tables(p, suppress_stdout=suppress_stdout,
--> 162                                           layout_kwargs=layout_kwargs)
    163                 tables.extend(t)
    164         return TableList(tables)

~\Anaconda3\lib\site-packages\camelot\parsers\stream.py in extract_tables(self, filename, suppress_stdout, layout_kwargs)
    417             return []
    418 
--> 419         self._generate_table_bbox()
    420 
    421         _tables = []

~\Anaconda3\lib\site-packages\camelot\parsers\stream.py in _generate_table_bbox(self)
    287                 hor_text = []
    288                 for region in self.table_regions:
--> 289                     x1, y1, x2, y2 = region
    290                     region_text = text_in_bbox((x1, y2, x2, y1), self.horizontal_text)
    291                     hor_text.extend(region_text)

ValueError: too many values to unpack (expected 4)

2 Answers


This may be because table_regions = [1, 1, 1, 1] describes a single point rather than a region.

answered 2019-09-12T10:35:59.350

This is a known bug, and the maintainers intend to fix it.

https://github.com/socialcopsdev/camelot/issues/312
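
For illustration, here is a minimal sketch of why the unpacking on line 289 of the traceback fails. It assumes the region reaches that line still in the form the caller passed it, the single string '1,1,1,1'. The helper `parse_region` and the coordinates given to it are hypothetical and are not part of Camelot; they only show the split-then-convert step that the unpacking would need.

```python
# The region exactly as passed in table_regions: one comma-separated string.
region = "1,1,1,1"

# Unpacking a string iterates over its characters. "1,1,1,1" has 7 of them
# (four digits and three commas), so four targets are not enough.
try:
    x1, y1, x2, y2 = region
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)


def parse_region(region):
    """Hypothetical helper (not a Camelot function): split the string on
    commas first, then convert each piece to a float."""
    x1, y1, x2, y2 = (float(value) for value in region.split(","))
    return x1, y1, x2, y2


# Illustrative coordinates only; real values depend on the page layout.
coords = parse_region("170,370,560,270")
print(coords)
```

If this is indeed the cause, the error does not depend on the coordinate values, so passing a larger region would not avoid it with flavor="stream" until the fix is released.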

answered 2019-05-06T07:06:23.430