When I try to convert a PDF file to CSV with Tabula, I get a blank sheet. I want to convert specific pages of the PDF to .csv format. I get the following error:
Got stderr: Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider loadDiskCache
WARNING: New fonts found, font cache will be re-built
Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Building on-disk font cache, this may take a while
Oct 29, 2021 3:29:30 PM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
WARNING: Finished building on-disk font cache, found 372 fonts
My code:
import tabula
df = tabula.read_pdf("10iHP.pdf", pages="all")  # read tables from every page
# Write only the tables found on page 1 to a CSV file
tabula.convert_into("10iHP.pdf", "10iHP.csv", output_format="csv", pages="1")
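The stderr lines above are all WARNING-level messages about PDFBox rebuilding its font cache, so they may not be what causes the blank output. A minimal sketch for checking whether any tables are detected at all, assuming tabula-py and Java are installed: the helper names `summarize_tables` and `try_extraction_modes` are hypothetical, while `lattice` and `stream` are existing `tabula.read_pdf` options. Whether either mode finds tables in this particular PDF is not something the sketch can guarantee.

```python
def summarize_tables(tables):
    """Return one (index, rows, columns) tuple per extracted table."""
    return [(i, t.shape[0], t.shape[1]) for i, t in enumerate(tables)]


def try_extraction_modes(pdf_path, pages="1", read_pdf=None):
    """Run the extraction once per mode and report what each mode found.

    `read_pdf` defaults to tabula.read_pdf; the parameter is a hypothetical
    hook so the logic can be exercised without Java or a real PDF.
    """
    if read_pdf is None:
        import tabula  # assumption: tabula-py is installed and Java is on PATH
        read_pdf = tabula.read_pdf

    results = {}
    for mode in ("lattice", "stream"):
        # lattice=True targets tables with ruling lines,
        # stream=True targets tables separated by whitespace
        tables = read_pdf(pdf_path, pages=pages, **{mode: True})
        results[mode] = summarize_tables(tables)
    return results
```

Calling `try_extraction_modes("10iHP.pdf", pages="1")` returns a dict keyed by mode. An empty list for both modes would suggest that Tabula finds no table on that page, for example because the page is a scanned image rather than text.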