0

I'm writing a python program to extract tables from excel sheets and pdf. Currently, I'm using different libraries for each file type. Xlrd for excel sheets, Pdfminer for pdf.

I'm wondering if there is a generic approach to extract tables from any type of file (xls, pdf, csv, word etc.). Since I'm planning to expand the list of supported file types, writing different functions for each file type would be cumbersome.

P.S. I came across PETL while looking for solutions. I could not find any excel/pdf extraction examples and I could not fully understand the documentation. Would PETL fulfill my requirement? If yes, I would really appreciate an example. Thank you.

4

0 回答 0