itext - PdfUtilities.convertPdf2Png automatically creates images in my directory

Question

I've written some code to perform OCR on a PDF using Tesseract (Tess4J):

public void DoOCRAnalyse(String From) throws FileNotFoundException {
    Tesseract instance = Tesseract.getInstance();  // JNA Interface Mapping
    File[] files = PdfUtilities.convertPdf2Png(new File(From));
    for (File f : files) {
        try {
            String result = instance.doOCR(f);
            /*String result = instance.doOCR(take File or BufferedImage); */
            SearchForSVHC(result,SvhcList);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
        }
    }
}

It recognizes text, which is great, but my problem is that it needs the images to be in a directory on disk. How can I pass a BufferedImage or File to the doOCR() method without needing the files on disk?

score 1 · Accepted Answer

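You are passing a File object to doOCR. When you call convertPdf2Png, it invokes GhostScript to convert the PDF file into one or more PNG files. If you want, you can of course delete them after OCR, for example by calling f.delete() in a finally block.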
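
A minimal sketch of that cleanup pattern, using only the standard library so it runs without Tess4J installed. The class name TempPageCleanup, the method processAndDelete, and the ocr parameter are hypothetical; ocr is a stand-in for a call such as instance.doOCR(f), and the temp files stand in for the PNGs that convertPdf2Png writes.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class TempPageCleanup {

    // Applies `ocr` to each page image and deletes the file afterwards,
    // whether or not `ocr` succeeded. `ocr` is a placeholder for the real
    // OCR call (for example instance.doOCR(f) wrapped to rethrow unchecked).
    static List<String> processAndDelete(File[] pages, Function<File, String> ocr) {
        List<String> results = new ArrayList<>();
        for (File f : pages) {
            try {
                results.add(ocr.apply(f));
            } catch (RuntimeException e) {
                System.err.println(e.getMessage());
            } finally {
                // The finally block runs on success and on failure alike.
                if (!f.delete()) {
                    f.deleteOnExit(); // fall back to cleanup at JVM exit
                }
            }
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // Temp files stand in for the PNGs produced from the PDF.
        File[] pages = {
            File.createTempFile("page", ".png"),
            File.createTempFile("page", ".png")
        };
        List<String> results = processAndDelete(pages, File::getName);
        System.out.println(results.size() + " pages processed");
        for (File f : pages) {
            System.out.println(f.getName() + " still exists: " + f.exists());
        }
    }
}
```

In the original method the same finally block would sit directly after the catch (TesseractException e) clause, so each PNG is removed as soon as its text has been extracted.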
