I am trying to use PDFBox to extract image metadata (and the images themselves) for images embedded in a PDF. My processing loop looks like this:
    for each page i:
        for each image j in page i:
            extract metadata, output
            create the image file in separate thread
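The loop above can be sketched in Java roughly as follows. This is only a minimal sketch under stated assumptions: ImageItem, process, and the "page<i>_image<j>" naming are hypothetical placeholders made up for illustration, not PDFBox types or the actual extractor code, and the one-minute timeout is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ExtractionLoopSketch {

    /** Hypothetical stand-in for one embedded image; NOT a PDFBox type. */
    interface ImageItem {
        String metadata();
        void writeTo(String fileName) throws Exception;
    }

    /**
     * Walks pages and their images in order, collects each image's metadata
     * on the calling thread, and hands the file write to a worker thread.
     * Blocks until the submitted writes finish (or the timeout expires).
     */
    static List<String> process(List<List<ImageItem>> pages, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> metadata = new ArrayList<>();
        try {
            for (int i = 0; i < pages.size(); i++) {
                List<ImageItem> images = pages.get(i);
                for (int j = 0; j < images.size(); j++) {
                    ImageItem item = images.get(j);
                    // "extract metadata, output" step of the pseudocode
                    metadata.add(item.metadata());
                    // Hypothetical naming scheme for the output file
                    String figureName = "page" + i + "_image" + j;
                    // "create the image file in separate thread" step
                    pool.submit(() -> {
                        try {
                            item.writeTo(figureName);
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    });
                }
            }
        } finally {
            // Stop accepting new tasks; already submitted writes still run
            pool.shutdown();
        }
        // Arbitrary timeout chosen for this sketch
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return metadata;
    }
}
```

In this sketch the method does not return until the worker threads are done, so whatever backs the images stays available while the workers read from it. That is a design choice of the sketch, not necessarily how the real code is structured.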
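Now, I have the following code for creating the image file. It lives in a method generate_image() of a class FileWriting, which implements Runnable. This method is called from run(). The code is as follows: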
    try {
        File F = new File(figurename);
        item.getImage().write2file(F);
    } catch (Exception e) {
        e.printStackTrace();
    }
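where item.getImage() returns a PDXObjectImage object. If I do this without creating a separate thread, it works fine, but when I create a thread to perform this task, it fails with the following error: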
java.lang.IndexOutOfBoundsException: Index: 5, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.pdfbox.pdmodel.graphics.xobject.PDCcitt$TiffWrapper.read(PDCcitt.java:468)
at org.apache.pdfbox.io.IOUtils.copy(IOUtils.java:68)
at org.apache.pdfbox.pdmodel.graphics.xobject.PDCcitt.write2OutputStream(PDCcitt.java:184)
at org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage.write2file(PDXObjectImage.java:165)
at extractor.FileWriting.generate_image(FileWriting.java:136)
Can anyone point out where I am going wrong?