我正在阅读 PDF 并输出包含多个原始 PDF 副本的 PDF。我通过对PDFBox和iText做同样的事情来进行测试。如果我单独复制每个页面,iText 创建的输出要小得多。
问题:在 PDFBox 中是否有另一种方法可以产生更小的输出 PDF。
对于一个示例输入文件,使用两种工具生成输出的两个副本:
- 原始 PDF 大小:30K
- PDFBox (v 1.7.1) 生成的 PDF:84K
- iText (v 5.3.4) 生成的 PDF:35K
PDFBox 的 Java 代码(抱歉给您造成错误处理)。请注意它如何一遍又一遍地读取输入并将其作为一个整体进行复制:
PDFMergerUtility merger = new PDFMergerUtility();
PDDocument workplace = null;
try {
for (int cnt = 0; cnt < COPIES; ++cnt) {
PDDocument document = null;
InputStream stream = null;
try {
stream = new FileInputStream(new File(sourceFileName));
document = PDDocument.load(stream);
if (workplace == null) {
workplace = document;
} else {
merger.appendDocument(workplace, document);
}
} finally {
if (document != null && document != workplace) {
document.close();
}
if (stream != null) {
stream.close();
}
}
}
OutputStream out = null;
try {
out = new FileOutputStream(new File(destinationFileName));
workplace.save(out);
} finally {
if (out != null) {
out.close();
}
}
} catch (COSVisitorException e1) {
e1.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (workplace != null) {
try {
workplace.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
使用 iText 执行此操作的代码。注意它是如何逐页加载输入文件并将每一页传输到输出的:
Document document = null;
PdfReader reader = null;
InputStream inputStream = null;
FileOutputStream outputStream = null;
try {
inputStream = new FileInputStream(new File(sourceFileName));
outputStream = new FileOutputStream(new File(destinationFileName));
document = new Document();
PdfCopy copy = new PdfSmartCopy(document, outputStream);
document.open();
reader = new PdfReader(inputStream);
// loop over the pages in that document
int pdfPageNo = reader.getNumberOfPages();
for (int page = 0; page < pdfPageNo;) {
PdfImportedPage onePage = copy.getImportedPage(reader, ++page);
// duplicate each page N times
for (int i = 0; i < COPIES; ++i) {
copy.addPage(onePage);
}
}
copy.freeReader(reader);
} catch (DocumentException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (reader != null) {
reader.close();
}
if (document != null) {
document.close();
}
try {
if (inputStream != null) {
inputStream.close();
}
if (outputStream != null) {
outputStream.close();
}
} catch (IOException e) {
// do nothing
}
}
两者都被这个包围:
public class Duplicate {
/** The original PDF file */
private static final String sourceFileName = "PDF_CI_US2CA.pdf";
/** The resulting PDF file. */
private static final String destinationFileName = "itext_output.pdf";
private static final int COPIES = 2;
public static void main(String[] args) {
...
}
}