5

I have a repository of PDF documents, and most of the text in these documents is set in Comic Sans. I would like to change this to something similar to Arial. The original font is embedded in the documents. I haven't found any existing tool that does this (I'm on Linux), and I wonder whether it's possible to do it programmatically. A Python library would be perfect, but a library in any programming language would do.

Which library would let me substitute fonts with the least effort? And which parts of its API would I use?

4

1 Answer

1

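There are a few commercial tools that can do this. One of them is pdfToolbox from callas software (warning: I'm affiliated with this company).

However, even though this functionality exists and is sometimes used, the results are often entirely undesirable, and I haven't seen many contexts where it was used on anything other than very specific files, usually with limited success. So much so that this replacement is only available as a manual operation in the tool I mentioned, not in an automatic mode.

Depending on how complex these files are, you may have more success extracting all the text from the documents into something like RTF, doing whatever manipulation you need there, and then regenerating the PDF. It sounds like a roundabout way of doing it, but I would guess the result will be better in most cases...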
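As a rough illustration of that extract-and-regenerate route, here is a minimal Python sketch. It is not a font substitution inside the original PDF: it assumes the third-party packages `pypdf` and `reportlab` are installed, swaps RTF for plain text, and uses the built-in Helvetica as the Arial-like font. The helper names (`extract_pages`, `wrap_page`, `write_pdf`) and the wrap width are made up for this example. Images, columns, tables and other layout are lost, and characters outside the standard font's encoding may not render correctly.

```python
import textwrap


def extract_pages(pdf_path):
    """Return a list with the plain text of each page (layout is lost)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def wrap_page(text, max_chars=85):
    """Re-wrap extracted text into lines of at most max_chars characters.

    max_chars=85 is a guess for 11pt Helvetica on A4 with 50pt margins.
    """
    lines = []
    for paragraph in text.splitlines():
        # textwrap.wrap returns [] for blank input; keep blank lines as spacing.
        lines.extend(textwrap.wrap(paragraph, width=max_chars) or [""])
    return lines


def write_pdf(pages, out_path, font="Helvetica", size=11):
    """Write each page's text to a new PDF using a standard PDF font."""
    from reportlab.lib.pagesizes import A4  # third-party: pip install reportlab
    from reportlab.pdfgen import canvas

    width, height = A4
    margin, leading = 50, size * 1.3
    pdf = canvas.Canvas(out_path, pagesize=A4)
    for text in pages:
        pdf.setFont(font, size)
        y = height - margin
        for line in wrap_page(text):
            if y < margin:
                # Out of room: continue this source page on a new output page.
                pdf.showPage()
                pdf.setFont(font, size)
                y = height - margin
            pdf.drawString(margin, y, line)
            y -= leading
        pdf.showPage()
    pdf.save()
```

With both packages installed, something like `write_pdf(extract_pages("in.pdf"), "out.pdf")` would produce a text-only copy. Whether that is acceptable depends entirely on how much of the original layout you need to keep.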

Answered 2012-12-02T21:57:19.617