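There is a very HACKY way to extract the data, but it only works with older versions of Ghostscript such as 8.51 or 8.62. In those older versions the PDF operators were defined in /lib/pdf_ops.ps; newer versions do something else.
A tested 8.62 build is available here: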
http://sourceforge.net/projects/ghostscript/files/GPL%20Ghostscript/8.62/gs862w32.exe/download
Take the /Tj {} def and /TJ {} def definitions and print the text you are after by adding a dup == at the beginning of each definition.
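(This could be made more sophisticated.) I also didn't worry about the font warning messages, but if you write the data to a file, these will be filtered out.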
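If the warnings do end up mixed into the captured output, a small post-processing step can drop them. The following is only a sketch: it assumes every line printed by == is either a parenthesised string or a bracketed array, and that warning lines are not. The function name and the sample warning text are made up for illustration.

```python
def keep_text_lines(lines):
    """Keep only lines that look like `==` output: (string) or [array]."""
    kept = []
    for line in lines:
        s = line.strip()
        is_string = s.startswith("(") and s.endswith(")")
        is_array = s.startswith("[") and s.endswith("]")
        if is_string or is_array:
            kept.append(s)
    return kept


if __name__ == "__main__":
    sample = [
        "some font warning text (hypothetical)...",  # not real Ghostscript output
        "(Account Number)",
        "[(Mail) -20 ( To:)]",
    ]
    for line in keep_text_lines(sample):
        print(line)
```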
Because kerning is being performed, some words are split into parts and individual letters. Given time, this could also be filtered.
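One possible way to do that filtering outside Ghostscript is to rejoin the string fragments of each printed TJ array. This is a rough sketch, not part of the original hack: it assumes == prints arrays in the form [(Hel) -15 (lo)], and it simply discards every numeric adjustment, so a large adjustment that was really meant as a word gap would be lost. The function names are hypothetical.

```python
# Single-character escapes defined for PostScript string literals.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f"}


def ps_strings(line):
    """Return the decoded (string) literals found in one line of `==` output."""
    out = []
    i, n = 0, len(line)
    while i < n:
        if line[i] != "(":
            i += 1
            continue
        depth, buf = 1, []
        i += 1
        while i < n and depth > 0:
            c = line[i]
            if c == "\\" and i + 1 < n:
                nxt = line[i + 1]
                if nxt in "01234567":
                    # Octal escape: one to three octal digits.
                    j = i + 1
                    while j < n and j < i + 4 and line[j] in "01234567":
                        j += 1
                    buf.append(chr(int(line[i + 1:j], 8) & 0xFF))
                    i = j
                    continue
                # \( \) \\ map to themselves; \n, \t, ... are translated.
                buf.append(ESCAPES.get(nxt, nxt))
                i += 2
                continue
            if c == "(":
                depth += 1  # balanced nested parentheses are allowed unescaped
            elif c == ")":
                depth -= 1
                if depth == 0:
                    i += 1
                    break
            buf.append(c)
            i += 1
        out.append("".join(buf))
    return out


def join_kerned(line):
    """Concatenate all string fragments on a line, ignoring kerning numbers."""
    return "".join(ps_strings(line))


if __name__ == "__main__":
    print(join_kerned("[(Hel) -15 (lo) 20 ( wor) (ld)]"))
```

Plain Tj lines such as (Account Number) pass through the same function unchanged apart from losing their parentheses.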
Modified /Tj from pdf_ops.ps
    /Tj { dup == 0 0 moveto Show settextposition } bdef
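Modified /TJ from pdf_ops.ps
    /TJ { dup ==                    % print the operand array before it is consumed
      0 0 moveto {
        dup type /stringtype eq {
          Show                      % array element is a string: show it
        } { -1000 div               % array element is a number: position adjustment
          currentfont /ScaleMatrix .knownget { 0 get mul } if
          0 Vexch rmoveto
        } ifelse
      } forall settextposition
    } bdef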
Output
(Help a neighbor within your county each month by contributing to The Salvation )
(Army's Project SHARE and Georgia Power will match your gift. To help, simply check )
($1, $2, $5, or $10 on the return portion of this bill. Starting next month, your pledge )
(amount will be included on your monthly bill.)
(Our business offices will be closed on December 24 and 25 for Christmas and January )
(1 for New Year's Day. In case of an emergency, please call us at the number on your )
(bill 24 hours a day, 7 days a week.)
(PLEASE KEEP THIS PORTION FOR YOUR RECORDS.)
(PLEASE RETURN THIS PORTION WITH YOUR PAYMENT, MAKING SURE THE RETURN ADDRESS SHOWS IN THE ENVELOPE WINDOW.)
(Account Number)
(Mail To:)
Isn't PostScript fun?