
I have a PDF whose text is stored in a strange encoding that I cannot read.

This is an example of a content stream as it appears when I read the buffer:

BT 1 0 0 -1 9670 5386 Tm (.&RY!) Tj 610 0 Td (.&R%!) Tj 570 0 Td (.%R$!) Tj -10310 -244 Td (KSAK4UOH^.]SKHFS.@SKHF^S.H]) Tj 5954 0 Td (!V) Tj -961 0 Td (!&#!%#%!!") Tj 1356 0 Td (&!!) Tj -2722 0 Td (&.!!!!!'%W!$&&"b) Tj ET

I tried to uncompress the file with pdftk and qpdf, but the text was still unreadable.

It looks like it is encrypted, but when I run qpdf --show-encryption file.pdf, it says: "file is not encrypted".

When I run pdftotext file.pdf output.txt, I can read the output file perfectly, which makes me think the text must use some special encoding...

Any suggestions?


1 Answer


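Your PDF uses CMap encoding: http://blog.idrsolutions.com/2012/05/understanding-the-pdf-file-format-embedded-cmap-tables/

The bytes inside the ( ) strings are character codes in the font's own encoding, not ASCII. The font's embedded CMap table maps those codes back to readable characters, which is presumably why pdftotext can extract the text while the raw stream looks scrambled.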
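To illustrate the idea, here is a minimal Python sketch of how a ToUnicode CMap can be applied to the bytes of a string operand. The names `parse_tounicode`, `decode_string` and `SAMPLE_CMAP` are hypothetical, and the mapping values are made up for the example; they are not taken from the file in the question. The sketch assumes single-byte character codes, targets within the Basic Multilingual Plane, and ignores the array form of bfrange entries.

```python
import re

HEX = r"<([0-9A-Fa-f]+)>"

def parse_tounicode(cmap_text):
    """Build a {character code: Unicode string} table from ToUnicode CMap text."""
    mapping = {}
    # bfchar entries map a single code to a Unicode value: <src> <dst>
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(HEX + r"\s*" + HEX, section):
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    # bfrange entries map a run of codes: <lo> <hi> <dst for lo>
    for section in re.findall(r"beginbfrange(.*?)endbfrange", cmap_text, re.S):
        for lo, hi, dst in re.findall(HEX + r"\s*" + HEX + r"\s*" + HEX, section):
            start = int(dst, 16)
            for offset, code in enumerate(range(int(lo, 16), int(hi, 16) + 1)):
                mapping[code] = chr(start + offset)
    return mapping

def decode_string(raw, mapping):
    """Decode the bytes of a (...) Tj operand, assuming one byte per character code."""
    # Codes missing from the table become U+FFFD (replacement character).
    return "".join(mapping.get(code, "\ufffd") for code in raw)

# Made-up CMap fragment, for illustration only.
SAMPLE_CMAP = """
2 beginbfchar
<21> <0030>
<2E> <0031>
endbfchar
1 beginbfrange
<41> <43> <0061>
endbfrange
"""

table = parse_tounicode(SAMPLE_CMAP)
print(decode_string(b"!.AC", table))
```

In a real file the CMap text would come from the /ToUnicode stream referenced by the font dictionary, after decompressing it. Fonts without a ToUnicode entry rely on /Encoding and /Differences instead, which this sketch does not cover.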

Answered 2013-06-20T06:59:42.043