
3 Answers

6

The PDF Reference states that the Title entry in the document info dictionary is of type 'text string'. Text strings are defined as using either PDFDocEncoding or UTF-16BE with a Byte Order Mark (see page 158 of the 1.7 PDF Reference Manual).

So you cannot specify a Title using UTF-8 without a BOM.

I would imagine that if you replace the Title string with a string defining the content using UTF-16BE with a BOM then it will work properly. I would suggest you use a hex string rather than a regular PostScript string to specify the data, simply for ease of use.
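To illustrate why a hex string is easier to work with, here is a minimal shell sketch that writes the same title both ways. It assumes `iconv` and a POSIX `od` are available; the sample title and the variable names are chosen for illustration only.

```shell
title='敏捷开发'   # sample title, for illustration only

# Hex string form: the BOM (FEFF) followed by the UTF-16BE bytes in hex.
hex=$(printf '%s' "$title" | iconv -f UTF-8 -t UTF-16BE \
      | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F')
hex_string="<FEFF${hex}>"

# Regular string form: the same bytes, each written as a \ooo octal escape.
oct=$(printf '%s' "$title" | iconv -f UTF-8 -t UTF-16BE \
      | od -An -to1 | tr -s ' \n' '\\' | sed 's/\\$//')
literal_string="(\\376\\377${oct})"

printf '%s\n' "$hex_string" "$literal_string"
```

Both forms carry identical bytes, but the hex form is shorter and needs no escaping, which is the ease of use referred to above.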

Answered 2012-02-08T09:10:01.960
2

Building on Happyman Chiu's idea, my solution is as follows. Generate a UTF-16BE hex string with a BOM:

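# Encode explicitly as UTF-16BE and dump single bytes, so the result does not depend on the host's byte order.
printf '%s' '(敏捷开发)' | iconv -f UTF-8 -t UTF-16BE | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F' | sed 's/^/<FEFF/;s/$/>/'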

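This prints <FEFF0028654F63775F0053D10029>. Note that 0028 and 0029 are the parentheses from the input, so they become part of the title text; leave them out of the input if you do not want them. Use the result as the Title value: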

/Title <FEFF0028654F63775F0053D10029>
Answered 2018-11-15T06:10:49.097
0

This follows on from the question "pdfmark for docinfo metadata in pdf is not accepting accented characters in Keywords or Subject".

I use this function to convert a UTF-8 string into a hex string for info.txt, which is then passed to the gs command.

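  function str_in_pdf($str){
    // Quote $str for the shell by hand so embedded single quotes cannot break the command.
    $arg = "'" . str_replace("'", "'\\''", $str) . "'";
    // printf avoids the trailing newline that echo would add; UTF-16BE plus single-byte
    // output (-tx1) keeps the result independent of the host's byte order.
    $cmd = sprintf("printf '%%s' %s | iconv -f UTF-8 -t UTF-16BE | od -An -tx1", $arg);
    exec($cmd, $out, $ret);
    // Drop od's whitespace and prepend the BOM (FEFF) required for UTF-16BE text strings.
    return "<FEFF" . strtoupper(preg_replace('/\s+/', '', implode("", $out))) . ">";
  }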
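For context, here is a rough shell sketch of how such a hex string might end up in info.txt and be handed to gs. The helper names `pdf_hex` and `make_docinfo` and the file names `in.pdf` and `out.pdf` are hypothetical; only the Title key is set, and the gs call is left as a comment because it depends on your files.

```shell
# Hypothetical helper: encode a UTF-8 string as a PDF hex string (BOM + UTF-16BE).
pdf_hex() {
  printf '<FEFF%s>' "$(printf '%s' "$1" | iconv -f UTF-8 -t UTF-16BE \
    | od -An -tx1 | tr -d ' \n' | tr 'a-f' 'A-F')"
}

# Hypothetical helper: print a DOCINFO pdfmark that sets only the Title.
make_docinfo() {
  printf '[ /Title %s\n  /DOCINFO pdfmark\n' "$(pdf_hex "$1")"
}

# Example use (file names are placeholders):
#   make_docinfo '敏捷开发' > info.txt
#   gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=out.pdf in.pdf info.txt
```

Listing info.txt after the input file lets the pdfmark be processed as part of the same pdfwrite run.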
Answered 2013-06-25T08:37:46.710