Here is what I did:

  1. Created a Linux VM in the Amazon cloud.
  2. Downloaded and compiled the wkhtmltopdf-qt and wkhtmltopdf sources by following the instructions at https://code.google.com/p/wkhtmltopdf/wiki/compilation. I ended up with a static build of wkhtmltopdf.
  3. Took this html (http://jsfiddle.net/mark69_fnd/8CtjB/):

    <html> <head> <style type="text/css">p{font-family: sans-serif;};</style> </head> <body> <p>Let's Test</p> </body> </html>

  4. wkhtmltopdf test.html test.pdf

  5. Copied test.pdf to my Windows desktop, opened it and got this: https://docs.google.com/file/d/0B2pbsdBJxJI3MV8zby14cGk5VWs/edit?usp=sharing

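I followed the guide closely; the qt configuration options were taken from ../wkhtmltopdf/static_qt_conf_base and ../wkhtmltopdf/static_qt_conf_linux, as the guide suggests.

Needless to say, I am a bit disappointed with the result. Can anyone explain what I am doing wrong?

P.S.

I actually need to convert a much more complex HTML, but there is no point in talking about it when I cannot convert even a trivial one.

EDIT

I would like to stress that I do not work on Linux; I only open a terminal to a Linux box hosted on Amazon. This means I have no X11 environment.

Here is what I get when I try to use the prepackaged wkhtmltopdf: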

ubuntu@ip-10-245-78-162:~$ which wkhtmltopdf
ubuntu@ip-10-245-78-162:~$ /usr/bin/wkhtmltopdf
-bash: /usr/bin/wkhtmltopdf: No such file or directory
ubuntu@ip-10-245-78-162:~$ sudo apt-get install wkhtmltopdf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  wkhtmltopdf
0 upgraded, 1 newly installed, 0 to remove and 120 not upgraded.
Need to get 0 B/104 kB of archives.
After this operation, 303 kB of additional disk space will be used.
Selecting previously unselected package wkhtmltopdf.
(Reading database ... 36679 files and directories currently installed.)
Unpacking wkhtmltopdf (from .../wkhtmltopdf_0.9.9-3_amd64.deb) ...
Processing triggers for man-db ...
Setting up wkhtmltopdf (0.9.9-3) ...
ubuntu@ip-10-245-78-162:~$ l test.*
-rw-r--r-- 1 ubuntu ubuntu 123 Mar 30 12:46 test.html
ubuntu@ip-10-245-78-162:~$ cat test.html
<html> <head> <style type="text/css">p{font-family: sans-serif;};</style> </head> <body> <p>Let's Test</p> </body> </html>
ubuntu@ip-10-245-78-162:~$ /usr/bin/wkhtmltopdf test.html test.pdf
wkhtmltopdf: cannot connect to X server
ubuntu@ip-10-245-78-162:~$

EDIT2

  1. Downloaded ftp://rpmfind.net/linux/fedora/linux/development/rawhide/x86_64/os/Packages/u/urw-fonts-2.4-14.fc19.noarch.rpm
  2. Converted the rpm to deb format by following the instructions at http://www.howtogeek.com/howto/ubuntu/install-an-rpm-package-on-ubuntu-linux/
  3. Installed the deb.
  4. Generated the pdf, but I still see only squares.

Here is the transcript:

ubuntu@ip-10-245-78-162:~$ sudo alien urw-fonts-2.4-14.fc19.noarch.rpm --scripts
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
warning: urw-fonts-2.4-14.fc19.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID fb4b18e6: NOKEY
urw-fonts_2.4-15_all.deb generated
ubuntu@ip-10-245-78-162:~$ sudo dpkg -i urw-fonts_2.4-15_all.deb
Selecting previously unselected package urw-fonts.
(Reading database ... 38529 files and directories currently installed.)
Unpacking urw-fonts (from urw-fonts_2.4-15_all.deb) ...
Setting up urw-fonts (2.4-15) ...
Processing triggers for fontconfig ...
ubuntu@ip-10-245-78-162:~$  ./wkhtmltopdf/bin/wkhtmltopdf test.html test.pdf
Loading pages (1/6)
Counting pages (2/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done
ubuntu@ip-10-245-78-162:~$
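
Squares in place of text often suggest that no usable font was found for the requested family. One way to see what the system's fontconfig would resolve `sans-serif` to is sketched below. The function name `resolve_font` is made up for illustration, and the check only describes the system's fontconfig setup; whether the statically built wkhtmltopdf consults fontconfig at all is not established here.

```shell
#!/usr/bin/env bash
# Print the font fontconfig would pick for a family (default: sans-serif),
# or a short notice if fontconfig's tools or fonts are missing.
# resolve_font is a hypothetical helper name, not part of any package.
resolve_font() {
    local family=${1:-sans-serif} match
    if ! command -v fc-match >/dev/null 2>&1; then
        echo "fc-match not found (is the fontconfig package installed?)"
        return 1
    fi
    match=$(fc-match "$family" 2>/dev/null)
    if [ -z "$match" ]; then
        echo "no font matched '$family'"
        return 1
    fi
    echo "$match"
}
```

Running `resolve_font sans-serif` before and after installing a font package shows whether the installation changed what the family resolves to.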

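EDIT3

I have installed xvfb-run and can now run the default version (/usr/bin/wkhtmltopdf) through it. It does convert the simple test.html to pdf; however, it fails on a complex html page with Javascript code. It seems that /usr/bin/wkhtmltopdf is unable to run any Javascript code on the page being converted.

I am still puzzled why the compiled version does not work.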
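
For reference, the choice between running the packaged binary directly and running it through xvfb-run can be sketched as a small helper. This is only an illustration: `render_cmd` and the `WKHTMLTOPDF` variable are hypothetical names, and the helper prints the command line instead of running it, so it does not handle file names containing spaces.

```shell
#!/usr/bin/env bash
# Print the command line that would render IN ($1) to OUT ($2):
# directly when an X display is available, through xvfb-run otherwise.
# WKHTMLTOPDF is a hypothetical override for the binary path.
render_cmd() {
    local bin=${WKHTMLTOPDF:-/usr/bin/wkhtmltopdf}
    if [ -n "${DISPLAY:-}" ]; then
        printf '%s %s %s\n' "$bin" "$1" "$2"
    else
        # -a asks xvfb-run to pick a free server number automatically
        printf 'xvfb-run -a %s %s %s\n' "$bin" "$1" "$2"
    fi
}
```

On a box without X11, `$(render_cmd test.html test.pdf)` expands to the xvfb-run form of the command.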

EDIT4

I was unfair to the default wkhtmltopdf version. It does understand Javascript in the page; it successfully converted the following html:

<html>
  <head>
    <style type="text/css">
      body {
        font-family: sans-serif;
      }
    </style>
  </head>
  <body id='body'>
    <script>
      document.getElementById('body').innerHTML = 'Hello world!';
    </script>
  </body>
</html>

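I will try to understand why it fails on the real page, but I see no way to do that other than discarding pieces of the original page until I get a minimal failing page.

EDIT5

OK, here is a minimal example that does not work with the default wkhtmltopdf version: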

<!DOCTYPE html>
<html>
  <head>
    <style type="text/css">
        html, body {
                height: 100%;
                overflow: hidden;
        }
    </style>
  </head>
  <body>
    Hello World!
  </body>
</html>

The generated pdf is empty. Here is the transcript:

ubuntu@ip-10-245-78-162:~$ cat test2.html
<!DOCTYPE html>
<html>
  <head>
    <style type="text/css">
        html, body {
                height: 100%;
                overflow: hidden;
        }
    </style>
  </head>
  <body>
    Hello World!
  </body>
</html>
ubuntu@ip-10-245-78-162:~$ xvfb-run /usr/bin/wkhtmltopdf test2.html test2.pdf ; l test2.pdf
Loading page (1/2)
Printing pages (2/2)
Done
-rw-r--r-- 1 ubuntu ubuntu 1266 Mar 31 11:16 test2.pdf
ubuntu@ip-10-245-78-162:~$ cat test2.html |sed 6d | xvfb-run /usr/bin/wkhtmltopdf - test2.pdf ; l test2.pdf
Loading page (1/2)
Printing pages (2/2)
Done
-rw-r--r-- 1 ubuntu ubuntu 4284 Mar 31 11:16 test2.pdf
ubuntu@ip-10-245-78-162:~$

Note how deleting line 6 (height: 100%;) changes the size of the generated pdf file.
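
The same line-removal experiment can be repeated for every line of a page, which is one way to hunt for a minimal failing page. The sketch below is a rough illustration, not a finished tool: `drop_line`, `scan_lines` and the `RENDER` array are hypothetical names, and the default renderer is simply the command used in the transcript above.

```shell
#!/usr/bin/env bash
# Print FILE ($1) with line N ($2) removed, like `sed 6d` in the transcript.
drop_line() {
    sed "${2}d" "$1"
}

# RENDER is a hypothetical hook: a command that reads HTML on stdin and
# writes a PDF to the path appended as its last argument.
# Assumed default, taken from the transcript above.
RENDER=(xvfb-run /usr/bin/wkhtmltopdf -)

# For each line of FILE, render the page without that line and print
# "<line number> <output size in bytes>".
# Note: wc -l counts newlines, so a last line without a trailing newline
# is not scanned.
scan_lines() {
    local file=$1 total n out
    total=$(wc -l < "$file")
    out=$(mktemp)
    for ((n = 1; n <= total; n++)); do
        : > "$out"   # reset so a failed render shows up as size 0
        drop_line "$file" "$n" | "${RENDER[@]}" "$out" >/dev/null 2>&1
        printf '%s %s\n' "$n" "$(wc -c < "$out" | tr -d ' ')"
    done
    rm -f "$out"
}
```

With `scan_lines test2.html`, a line whose removal makes the size jump is a candidate for the construct that breaks the conversion.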

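EDIT6

The custom version is statically linked, whereas the default version depends on quite a few shared libraries, WebKit among them:

The custom version: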

ubuntu@ip-10-245-78-162:~/wkhtmltopdf/bin$ l wkhtmltopdf
-rwxr-xr-x 1 ubuntu ubuntu 35020224 Mar 31 22:26 wkhtmltopdf
ubuntu@ip-10-245-78-162:~/wkhtmltopdf/bin$ ldd !$
ldd wkhtmltopdf
        linux-vdso.so.1 =>  (0x00007fff195ff000)
        libXrender.so.1 => /usr/lib/x86_64-linux-gnu/libXrender.so.1 (0x00007fefc06db000)
        libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007fefc03a7000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fefc01a2000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fefbff9a000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fefbfd7d000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fefbfa7c000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fefbf780000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fefbf56a000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fefbf1aa000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fefc08ef000)
        libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007fefbef8c000)
        libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007fefbed88000)
        libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007fefbeb82000)
ubuntu@ip-10-245-78-162:~/wkhtmltopdf/bin$

Now the default version:

ubuntu@ip-10-245-78-162:/usr/bin$ l wkhtmltopdf
-rwxr-xr-x 1 root root 233512 May  7  2011 wkhtmltopdf
ubuntu@ip-10-245-78-162:/usr/bin$ ldd wkhtmltopdf
        linux-vdso.so.1 =>  (0x00007fff031ff000)
        libQtWebKit.so.4 => /usr/lib/x86_64-linux-gnu/libQtWebKit.so.4 (0x00007f28a33bc000)
        libQtGui.so.4 => /usr/lib/x86_64-linux-gnu/libQtGui.so.4 (0x00007f28a26ee000)
        libQtNetwork.so.4 => /usr/lib/x86_64-linux-gnu/libQtNetwork.so.4 (0x00007f28a23a1000)
        libQtCore.so.4 => /usr/lib/x86_64-linux-gnu/libQtCore.so.4 (0x00007f28a1ecf000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f28a1bcf000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f28a19b8000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f28a15f9000)
        libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f28a1356000)
        libXrender.so.1 => /usr/lib/x86_64-linux-gnu/libXrender.so.1 (0x00007f28a114b000)
        libgstapp-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstapp-0.10.so.0 (0x00007f28a0f3f000)
        libgstinterfaces-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstinterfaces-0.10.so.0 (0x00007f28a0d2d000)
        libgstpbutils-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstpbutils-0.10.so.0 (0x00007f28a0b09000)
        libgstvideo-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstvideo-0.10.so.0 (0x00007f28a08ed000)
        libgstbase-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstbase-0.10.so.0 (0x00007f28a069a000)
        libgstreamer-0.10.so.0 => /usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0 (0x00007f28a03b2000)
        libgobject-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0 (0x00007f28a0163000)
        libglib-2.0.so.0 => /lib/x86_64-linux-gnu/libglib-2.0.so.0 (0x00007f289fe6e000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f289fc50000)
        libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f289f91c000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f289f620000)
        libfontconfig.so.1 => /usr/lib/x86_64-linux-gnu/libfontconfig.so.1 (0x00007f289f3e9000)
        libaudio.so.2 => /usr/lib/x86_64-linux-gnu/libaudio.so.2 (0x00007f289f1d1000)
        libpng12.so.0 => /lib/x86_64-linux-gnu/libpng12.so.0 (0x00007f289efa9000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f289ed91000)
        libfreetype.so.6 => /usr/lib/x86_64-linux-gnu/libfreetype.so.6 (0x00007f289eaf5000)
        libSM.so.6 => /usr/lib/x86_64-linux-gnu/libSM.so.6 (0x00007f289e8ed000)
        libICE.so.6 => /usr/lib/x86_64-linux-gnu/libICE.so.6 (0x00007f289e6d2000)
        libXi.so.6 => /usr/lib/x86_64-linux-gnu/libXi.so.6 (0x00007f289e4c3000)
        libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f289e2b2000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f289e0ad000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f289dea5000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f28a517e000)
        liborc-0.4.so.0 => /usr/lib/x86_64-linux-gnu/liborc-0.4.so.0 (0x00007f289dc29000)
        libgmodule-2.0.so.0 => /usr/lib/x86_64-linux-gnu/libgmodule-2.0.so.0 (0x00007f289da25000)
        libxml2.so.2 => /usr/lib/x86_64-linux-gnu/libxml2.so.2 (0x00007f289d6ca000)
        libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007f289d4c1000)
        libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f289d284000)
        libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f289d065000)
        libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f289ce3b000)
        libXt.so.6 => /usr/lib/x86_64-linux-gnu/libXt.so.6 (0x00007f289cbd5000)
        libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f289c9d1000)
        libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007f289c7cc000)
        libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f289c5c5000)
ubuntu@ip-10-245-78-162:/usr/bin$
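
The two listings can also be compared mechanically. A minimal sketch, assuming each `ldd` output has been saved to a file (the file names `static.txt` and `packaged.txt` and both function names are hypothetical):

```shell
#!/usr/bin/env bash
# Read `ldd` output on stdin and print one library name per line, sorted.
# LC_ALL=C keeps the sort order consistent with what comm expects.
lib_names() {
    awk '{print $1}' | LC_ALL=C sort -u
}

# Print the libraries that appear only in the second saved ldd listing.
only_in_second() {
    LC_ALL=C comm -13 <(lib_names < "$1") <(lib_names < "$2")
}

# Example usage (paths taken from the transcripts above):
#   ldd ~/wkhtmltopdf/bin/wkhtmltopdf > static.txt
#   ldd /usr/bin/wkhtmltopdf          > packaged.txt
#   only_in_second static.txt packaged.txt
```

In the listings above, for instance, libfontconfig.so.1 and libfreetype.so.6 show up only for the packaged binary. That does not by itself explain the squares, since a static build may link such libraries in statically or not use them at all.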

EDIT7

Guys, I do not understand how wkhtmltopdf works for you. I started over, completely from scratch:

  1. Created a brand-new Ubuntu Amazon micro instance (free tier).
  2. sudo apt-get update
  3. sudo apt-get upgrade
  4. sudo apt-get install libx11-dev
  5. sudo apt-get install libfontconfig1-dev
  6. wget https://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
  7. tar xjf wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
  8. Created test2.html with the content from EDIT5 (see the EDIT5 transcript).
  9. Ran wkhtmltopdf-amd64 on test2.html. The generated pdf is empty!
  10. Deleted line 6 or line 7 from test2.html (the CSS property height or overflow), and it suddenly works!

Can anyone retrace my steps and confirm?
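
To make retracing easier, steps 8 to 10 could be scripted roughly as follows. This is a sketch under assumptions: `write_test2` and `repro` are hypothetical helper names, the binary path is passed in by the caller, and the output file names are arbitrary.

```shell
#!/usr/bin/env bash
# Write the EDIT5 test page to the given path.
write_test2() {
    cat > "$1" <<'EOF'
<!DOCTYPE html>
<html>
  <head>
    <style type="text/css">
        html, body {
                height: 100%;
                overflow: hidden;
        }
    </style>
  </head>
  <body>
    Hello World!
  </body>
</html>
EOF
}

# Render test2.html as-is, then without line 6 (height) and without
# line 7 (overflow), using the binary given as $1, and list the sizes.
repro() {
    local bin=$1
    write_test2 test2.html
    "$bin" test2.html full.pdf
    sed 6d test2.html | "$bin" - no-height.pdf
    sed 7d test2.html | "$bin" - no-overflow.pdf
    ls -l full.pdf no-height.pdf no-overflow.pdf
}
```

For example, `repro ./wkhtmltopdf-amd64` in the directory where the tarball was unpacked would produce the three PDFs side by side for comparison.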

EDIT8

Installed CentOS 6.4 in a VMWare virtual machine on my laptop. Same result: wkhtmltopdf does not work on the trivial html file above.

1 Answer

Try setting a charset declaration in the head tag of your html, like this:

<head>
  <meta charset="utf-8">
  ...
</head>
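
A quick way to check for the declaration, or to add it to an existing file, might look like the sketch below. The function names are hypothetical, and the insertion assumes the page contains a literal lowercase `<head>` tag without attributes.

```shell
#!/usr/bin/env bash
# Succeed if the HTML file ($1) already contains a <meta ... charset ...> tag.
has_charset() {
    grep -qiE '<meta[^>]*charset' "$1"
}

# Print the file with a UTF-8 charset declaration added right after <head>,
# or unchanged if a charset declaration is already present.
with_charset() {
    if has_charset "$1"; then
        cat "$1"
    else
        sed 's/<head>/<head><meta charset="utf-8">/' "$1"
    fi
}
```

For example, `with_charset test.html > test-utf8.html` writes a copy that carries the declaration, which can then be passed to wkhtmltopdf.
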
answered 2013-06-06T11:28:39.127