python - How do I convert a web page or HTML URL to PDF?

Question

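I am trying to convert an HTML page or an HTML URL to a PDF, rendering not only the HTML but also the CSS, and then save the result. I am confused about which tool I should use (weasyprint, wkhtmltopdf, or python pdfkit). Meanwhile, I am using this code: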

import pdfkit

def ConvertToPdf(urltoConvert='http://tdi.dartmouth.edu/'):
    pdfFormatOptions = {'page-size': 'Letter', 'disable-forms': '', 'zoom': 1}
    pdfObject = None
    try:
        # from_url returns True once the output file has been written
        pdfObject = pdfkit.from_url(urltoConvert, 'dart.pdf', options=pdfFormatOptions)
    except Exception as exc:
        print("Exception while converting:", exc)
    return pdfObject

if __name__ == "__main__":
    ConvertToPdf()

and this code:

import weasyprint
pdf = weasyprint.HTML('http://tdi.dartmouth.edu/').write_pdf()
len(pdf)
# write_pdf() returns bytes, so the file must be opened in binary mode
with open('dart.pdf', 'wb') as f:
    f.write(pdf)

But it was all in vain. Please help.

score 0 · Accepted Answer

You may want to try pdfkit: https://pypi.python.org/pypi/pdfkit

It can also apply CSS stylesheets:

You can specify external CSS files when converting files or strings using css option.

Warning: This is a workaround for this bug in wkhtmltopdf. You should try the --user-style-sheet option first.

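import pdfkit

# Any wkhtmltopdf options; 'out.pdf' below is a placeholder output path
options = {'page-size': 'Letter'}

# Single CSS file
css = 'example.css'
pdfkit.from_file('file.html', 'out.pdf', options=options, css=css)

# Multiple CSS files
css = ['example.css', 'example2.css']
pdfkit.from_file('file.html', 'out.pdf', options=options, css=css)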
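
The css option applies to files and strings, so if you already have the HTML as a string, another route is to embed the stylesheet yourself before converting. The following is only a minimal sketch: the helper name `html_with_css` is hypothetical, and it assumes (without this answer confirming it) that an inline `<style>` block is enough for your page.

```python
def html_with_css(html, css_text):
    """Return html with css_text embedded in a <style> tag."""
    style = "<style>\n" + css_text + "\n</style>"
    if "</head>" in html:
        # Put the styles at the end of <head> so they override earlier rules.
        return html.replace("</head>", style + "</head>", 1)
    # No <head> element: prepend the style block to the document.
    return style + html
```

The result can then be passed to `pdfkit.from_string(html_with_css(html, css_text), 'out.pdf')`, where `'out.pdf'` is again just a placeholder path.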

score 0 · Accepted Answer

This should work fine:

import pdfkit
pdfkit.from_url('http://google.com', 'res.pdf')
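
If the call fails, it helps to see why, since a bare `except: pass` hides the cause. Here is a minimal sketch, assuming pdfkit and the wkhtmltopdf binary are installed; the function name `convert_url_to_pdf` and its `converter` parameter are hypothetical and exist only for illustration.

```python
def convert_url_to_pdf(url, output_path, options=None, converter=None):
    """Convert a URL to a PDF file; return True on success, False on failure."""
    if converter is None:
        # Imported lazily so the function can be exercised without pdfkit installed.
        import pdfkit
        converter = pdfkit.from_url
    try:
        converter(url, output_path, options=options)
    except OSError as exc:
        # pdfkit reports a missing or failing wkhtmltopdf as IOError/OSError.
        print(f"Conversion of {url} failed: {exc}")
        return False
    return True
```

For example, `convert_url_to_pdf('http://tdi.dartmouth.edu/', 'dart.pdf', {'page-size': 'Letter'})` prints the underlying error message instead of failing silently.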

Alternatively, you could take screenshots with selenium and assemble a .pdf from those images. However, that approach is hacky.
