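I am trying to write some code to scrape a web page. I am using spynner to fetch the HTML and pass it to pisa.

I don't see any errors when I run the Python code, but the generated PDF is all wrong.

Here is the code I am using: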

import os
import sys
import ho.pisa as pisa
import spynner

import logging
class PisaNullHandler(logging.Handler):
    def emit(self, record):
        pass


url = 'http://www.google.com'

br = spynner.Browser()

br.load(url)

pathToWrite = './google.html.pdf'

htmlCode = br.html

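pdfFile = open(pathToWrite, "wb")
try:
    # Silence pisa's own log output; failures are checked via pdfStatus.err
    logging.getLogger("ho.pisa").addHandler(PisaNullHandler())
    pdfStatus = pisa.CreatePDF(htmlCode.encode("utf-8"), pdfFile, encoding="utf8")
    if not pdfStatus.err:
        pdfFile.flush()
    else:
        # pdfStatus.err is an error count, so format it instead of concatenating
        print 'Failed with %d error(s)' % pdfStatus.err
except Exception as e:
    print 'pdf creation failed with error ' + str(e)
finally:
    pdfFile.close()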

I also tried saving the HTML from spynner to a file and running it through xhtml2pdf. I get the following error:

ERROR [ho.pisa] Document error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/pisa3/pisa_document.py", line 128, in pisaDocument
    c = pisaStory(src, path, link_callback, debug, default_css, xhtml, encoding, c=c, xml_output=xml_output)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/pisa3/pisa_document.py", line 73, in pisaStory
    pisaParser(src, c, default_css, xhtml, encoding, xml_output)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/pisa3/pisa_parser.py", line 626, in pisaParser
    c.parseCSS()
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/pisa3/pisa_context.py", line 545, in parseCSS
    self.css = self.cssParser.parse(self.cssText)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 358, in parse
    src, stylesheet = self._parseStylesheet(src)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 458, in _parseStylesheet
    src, ruleset = self._parseRuleset(src)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 737, in _parseRuleset
    src, properties = self._parseDeclarationGroup(src.lstrip())
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 905, in _parseDeclarationGroup
    src, property = self._parseDeclaration(src)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 945, in _parseDeclaration
    src, property = self._parseDeclarationProperty(src, propertyName)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 953, in _parseDeclarationProperty
    src, expr = self._parseExpression(src)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 968, in _parseExpression
    src, term = self._parseExpressionTerm(src)
  File "/usr/local/lib/python2.7/dist-packages/pisa-3.0.33-py2.7.egg/sx/w3c/cssParser.py", line 1020, in _parseExpressionTerm
    raise self.ParseError('Terminal function expression expected closing \')\'', src, ctxsrc)
CSSParseError: Terminal function expression expected closing ')':: (u'alpha(opacity', u'=100);position:absol')
*** ERRORS OCCURED
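
Judging by the last line of the traceback, the CSS parser seems to stop at an IE-specific `alpha(opacity=100)` filter expression. One workaround that might be worth trying, as a sketch only, is to strip those declarations from the HTML before handing it to pisa. `strip_ie_alpha_filters` is a hypothetical helper name, not part of pisa or spynner, and I have not confirmed that the rest of the page's CSS parses once these are gone.

```python
import re

# Matches declarations such as "filter:alpha(opacity=100);", optionally
# prefixed with "-ms-". The trailing semicolon is optional.
_IE_ALPHA_FILTER = re.compile(
    r'(?:-ms-)?filter\s*:\s*alpha\([^)]*\)\s*;?',
    re.IGNORECASE,
)


def strip_ie_alpha_filters(html):
    """Return html with IE-only alpha(...) filter declarations removed."""
    return _IE_ALPHA_FILTER.sub('', html)


# Made-up sample input, modelled on the fragment shown in the error message.
sample = u'<style>.a{filter:alpha(opacity=100);position:absolute}</style>'
cleaned = strip_ie_alpha_filters(sample)
```

In the script above, this would be applied as `htmlCode = strip_ie_alpha_filters(br.html)` before the call to `pisa.CreatePDF`.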
