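We had a problem using ReportLab to format content that was originally HTML, because sometimes the HTML was too complex. The solution (and I take no credit here, it came from the guys at ReportLab) is to catch the error when it occurs and output it directly into the PDF.
That means you get to see the cause of the problem in the right context. You could expand on this to output details of the exception, but in our case, since our problem was converting HTML to RML, we only needed to display our input:
The preppy template contains this: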
{{script}}
#This section contains python functions used within the rml.
#we can import any helper code we need within the template,
#to save passing in hundreds of helper functions at the top
from rml_helpers import blocks
{{endscript}}
and then later bits of template like:
{{if equip.specification}}
<condPageBreak height="1in"/>
<para style="h2">Item specification</para>
{{blocks(equip.specification)}}
{{endif}}
In rml_helpers.py we have:
from xml.sax.saxutils import escape
from rlextra.radxml.html_cleaner import cleanBlocks
from rlextra.radxml.xhtml2rml import xhtml2rml
def q(stuff):
    """Quoting function which works with unicode strings.
    The data from Zope is Unicode objects. We need to explicitly
    convert to UTF8; then escape any ampersands. So
        u"Black & Decker drill"
    becomes
        "Black &amp; Decker drill"
    and any special characters (Euro, curly quote etc) end up
    suitable for XML. For completeness we'll accept 'None'
    objects as well and output an empty string.
    """
    if stuff is None:
        return ''
    elif isinstance(stuff, unicode):  # `unicode` is the Python 2 text type
        stuff = escape(stuff.encode('utf8'))
    else:
        stuff = escape(str(stuff))
    # escape() handles &, < and >; quotes have to be replaced separately
    return stuff.replace('"', '&quot;').replace("'", '&apos;')
def blocks(txt):
    try:
        txt2 = cleanBlocks(txt)
        rml = xhtml2rml(txt2)
        return rml
    except Exception:
        # Conversion failed: emit a visible warning followed by the escaped input
        return '<para style="big_warning">Could not process markup</para><para style="normal">%s</para>' % q(txt)
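So anything that is too complicated for xhtml2rml to handle throws an exception, and is replaced in the output by a big warning "Could not process markup" followed by the markup that caused the error, escaped so that it appears as literal text.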
Then all we have to do is remember to search the output PDF for the error message and fix the input accordingly.
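If you do want the exception details in the PDF as well, the same pattern extends naturally. What follows is only a minimal Python 3 sketch, not the code we run: `blocks_with_details` and `fussy_converter` are hypothetical names, and the `convert` argument is a stand-in for the cleanBlocks/xhtml2rml pipeline so that the example runs without rlextra installed. The `big_warning` and `normal` styles are assumed to exist in the stylesheet, as in the helper above.

```python
from xml.sax.saxutils import escape


def blocks_with_details(txt, convert):
    """Convert markup to RML; on failure, return warning paragraphs instead.

    `convert` is any callable taking markup and returning RML. It is a
    hypothetical stand-in for the cleanBlocks/xhtml2rml pipeline.
    """
    try:
        return convert(txt)
    except Exception as exc:
        # Escape everything that came from the input or the exception so it
        # is rendered as literal text rather than parsed as RML.
        detail = escape("%s: %s" % (type(exc).__name__, exc))
        return (
            '<para style="big_warning">Could not process markup</para>'
            '<para style="normal">%s</para>'
            '<para style="normal">%s</para>' % (detail, escape(txt))
        )


def fussy_converter(markup):
    """Toy converter (an assumption for this demo) that rejects tables."""
    if "<table" in markup:
        raise ValueError("tables are not supported")
    return '<para style="normal">%s</para>' % markup


if __name__ == "__main__":
    print(blocks_with_details("plain text", fussy_converter))
    print(blocks_with_details("<table><tr><td>A & B</td></tr></table>", fussy_converter))
```

In the template you would call something like this in place of `blocks`, passing in the real converter. The warning text stays the same, so searching the PDF for "Could not process markup" still finds every failure, and the extra paragraph tells you why the conversion failed.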