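I use mwlib and ReportLab to convert MediaWiki articles to PDF.
I have one very long link that, for whatever reason, causes the sentence above it to be set with very wide gaps between the words. I think the link is treated as a single word that is too long to share a line, so the sentence above it gets stretched by justification.
See the image here: http://imageshack.us/photo/my-images/543/tzfo.png/
Is there any way in ReportLab to force a break inside words that are longer than a certain number of characters? I think that would solve it.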
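To make the idea concrete: one workaround that does not depend on ReportLab internals would be to pre-process the text and insert a break opportunity into any word longer than some limit before the paragraph is built. The following is only a minimal sketch under that assumption; `break_long_words`, `max_len` and `sep` are hypothetical names, not part of ReportLab or mwlib.

```python
def break_long_words(text, max_len=30, sep=" "):
    """Insert `sep` into every space-delimited word longer than max_len.

    Hypothetical helper for illustration; not a ReportLab or mwlib API.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    out = []
    for word in text.split(" "):
        # Cut the word into chunks of at most max_len characters.
        chunks = [word[i:i + max_len] for i in range(0, len(word), max_len)]
        # An empty word (from consecutive spaces) yields no chunks; keep it as-is.
        out.append(sep.join(chunks) if chunks else word)
    return " ".join(out)
```

Calling `break_long_words(link_text, 40)` would leave short words untouched and cut only the long link. The drawback is that a plain space changes the visible URL text, so this is a trade-off rather than a clean fix.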
P.S. Here is some code:
This is the method breakLinesCJK() in reportlab/platypus/paragraph.py. It uses the function wordSplit() from reportlab/lib/textsplit.py.
def breakLinesCJK(self, width):
    """Initially, the dumbest possible wrapping algorithm.
    Cannot handle font variations."""
    if not isinstance(width,(list,tuple)): maxWidths = [width]
    else: maxWidths = width
    style = self.style
    self.height = 0
    #for bullets, work out width and ensure we wrap the right amount onto line one
    _handleBulletWidth(self.bulletText, style, maxWidths)
    frags = self.frags
    nFrags = len(frags)
    if nFrags==1 and not hasattr(frags[0],'cbDefn'):
        f = frags[0]
        if hasattr(self,'blPara') and getattr(self,'_splitpara',0):
            return f.clone(kind=0, lines=self.blPara.lines)
        #single frag case
        lines = []
        lineno = 0
        if hasattr(f,'text'):
            text = f.text
        else:
            text = ''.join(getattr(f,'words',[]))
            #my own debug output, not part of the library
            print("USE WORDSPLIT ELSE TEXT = ''.JOIN")
        from reportlab.lib.textsplit import wordSplit
        #maxWidths may hold a single entry, so use the first width
        lines = wordSplit(text, maxWidths[0], f.fontName, f.fontSize)
        #the paragraph drawing routine assumes multiple frags per line, so we need an
        #extra list like this
        # [space, [text]]
        #
        wrappedLines = [(sp, [line]) for (sp, line) in lines]
        return f.clone(kind=0, lines=wrappedLines, ascent=f.fontSize, descent=-0.2*f.fontSize)
    elif nFrags<=0:
        return ParaLines(kind=0, fontSize=style.fontSize, fontName=style.fontName,
                         textColor=style.textColor, lines=[],ascent=style.fontSize,descent=-0.2*style.fontSize)
    #general case nFrags>1 or special
    if hasattr(self,'blPara') and getattr(self,'_splitpara',0):
        return self.blPara
    autoLeading = getattr(self,'autoLeading',getattr(style,'autoLeading',''))
    calcBounds = autoLeading not in ('','off')
    return cjkFragSplit(frags, maxWidths, calcBounds)
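The code in textsplit.py matters too, but it is too long to paste here. Like paragraph.py, it is a file that anyone who has ReportLab installed should already have.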
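For readers without the file at hand, the general idea of width-based splitting can be sketched as below. This is not ReportLab's actual wordSplit(), which as far as I understand uses real font metrics and extra CJK line-breaking rules. It is only a simplified illustration that breaks at any character once a line is full and returns (leftover width, line) pairs in the same shape the excerpt above unpacks. `greedy_split` and `char_width` are hypothetical names, and the default of one unit per character is an assumption standing in for font metrics.

```python
def greedy_split(text, max_width, char_width=lambda ch: 1.0):
    """Break text into lines no wider than max_width, splitting at any character.

    Returns a list of (leftover_width, line) pairs.
    char_width is a hypothetical stand-in for real font metrics.
    """
    lines = []
    current = ""
    used = 0.0
    for ch in text:
        w = char_width(ch)
        # Start a new line when the next character would overflow this one.
        if current and used + w > max_width:
            lines.append((max_width - used, current))
            current, used = "", 0.0
        current += ch
        used += w
    if current:
        lines.append((max_width - used, current))
    return lines
```

Because it never waits for a space, a splitter of this kind cannot leave a long link stranded on its own line, which is the behaviour I am hoping to get for long words in ordinary paragraphs.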