
Using pdfrw's PdfReader along with ReportLab, I am attempting to pull in a single PDF page and save it (both successful), then pull in a multi-page PDF and do the same. I know how to pull in a PDF one page at a time, but I'm struggling to pull in more than one page.

from reportlab.lib.units import inch  # needed for the inch constant used below
from reportlab.pdfgen import canvas
from pdfrw import PdfReader
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

c = canvas.Canvas(Out_Folder+pdf_file_name)
c.setPageSize([11*inch, 8.5*inch])

page = PdfReader(folder+'2_VisionMissionValues.pdf',decompress=False).pages
p = pagexobj(page[0])
c.setPageSize([11*inch, 8.5*inch]) #Set page size (for landscape)
c.doForm(makerl(c, p))
c.showPage()

p3_ = PdfReader(m4folder+'Academy.pdf',decompress=False).pages

Here's where I'm lost. I know this works for pulling in just the first page:

p3 = pagexobj(p3_[0])

But if I want to pull in all pages of the PDF, I'm not sure what to do. I tried this:

p3 = [pagexobj(x) for x in p3_[:]]

but passing that list to makerl resulted in an AssertionError (see below).

c.setPageSize([8.5*inch, 11*inch]) #Set page size (for portrait)
c.doForm(makerl(c, p3))
c.showPage()
c.save()


AssertionError: [{'/BBox': [0.0, 0.0, 792.0, 612.0], '/Filter': '/FlateDecode', '/FormType': 1, '/Matrix': [0, 1, -1, 0, 0, 0], '/Length': '56', '/Subtype': '/Form', '/Resources': {'/ProcSet': ['/PDF', '/ImageB', '/ImageC', '/ImageI'], '/XObject': {'/Im1': (8, 0)}}, '/Type': '/XObject'}, {'/BBox': [0.0, 0.0, 792.0, 612.0], '/Filter': '/FlateDecode', '/FormType': 1, '/Matrix': [0, 1, -1, 0, 0, 0], '/Length': '56', '/Subtype': '/Form', '/Resources': {'/ProcSet': ['/PDF', '/ImageB', '/ImageC', '/ImageI'], '/XObject': {'/Im2': (17, 0)}}, '/Type': '/XObject'}]

2 Answers


The reportlab canvas only works on one page at a time, so you need to use the reportlab doForm() and showPage() functions once per output page, not on all the pages as a list.

Edited to add

I just remembered that I have some sample code that will copy a subset of the pages of a PDF file to an output file using reportlab here. The inner loop does this:

for page in pages:
    canvas.setPageSize((page.BBox[2], page.BBox[3]))
    canvas.doForm(makerl(canvas, page))
    canvas.showPage()

For what it's worth, if you're only copying pages, you don't need reportlab; there is a similar subset example in the directory above that does it solely with pdfrw.

(Disclaimer: I am the primary pdfrw author.)

Answered 2017-05-03T11:21:16.270

I hope this answer helps with generating multiple pages in the same PDF file using Canvas. According to the ReportLab user guide:

The showPage method causes the canvas to stop drawing on the current page and any further operations will draw on a subsequent page (if there are any further operations -- if not no new page is created). The save method must be called after the construction of the document is complete -- it generates the PDF document, which is the whole purpose of the canvas object.

Here is a simple example.

from reportlab.pdfgen.canvas import Canvas

def write(myfile, page_number):
    myfile.drawString(200, 600, 'Page number %i script' % page_number)

myfile = Canvas('multi_pages.pdf')
total_pages = 3

for i in range(total_pages):
    write(myfile, i)
    myfile.showPage()

myfile.save()
Answered 2021-02-10T11:59:49.357