c# - iTextSharp: unexpected elements on copied pages

Question

This is well-known code for splitting a PDF document:

        try
        {
            FileInfo file = new FileInfo(@"d:\С.pdf");
            string name = file.Name.Substring(0, file.Name.LastIndexOf("."));
            // we create a reader for a certain document
            PdfReader reader = new PdfReader(@"d:\С.pdf");
            // we retrieve the total number of pages
            int n = reader.NumberOfPages;
            // number of decimal digits in n, used to zero-pad the file names
            int digits = n.ToString().Length;
            System.Console.WriteLine("There are " + n + " pages in the original file.");
            Document document;
            int pagenumber;
            string filename;

            for (int i = 0; i < n; i++)
            {
                pagenumber = i + 1;
                filename = pagenumber.ToString();
                while (filename.Length < digits) filename = "0" + filename;
                filename = "_" + filename + ".pdf";
                // step 1: creation of a document-object
                document = new Document(reader.GetPageSizeWithRotation(pagenumber));
                // step 2: we create a writer that listens to the document
                PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(name + filename, FileMode.Create));
                // step 3: we open the document
                document.Open();
                // step 4: we add content
                PdfContentByte cb = writer.DirectContent;
                PdfImportedPage page = writer.GetImportedPage(reader, pagenumber);
                int rotation = reader.GetPageRotation(pagenumber);
                if (rotation == 90 || rotation == 270)
                {
                    cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(pagenumber).Height);
                }
                else
                {
                    cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
                }
                // step 5: we close the document
                document.Close();
            }
        }
        catch (DocumentException de)
        {
            System.Console.Error.WriteLine(de.Message);
        }
        catch (IOException ioe)
        {
            System.Console.Error.WriteLine(ioe.Message);
        }

This is the top-left corner of one of the split pages: [image]

You can see unexpected lines, circles, ... here (and in the other corners). How can I avoid them?

score 1 · Accepted Answer

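As explained many times before (ITextSharp include all pages from the input file, Itext pdf Merge : Document overflow outside pdf (Text truncated) page and not displayed, and so on), you should read chapter 6 of my book iText in Action (you can find the C# version of the examples here).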

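You are using a combination of Document, PdfWriter and PdfImportedPage to split a PDF. Please tell me who made you do it this way, so that I can curse the person who inspired you (because I have answered this question hundreds of times, and I am getting tired of repeating myself). These classes are not a good choice for that job: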

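you lose all interactivity,
you need to rotate the content yourself if the page is in landscape (you have already discovered this),
you need to take the original page size into account,
...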

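Your problem is similar to Itext pdf Merge : Document overflow outside pdf (Text truncated) page and not displayed. Apparently, the original document you are trying to split contains both a MediaBox and a CropBox. When you look at the original document, only the content inside the CropBox is displayed. When you look at your copy, everything inside the MediaBox is displayed, which reveals the "printer marks". These printer marks show where the page needs to be cut in a publishing environment. When books or magazines are printed, the sheets the content is printed on are usually larger than the final page. The excess is trimmed away before the book or magazine is assembled.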
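To check whether this is what is happening, one can compare the two boxes for each page of the source file. The following is a minimal sketch, assuming iTextSharp 5.x; the class name `PageBoxInspector` and the method `PrintBoxes` are made up for illustration. As far as I know, `GetCropBox` falls back to the MediaBox when a page defines no CropBox, so a page is only suspicious when the two lines differ.

```csharp
using System;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class PageBoxInspector
{
    public static void PrintBoxes(string path)
    {
        PdfReader reader = new PdfReader(path);
        try
        {
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                // GetPageSize returns the MediaBox (rotation not applied).
                Rectangle mediaBox = reader.GetPageSize(i);
                // GetCropBox returns the CropBox (rotation not applied).
                Rectangle cropBox = reader.GetCropBox(i);

                Console.WriteLine("Page {0}", i);
                Console.WriteLine("  MediaBox: [{0} {1} {2} {3}]",
                    mediaBox.Left, mediaBox.Bottom, mediaBox.Right, mediaBox.Top);
                Console.WriteLine("  CropBox:  [{0} {1} {2} {3}]",
                    cropBox.Left, cropBox.Bottom, cropBox.Right, cropBox.Top);
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

If the CropBox is smaller than the MediaBox, the area between them is where the printer marks live.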

Long story short: read the documentation, replace PdfWriter with PdfCopy, and replace AddTemplate() with AddPage().
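The replacement described above might look roughly like the following. This is a sketch under assumptions rather than the book's example: it assumes iTextSharp 5.x, and the names `PdfSplitter`, `Split` and `outputDir` are hypothetical. Because `PdfCopy` copies the page dictionary as a whole, the page boxes and the rotation come along with the page, so no manual `AddTemplate()` transformation is needed.

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class PdfSplitter
{
    // Writes every page of sourcePath to its own file in outputDir.
    public static void Split(string sourcePath, string outputDir)
    {
        string name = Path.GetFileNameWithoutExtension(sourcePath);
        PdfReader reader = new PdfReader(sourcePath);
        try
        {
            int n = reader.NumberOfPages;
            int digits = n.ToString().Length; // width used for zero-padding

            for (int pageNumber = 1; pageNumber <= n; pageNumber++)
            {
                string suffix = pageNumber.ToString().PadLeft(digits, '0');
                string target = Path.Combine(outputDir, name + "_" + suffix + ".pdf");

                // No page size is passed: PdfCopy takes it from the imported page.
                Document document = new Document();
                using (FileStream fs = new FileStream(target, FileMode.Create))
                {
                    PdfCopy copy = new PdfCopy(document, fs);
                    document.Open();
                    // Copy the page as-is instead of drawing it as a template.
                    copy.AddPage(copy.GetImportedPage(reader, pageNumber));
                    document.Close();
                }
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A call could look like `PdfSplitter.Split(@"d:\С.pdf", @"d:\out");`, where the output folder is only an example and is assumed to exist already.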
