I'm merging PDF files with GemBox.Pdf as shown below. This works well, and I can easily add outlines.

I did something similar before, merging Word files with GemBox.Document as shown below.

My problem now is that GemBox.Pdf has no TOC element. I'd like a table of contents to be generated automatically when I merge multiple PDF files into one.

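Am I missing something, or does PDF really have no such element?
Do I need to recreate it myself, and if so, how would I do that?
I can add bookmarks, but I don't know how to add links that point to them.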

1 Answer

PDF files have no such element, so you need to create this content yourself.

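One way is to create the text elements, outlines, and link annotations yourself, position them appropriately, and set the link destinations to the outlines.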

However, that can be quite a lot of work, so it may be easier to create the desired TOC element with GemBox.Document, save it as a PDF file, and then import it into the resulting PDF.

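// Namespaces used by this snippet. Both GemBox.Document and GemBox.Pdf define PdfSaveOptions,
// which is why that type is fully qualified further down.
using System.Collections.Generic;
using System.IO;
using System.Linq;
using GemBox.Document;
using GemBox.Pdf;
using GemBox.Pdf.Objects;

// Source data for creating TOC entries with specified text and associated PDF files.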
var pdfEntries = new[]
{
    new { Title = "First Document Title", Pdf = PdfDocument.Load("input1.pdf") },
    new { Title = "Second Document Title", Pdf = PdfDocument.Load("input2.pdf") },
    new { Title = "Third Document Title", Pdf = PdfDocument.Load("input3.pdf") },
};

/***************************************************************/
/* Create new document with TOC element using GemBox.Document. */
/***************************************************************/

// Create new document.
var tocDocument = new DocumentModel();
var section = new Section(tocDocument);
tocDocument.Sections.Add(section);

// Create and add TOC element.
var toc = new TableOfEntries(tocDocument, FieldType.TOC);
section.Blocks.Add(toc);
section.Blocks.Add(new Paragraph(tocDocument, new SpecialCharacter(tocDocument, SpecialCharacterType.PageBreak)));

// Create heading style.
// By default, when updating TOC element a TOC entry is created for each paragraph that has heading style.
var heading1Style = (ParagraphStyle)tocDocument.Styles.GetOrAdd(StyleTemplateType.Heading1);

// Add heading and empty (placeholder) pages.
// The number of placeholder pages added depends on the page count of the actual PDF file, so that the TOC entries get correct page numbers.
int totalPageCount = 0;
foreach (var pdfEntry in pdfEntries)
{
    section.Blocks.Add(new Paragraph(tocDocument, pdfEntry.Title) { ParagraphFormat = { Style = heading1Style } });
    section.Blocks.Add(new Paragraph(tocDocument, new SpecialCharacter(tocDocument, SpecialCharacterType.PageBreak)));

    int currentPageCount = pdfEntry.Pdf.Pages.Count;
    totalPageCount += currentPageCount;

    while (--currentPageCount > 0)
        section.Blocks.Add(new Paragraph(tocDocument, new SpecialCharacter(tocDocument, SpecialCharacterType.PageBreak)));
}

// Remove last extra-added empty page.
section.Blocks.RemoveAt(section.Blocks.Count - 1);

// Update TOC element and save the document as PDF stream.
toc.Update();
var pdfStream = new MemoryStream();
tocDocument.Save(pdfStream, new GemBox.Document.PdfSaveOptions());

/***************************************************************/
/* Merge PDF files into PDF with TOC element using GemBox.Pdf. */
/***************************************************************/

// Load a PDF stream using GemBox.Pdf.
var pdfDocument = PdfDocument.Load(pdfStream);
var rootDictionary = (PdfDictionary)((PdfIndirectObject)pdfDocument.GetDictionary()[PdfName.Create("Root")]).Value;
var pagesDictionary = (PdfDictionary)((PdfIndirectObject)rootDictionary[PdfName.Create("Pages")]).Value;
var kidsArray = (PdfArray)pagesDictionary[PdfName.Create("Kids")];
var pageIds = kidsArray.Cast<PdfIndirectObject>().Select(obj => obj.Id).ToArray();

// Remove empty (placeholder) pages.
while (totalPageCount-- > 0)
    pdfDocument.Pages.RemoveAt(pdfDocument.Pages.Count - 1);

// Add pages from PDF files.
foreach (var pdfEntry in pdfEntries)
    foreach (var page in pdfEntry.Pdf.Pages)
        pdfDocument.Pages.AddClone(page);

/*****************************************************************************/
/* Update TOC links from placeholder pages to actual pages using GemBox.Pdf. */
/*****************************************************************************/

// Create a mapping from the ID of an empty (placeholder) page indirect object to the actual page indirect object.
var pageCloneMap = new Dictionary<PdfIndirectObjectIdentifier, PdfIndirectObject>();
for (int i = 0; i < kidsArray.Count; ++i)
    pageCloneMap.Add(pageIds[i], (PdfIndirectObject)kidsArray[i]);

foreach (var entry in pageCloneMap)
{
    // If page was updated, it means that we passed TOC pages, so break from the loop.
    if (entry.Key != entry.Value.Id)
        break;

    // For each TOC page, get its 'Annots' entry.
    // For each link annotation from the 'Annots' get the 'Dest' entry.
    // Update the first item in the 'Dest' array so that it no longer points to a removed page.
    if (((PdfDictionary)entry.Value.Value).TryGetValue(PdfName.Create("Annots"), out PdfBasicObject annotsObj))
        foreach (PdfIndirectObject annotObj in (PdfArray)annotsObj)
            if (((PdfDictionary)annotObj.Value).TryGetValue(PdfName.Create("Dest"), out PdfBasicObject destObj))
            {
                var destArray = (PdfArray)destObj;
                destArray[0] = pageCloneMap[((PdfIndirectObject)destArray[0]).Id];
            }
}

// Save resulting PDF file.
pdfDocument.Save("Result.pdf");
pdfDocument.Close();
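
The placeholder bookkeeping in the code above comes down to simple page arithmetic, which can be checked on its own without either library. The following is only a sketch: `TocPageMath` and `ComputeStartPages` are hypothetical names that are not part of GemBox, and it assumes the TOC occupies the first pages of the result and that every merged file has at least one page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a GemBox API): computes where each merged document starts.
public static class TocPageMath
{
    // Returns the 1-based page number on which each document begins,
    // assuming the TOC takes up the first tocPageCount pages of the result.
    public static int[] ComputeStartPages(int tocPageCount, IReadOnlyList<int> pageCounts)
    {
        var startPages = new int[pageCounts.Count];
        int nextPage = tocPageCount + 1;

        for (int i = 0; i < pageCounts.Count; i++)
        {
            startPages[i] = nextPage;
            // Each document reserves as many pages as it actually contains,
            // which is what the placeholder page breaks emulate.
            nextPage += pageCounts[i];
        }

        return startPages;
    }
}
```

For example, with a one-page TOC and documents of 3, 2, and 4 pages, `TocPageMath.ComputeStartPages(1, new[] { 3, 2, 4 })` gives 2, 5, and 7, which are the page numbers the TOC entries should show.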

This way, you can easily customize the TOC element using TOC switches and styles. For details, see the Table of Contents example for GemBox.Document.
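
The link-fixing step in the code above relies on one idea: record the page IDs before the swap, then pair each recorded ID with whatever sits at the same index afterwards. A library-free sketch of that idea follows. `LinkRemapper`, `BuildMap`, and `Retarget` are hypothetical names, plain strings stand in for the PDF object identifiers, and it assumes the page list is flat and has the same length before and after the swap.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the remapping step; strings replace PDF object IDs.
public static class LinkRemapper
{
    // Pairs each ID recorded before the swap with the ID now at the same index.
    public static Dictionary<string, string> BuildMap(
        IReadOnlyList<string> idsBefore, IReadOnlyList<string> idsAfter)
    {
        if (idsBefore.Count != idsAfter.Count)
            throw new ArgumentException("Page lists must have the same length.");

        var map = new Dictionary<string, string>();
        for (int i = 0; i < idsBefore.Count; i++)
            map.Add(idsBefore[i], idsAfter[i]);

        return map;
    }

    // Rewrites link targets in place so they no longer point at removed pages.
    public static void Retarget(IList<string> linkTargets, IReadOnlyDictionary<string, string> map)
    {
        for (int i = 0; i < linkTargets.Count; i++)
            if (map.TryGetValue(linkTargets[i], out var replacement))
                linkTargets[i] = replacement;
    }
}
```

Pages that were not replaced, such as the TOC pages themselves, map to their own ID, so links to them are left unchanged.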

Answered 2021-07-14T08:08:30.520