asp.net - How to get the destination page number of a section in a PDF file using iTextSharp?

Question

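I have a PDF file with an index page whose entries link to sections on their target pages. I can get the section names (Section 1.1, Section 5.2), but I can't get the destination page number.

For example: http://www.mikesdotnetting.com/Article/84/iTextSharp-Links-and-Bookmarks

Here is my code: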

string FileName = AppDomain.CurrentDomain.BaseDirectory + "TestPDF.pdf";
PdfReader pdfreader = new PdfReader(FileName);
PdfDictionary PageDictionary = pdfreader.GetPageN(9);
PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);       
if ((Annots == null) || (Annots.Length == 0))
    return;

foreach (PdfObject oAnnot in Annots.ArrayList)
{
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);          

    if (AnnotationDictionary.Keys.Contains(PdfName.A))
    {
        PdfDictionary oALink = AnnotationDictionary.GetAsDict(PdfName.A);

        if (oALink.Get(PdfName.S).Equals(PdfName.GOTO))
        {
            if (oALink.Keys.Contains(PdfName.D))
            {
                PdfObject objs = oALink.Get(PdfName.D);
                if (objs.IsString())
                {
                    string SectionName = objs.ToString(); // here i could see the section name...
                }
            }
        }
    }
}

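How can I get the destination page number?

I am also unable to get the section names for some PDFs, for example: http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/adobe_supplement_iso32000.pdf

Page 9 of this PDF contains sections that I can't retrieve. Please suggest a solution.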

score 3 · Accepted Answer

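A link annotation can specify its target in one of two ways: through an A entry or through a Dest entry. The A entry is the more powerful of the two but is often overkill. The Dest entry just specifies an indirect reference to a page along with some fitting and zooming options.

The Dest value can be a few different things, but usually (as far as I've seen) it is a string naming a destination. You can look up named destinations in the document's named destination dictionary. So add this before your main loop so that you can refer to it later: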

//Get all existing named destinations
Dictionary<string, PdfObject> dests = pdfreader.GetNamedDestinationFromStrings();

Once you have the Dest value as a string, you can use it as a key to look up the destination in the dictionary above.

PdfArray thisDest = (PdfArray)dests[AnnotationDictionary.GetAsString(PdfName.DEST).ToString()];

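The first item in the returned array is the kind of indirect reference you are already used to. (Actually, the first item could be an integer representing a page number in a remote document, so you may need to check for that.)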

PdfIndirectReference a = (PdfIndirectReference)thisDest[0];
PdfObject thisPage = PdfReader.GetPdfObject(a);
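
The page object by itself does not say which page number it is. One way to recover the number, sketched here on the assumption that you are using iTextSharp 5.x, is to compare the destination's indirect reference with each page's original reference from `PdfReader.GetPageOrigRef`. The class and method names (`DestinationPages`, `GetPageNumber`) are made up for illustration, and the remote-document case is simply reported as "not found" rather than handled.

```csharp
using iTextSharp.text.pdf;

static class DestinationPages
{
    // Hypothetical helper: returns the 1-based page number that a destination
    // array points to, or -1 if it cannot be resolved within this document.
    public static int GetPageNumber(PdfReader reader, PdfArray dest)
    {
        if (dest == null || dest.Size == 0)
            return -1;

        // Keep the reference itself; dereferencing it would give the page
        // dictionary, which cannot be compared by object number.
        PdfObject first = dest[0];
        if (!first.IsIndirect())
            return -1; // e.g. an integer page index, as used for remote documents

        PdfIndirectReference target = (PdfIndirectReference)first;
        for (int page = 1; page <= reader.NumberOfPages; page++)
        {
            PdfIndirectReference pageRef = reader.GetPageOrigRef(page);
            if (pageRef.Number == target.Number && pageRef.Generation == target.Generation)
                return page;
        }
        return -1; // the reference does not belong to any page of this document
    }
}
```

With the variables from the snippets above, this would be called as `int pageNumber = DestinationPages.GetPageNumber(pdfreader, thisDest);`. The loop is linear in the number of pages, so for a document with many links it may be worth building a lookup from object number to page number once instead.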

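Below is code that puts most of the above together, omitting some of the code you already have. According to the spec, A and Dest are mutually exclusive, so no annotation should specify both.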

//Get all existing named destinations
Dictionary<string, PdfObject> dests = pdfreader.GetNamedDestinationFromStrings();

foreach (PdfObject oAnnot in Annots.ArrayList) {
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);

    if (AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK)) {
        if (AnnotationDictionary.Contains(PdfName.A)) {
            //...Do normal A stuff here
        } else if (AnnotationDictionary.Contains(PdfName.DEST)) {
            if (AnnotationDictionary.Get(PdfName.DEST).IsString()) {//Named-based destination
                if (dests.ContainsKey(AnnotationDictionary.GetAsString(PdfName.DEST).ToString())) {//See if it exists in the global name dictionary
                    PdfArray thisDest = (PdfArray)dests[AnnotationDictionary.GetAsString(PdfName.DEST).ToString()];//Get the destination
                    PdfIndirectReference a = (PdfIndirectReference)thisDest[0];//TODO, this could actually be an integer for the case of Remote Destinations
                    PdfObject thisPage = PdfReader.GetPdfObject(a);//Get the actual PDF object
                }
            } else if(AnnotationDictionary.Get(PdfName.DEST).IsArray()) {
                //Technically possible, I think the array matches the code directly above but I don't have a sample PDF
            }
        }
    }
}
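
The code in the question reads a string from the D entry of a GoTo action, which suggests that the same named-destination lookup applies to the A branch as well. The following is a rough sketch of a single helper covering both branches, assuming iTextSharp 5.x. `LinkDestinations.Resolve` is a hypothetical name, and destinations given as a PDF name (the older Dests dictionary style) or pointing into another file are not handled.

```csharp
using System.Collections.Generic;
using iTextSharp.text.pdf;

static class LinkDestinations
{
    // Hypothetical helper: returns the explicit destination array of a link
    // annotation, or null if none can be resolved. "dests" is the dictionary
    // returned by PdfReader.GetNamedDestinationFromStrings().
    public static PdfArray Resolve(PdfDictionary annotation, Dictionary<string, PdfObject> dests)
    {
        // Prefer Dest; otherwise fall back to the D entry of a GoTo action.
        PdfObject dest = annotation.GetDirectObject(PdfName.DEST);
        if (dest == null)
        {
            PdfDictionary action = annotation.GetAsDict(PdfName.A);
            if (action == null || !PdfName.GOTO.Equals(action.GetAsName(PdfName.S)))
                return null; // no action, or an action other than GoTo
            dest = action.GetDirectObject(PdfName.D);
        }

        if (dest == null)
            return null;
        if (dest.IsArray())
            return (PdfArray)dest; // explicit destination: [page /Fit ...]
        if (dest.IsString())
        {
            // Named destination: look it up in the document-wide dictionary.
            PdfObject named;
            if (dests.TryGetValue(dest.ToString(), out named))
                return named as PdfArray;
        }
        return null;
    }
}
```

Inside the loop this could replace the nested checks with `PdfArray thisDest = LinkDestinations.Resolve(AnnotationDictionary, dests);` followed by a null check. Whichever branch the annotation uses, the first element of the returned array is the page reference discussed above.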
