c# - How to programmatically detect XFA (Adobe XML Forms Architecture) dynamic PDFs

Question

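I have a system that converts pdf to tif. Basically it is a program written in C# that uses iTextSharp to get metadata about the pdf and pdf2tif (http://pdftotif.sourceforge.net/) to convert the files. I have noticed that many pdfs do not convert correctly. In Acrobat and Foxit they open as multi-page documents, but in any other viewer (Ghostscript ...) they open as a 1-page document with the message: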

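"To view the full contents of this document, you need a later version of the PDF viewer. You can upgrade to the latest version of Adobe Reader from www.adobe.com/products/acrobat/readstep2.html. For further support, go to http://www.adobe.com/support/products/acrreader.html"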

Some looking around tells me that these are XFA dynamic PDFs. Is there any way to detect this programmatically, so that I can try to handle these pdfs differently?

score 2 · Accepted Answer

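The iText API is a good place to start.

In iTextSharp you access properties of an object instead of calling getter methods. (If you have done a fair amount of work with iTextSharp, you probably already know this.)

Anyway, here is a simple example using an HTTP handler: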

<%@ WebHandler Language="C#" Class="iTextXfa" %>
using System;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.pdf;

public class iTextXfa : IHttpHandler {
  public void ProcessRequest (HttpContext context) {
    HttpServerUtility Server = context.Server;
    // Two sample files: one plain PDF and one XFA form.
    string[] testFiles = {
      Server.MapPath("./non-XFA.pdf"), Server.MapPath("./XFA.pdf")
    };
    foreach (string file in testFiles) {
      PdfReader reader = new PdfReader(file);
      try {
        // XfaForm looks for an XFA entry in the PDF's AcroForm dictionary;
        // XfaPresent is a property, not a method.
        XfaForm xfa = new XfaForm(reader);
        context.Response.Write(string.Format(
          "<p>File: {0} is XFA: {1}</p>",
          HttpUtility.HtmlEncode(file),
          xfa.XfaPresent ? "YES" : "NO"
        ));
      }
      finally {
        reader.Close();  // release the file handle
      }
    }
  }
  public bool IsReusable { get { return false; } }
}

score 0 · Answer

A command-line approach:

strings document.pdf | grep XFA

If you get a line or two of output, you are probably dealing with an XFA PDF:

<</Names[(!ADBE::0100_VersChkStrings) 364 0 R(!ADBE::0100_VersChkVars) 365 0 R(!ADBE::0200_VersChkCode_XFACheck) 366 0 R]>>
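For a whole batch of files, the same idea can be wrapped in a small script. This is only a rough sketch under a few assumptions: the function names `has_xfa_marker` and `classify_pdfs` are made up for illustration, it searches for the `/XFA` key that a PDF's AcroForm dictionary uses rather than the bare string `XFA`, and it relies on the `-a` option of GNU or BSD grep. Like the `strings` approach, it reads raw bytes only, so it will miss the key when that dictionary is stored inside a compressed object stream. Treat a negative result as "no marker found", not as proof that the file is not XFA.

```shell
#!/bin/sh
# has_xfa_marker FILE (hypothetical helper name)
# Succeeds if the raw bytes of FILE contain the "/XFA" key.
has_xfa_marker() {
  # LC_ALL=C: treat the file as plain bytes; -a: search binary files as text;
  # -q: no output, exit status only; --: stop option parsing.
  LC_ALL=C grep -a -q -- '/XFA' "$1"
}

# classify_pdfs FILE... (hypothetical helper name)
# Prints one "path<TAB>verdict" line per file.
classify_pdfs() {
  for f in "$@"; do
    if has_xfa_marker "$f"; then
      printf '%s\tXFA\n' "$f"
    else
      printf '%s\tno-XFA-marker\n' "$f"
    fi
  done
}
```

After sourcing the script, `classify_pdfs *.pdf` lists each file with its verdict. Files reported as `no-XFA-marker` can then be passed to a parser-based check such as the iTextSharp one, which also reads compressed objects.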
