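This seems like it should be quick to do, but in practice it is causing problems. I have a batch of PDF forms that contain form fields and embedded JavaScript. I want to safely strip the JavaScript code while leaving the PDF form fields intact.
So far I have found plenty of solutions, but every one of them either removes both the JavaScript and the form fields, or leaves both in place.
Here is solution A; it copies both the form fields and the JavaScript: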
var pdfReader = new PdfReader(infilename);
using (MemoryStream memoryStream = new MemoryStream()) {
    // PdfCopyFields copies the whole document, including the AcroForm and any scripts.
    PdfCopyFields copy = new PdfCopyFields(memoryStream);
    copy.AddDocument(pdfReader);
    copy.Close();
    File.WriteAllBytes(rawfilename, memoryStream.ToArray());
}
Alternatively, I have solution B, which strips out both the form fields and the JavaScript:
Document document = new Document();
using (MemoryStream memoryStream = new MemoryStream()) {
    PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
    document.Open();
    document.AddDocListener(writer);
    for (int p = 1; p <= pdfReader.NumberOfPages; p++) {
        document.SetPageSize(pdfReader.GetPageSize(p));
        document.NewPage();
        PdfContentByte cb = writer.DirectContent;
        // Importing a page copies only its content stream, so fields and scripts are lost.
        PdfImportedPage pageImport = writer.GetImportedPage(pdfReader, p);
        int rot = pdfReader.GetPageRotation(p);
        if (rot == 90 || rot == 270) {
            // Rotated page: apply a rotation matrix so the content appears upright.
            cb.AddTemplate(pageImport, 0, -1.0F, 1.0F, 0, 0, pdfReader.GetPageSizeWithRotation(p).Height);
        } else {
            cb.AddTemplate(pageImport, 1.0F, 0, 0, 1.0F, 0, 0);
        }
    }
    document.Close();
    File.WriteAllBytes(rawfile, memoryStream.ToArray());
}
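Does anyone know how to modify solution A or B so that it removes the JavaScript but keeps the form fields?
Edit: here is the code for the solution! It opens the document with a PdfStamper, which keeps the form fields, and removes the AA, JS, and JavaScript entries from every dictionary in the file: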
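using (MemoryStream memoryStream = new MemoryStream()) {
    PdfStamper stamper = new PdfStamper(pdfReader, memoryStream);
    // Valid object numbers run from 0 to XrefSize - 1.
    for (int i = 0; i < pdfReader.XrefSize; i++) {
        PdfObject o = pdfReader.GetPdfObject(i);
        // Streams are dictionaries too, so this cast covers both kinds of object.
        PdfDictionary pd = o as PdfDictionary;
        if (pd != null) {
            pd.Remove(PdfName.AA);          // additional-actions entries
            pd.Remove(PdfName.JS);          // script source of JavaScript actions
            pd.Remove(PdfName.JAVASCRIPT);  // document-level JavaScript name tree entry
        }
    }
    // Close the stamper first so the output is flushed to the memory stream.
    stamper.Close();
    pdfReader.Close();
    File.WriteAllBytes(rawfile, memoryStream.ToArray());
}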
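To check the result, something like the following sketch could be run against the cleaned file. It is not part of the original solution: the class name JavaScriptCheck, the helper CountScriptEntries, and the command-line argument are made up for illustration, and it assumes iTextSharp 5.x. It reports how many form fields survived and how many dictionaries still carry one of the three keys removed above.

```csharp
using System;
using iTextSharp.text.pdf;

static class JavaScriptCheck
{
    // Hypothetical helper: counts dictionaries that still have an
    // /AA, /JS, or /JavaScript key after cleaning.
    public static int CountScriptEntries(PdfReader reader)
    {
        int hits = 0;
        for (int i = 0; i < reader.XrefSize; i++)
        {
            PdfDictionary dict = reader.GetPdfObject(i) as PdfDictionary;
            if (dict == null) continue;
            if (dict.Contains(PdfName.AA) ||
                dict.Contains(PdfName.JS) ||
                dict.Contains(PdfName.JAVASCRIPT))
            {
                hits++;
            }
        }
        return hits;
    }

    static void Main(string[] args)
    {
        // args[0] is assumed to be the path of the cleaned PDF.
        PdfReader reader = new PdfReader(args[0]);
        try
        {
            Console.WriteLine("Form fields: " + reader.AcroFields.Fields.Count);
            Console.WriteLine("Dictionaries with script keys: " + CountScriptEntries(reader));
        }
        finally
        {
            reader.Close();
        }
    }
}
```

If the cleanup worked, the field count should match the original form and the second number should be zero. This only looks for the three keys the solution removes, so it is a sanity check rather than proof that no script remains.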