我正在尝试使用以下 python 代码将元数据写入 pdf 文件:
from Foundation import *
from Quartz import *
url = NSURL.fileURLWithPath_("test.pdf")
pdfdoc = PDFDocument.alloc().initWithURL_(url)
assert pdfdoc, "failed to create document"
print "reading pdf file"
attrs = {}
attrs[PDFDocumentTitleAttribute] = "THIS IS THE TITLE"
attrs[PDFDocumentAuthorAttribute] = "A. Author and B. Author"
PDFDocumentTitleAttribute = "test"
pdfdoc.setDocumentAttributes_(attrs)
pdfdoc.writeToFile_("mynewfile.pdf")
print "pdf made"
这似乎工作正常(控制台没有错误),但是当我检查文件的元数据时,如下所示:
PdfID0:
242b7e252f1d3fdd89b35751b3f72d3
PdfID1:
242b7e252f1d3fdd89b35751b3f72d3
NumberOfPages: 4
原始文件具有以下元数据:
InfoKey: Creator
InfoValue: PScript5.dll Version 5.2.2
InfoKey: Title
InfoValue: Microsoft Word - PROGRESS ON THE GABION HOUSE Compressed.doc
InfoKey: Producer
InfoValue: GPL Ghostscript 8.15
InfoKey: Author
InfoValue: PWK
InfoKey: ModDate
InfoValue: D:20101021193627-05'00'
InfoKey: CreationDate
InfoValue: D:20101008152350Z
PdfID0: d5fd6d3960122ba72117db6c4d46cefa
PdfID1: 24bade63285c641b11a8248ada9f19
NumberOfPages: 4
所以问题是,它不是附加元数据,而是清除以前的元数据结构。我需要做什么才能使其正常工作?我的目标是附加参考管理系统可以导入的元数据。