python - Text annotation in Python with PyMuPDF

Asked 2021-02-01T16:19:44.477

Viewed 429 times

I am using PyMuPDF to add text annotations to a .pdf file, using:

import fitz  # PyMuPDF
import re


def data_(text):
    # Yield every case-insensitive match of "amet" found in the given lines.
    annotation_text = r"(amet)"
    for line in text:
        search = re.search(annotation_text, line, re.IGNORECASE)
        if search:
            yield search.group(1)


def includeannotation(path_included):
    document = fitz.open(path_included)

    for page in document:
        page.wrap_contents()
        obs = data_(page.getText("text").split('\n'))
        # print(obs)
        for data in obs:
            catchs = page.searchFor(data)
            # Add one black redaction annotation per rectangle found.
            for catch in catchs:
                page.addRedactAnnot(catch, fontsize=11, fill=(0, 0, 0))
        page.apply_redactions()
    document.save('annotation.pdf')
    print("end - created")


path_included = '/content/document.pdf'

save_document = includeannotation(path_included)

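The source .pdf document contains the text: By applying the code above, I can add the annotation for the text "amet", with the following result:

The result looks as expected, but you can see that the library, besides adding the black annotation (for "amet"), has also removed words from the following lines without covering them with a black annotation. In fact, it looks like a redrawing problem.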

How can I avoid this problem?
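
One possible explanation, offered as an assumption rather than a confirmed diagnosis: applying redactions removes text whose bounding box overlaps a redaction rectangle, and the rectangles returned by the search may be tall enough to touch the neighbouring lines. A minimal sketch of one workaround is to shrink each hit rectangle vertically before turning it into a redaction. The helper `shrink_rect` and its `margin` parameter are hypothetical names used for illustration, not part of PyMuPDF.

```python
def shrink_rect(rect, margin=1.0):
    """Return (x0, y0, x1, y1) with the top and bottom edges moved inward.

    `rect` is any sequence of four numbers (x0, y0, x1, y1).
    `margin` is the distance in points (hypothetical default, tune per document).
    """
    x0, y0, x1, y1 = rect
    height = y1 - y0
    # Move each edge by at most a quarter of the height so the box never collapses.
    dy = max(0.0, min(margin, height / 4))
    return (x0, y0 + dy, x1, y1 - dy)


# Hypothetical use inside the page loop from the question, keeping the
# PyMuPDF calls exactly as they are written there:
#
#     for catch in page.searchFor(data):
#         page.addRedactAnnot(fitz.Rect(shrink_rect(catch)), fill=(0, 0, 0))
#     page.apply_redactions()
```

A larger `margin` lowers the chance of touching adjacent lines, but it also makes the black box thinner, so the value is a trade-off that needs checking against the actual document.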


0 answers
