I want to use lxml/XPath to find certain img elements and write a short PHP snippet into their src attributes. Like this:
from lxml import html
htmldoc = html.document_fromstring(htmlstr)
imgs = htmldoc.xpath("//*[@class='someclass']/img")
imgs[0].attrib['src'] = "<?php echo get_img_file(); ?>"
processedHTML = html.tostring(htmldoc, pretty_print=True)
with open("test.php", "w+") as outfile:
    outfile.write(processedHTML.decode("utf-8"))
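Illegal characters such as < and > are escaped to HTML entities in the output. Is there a way to configure lxml so that these characters are written to the document unescaped? Thanks!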
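
For reference, one workaround that might apply here is to leave the serializer's escaping alone: write a harmless placeholder into src, serialize, then swap the placeholder for the PHP snippet in the resulting string. The following is only a minimal sketch under that assumption. The names `PLACEHOLDER`, `PHP_SNIPPET`, and `inject_php_src` are hypothetical and not part of lxml, and the placeholder token is assumed not to occur anywhere else in the document.

```python
from lxml import html

# Hypothetical token; assumed not to appear anywhere else in the document.
PLACEHOLDER = "__PHP_IMG_SRC__"
PHP_SNIPPET = "<?php echo get_img_file(); ?>"


def inject_php_src(htmlstr, xpath_expr="//*[@class='someclass']/img"):
    """Set src on the matched img elements to a placeholder, serialize,
    then replace the placeholder with the raw PHP snippet."""
    htmldoc = html.document_fromstring(htmlstr)
    for img in htmldoc.xpath(xpath_expr):
        img.set("src", PLACEHOLDER)
    # encoding="unicode" makes tostring return str instead of bytes.
    serialized = html.tostring(htmldoc, pretty_print=True, encoding="unicode")
    # The substitution happens after serialization, so lxml never sees
    # (and never escapes) the angle brackets.
    return serialized.replace(PLACEHOLDER, PHP_SNIPPET)


if __name__ == "__main__":
    sample = '<div class="someclass"><img src="a.png"></div>'
    with open("test.php", "w", encoding="utf-8") as outfile:
        outfile.write(inject_php_src(sample))
```

The trade-off is that the final file is no longer something lxml produced directly, so any later parsing step would need to treat the PHP snippet as opaque text.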