Suppose I have an expat parser instantiated like this:

import xml.parsers.expat

def on_character_data(data):
    print(data)

parser = xml.parsers.expat.ParserCreate(encoding=encoding)
...
parser.CharacterDataHandler = on_character_data
...

And an XML document like this:

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
  </head>
<body>
  ampersands &amp; other annoyances
</body>
</html>

If I call parser.Parse(test_xml_string), the on_character_data() handler receives the string "ampersands & other annoyances", with &amp; replaced by &. I want expat to leave these entities alone, so that on_character_data() receives the unmodified "ampersands &amp; other annoyances". Is there any way to do this?
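
One possible workaround, offered only as a sketch: rather than stopping expat from resolving the entity, re-escape the text inside the handler with xml.sax.saxutils.escape. The names parse_keeping_entities and TEST_XML are hypothetical and exist only for this illustration. Note that this rebuilds an equivalent escaped form, not the original bytes: a numeric reference such as &#38; would also come back as &amp;. expat may also deliver the text in several chunks, so the sketch collects and joins them.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape

# Hypothetical sample input, shortened from the document in the question.
TEST_XML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
  ampersands &amp; other annoyances
</body>
</html>"""


def parse_keeping_entities(xml_text):
    """Return all character data with &, < and > escaped again."""
    chunks = []

    def on_character_data(data):
        # expat has already turned "&amp;" into "&"; escape() reverses that
        # for the three characters it handles (&, <, >).
        chunks.append(escape(data))

    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = on_character_data
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return "".join(chunks)


if __name__ == "__main__":
    print(parse_keeping_entities(TEST_XML))
```

If the exact original spelling of each reference matters, this approach is not enough, because the handler never sees which form the source used.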
