2

I am trying to extract an escaped node from an XML document. The raw text of the node looks like this:

<Notes>{&quot;Phase&quot;: 0, &quot;Flipper&quot;: 0, &quot;Guide&quot;: 0,     
&quot;Sample&quot;: 0, &quot;Triangle8&quot;: 0, &quot;Triangle5&quot;: 0,     
&quot;Triangle4&quot;: 0, &quot;Triangle7&quot;: 0, &quot;Triangle6&quot;: 0,     
&quot;Triangle1&quot;: 0, &quot;Triangle3&quot;: 0, &quot;Triangle2&quot;: 0}</Notes> 

I pull the text out as follows:

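# Assumed: ET is the standard-library ElementTree module
import xml.etree.ElementTree as ET

infile = ET.parse("C:/userfiles/EXP011/SESAME_60/SESAME_60_runinfo.xml")
r = infile.getroot()
# Namespace prefix in ElementTree's "{uri}tag" form
XMLNS = "{http://example.com/foo/bar/runinfo_v4_3}"
x = r.find(".//" + XMLNS + "Notes")
print(x.text)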

I expected to get:

{"Phase": 0, "Flipper": 0, "Guide": 0,
"Sample": 0, "Triangle8": 0, "Triangle5": 0,     
"Triangle4": 0, "Triangle7": 0, "Triangle6": 0,     
"Triangle1": 0, "Triangle3": 0, "Triangle2": 0}

But instead, I get:

 {&quot;Phase&quot;: 0, &quot;Flipper&quot;: 0, &quot;Guide&quot;: 0,      
 &quot;Sample&quot;: 0, &quot;Triangle8&quot;: 0, &quot;Triangle5&quot;: 0,   
 &quot;Triangle4&quot;: 0, &quot;Triangle7&quot;: 0, &quot;Triangle6&quot;: 0, 
 &quot;Triangle1&quot;: 0, &quot;Triangle3&quot;: 0, &quot;Triangle2&quot;: 0}

How do I get the unescaped string?

4

3 Answers

6

Use HTMLParser.HTMLParser() (Python 2):

In [8]: import HTMLParser    

In [11]: HTMLParser.HTMLParser().unescape('&quot;')
Out[11]: u'"'

saxutils handles &lt;, &gt; and &amp;, but it does not handle &quot; by default:

In [9]: import xml.sax.saxutils as saxutils

In [10]: saxutils.unescape('&quot;')
Out[10]: '&quot;'    
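
If you would rather stay with saxutils, its unescape function accepts an optional second argument: a dictionary of extra entity replacements. The following is a minimal sketch of that approach; the sample string is a shortened, made-up version of the Notes text from the question.

```python
from xml.sax.saxutils import unescape

# Shortened sample of the escaped Notes text (illustrative, not the real file)
text = '{&quot;Phase&quot;: 0, &quot;Flipper&quot;: 0}'

# The second argument maps additional entities to their replacements;
# &lt;, &gt; and &amp; are still handled automatically.
result = unescape(text, {"&quot;": '"'})
print(result)
```

The same dictionary can carry other entities, such as &apos;, if the text needs them.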
Answered 2012-09-10T17:25:42.650
3

Since Python 3.4 you can use html.unescape:

>>> from html import unescape
>>> unescape('&quot;')
'"'
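
Applied to the question, this might look like the sketch below. It assumes the element text still contains literal &quot; sequences after parsing; the sample document reproduces that by escaping the quotes twice, which is a guess about the input rather than something the question confirms. The sample string and the final json.loads step are illustrative additions.

```python
import json
import xml.etree.ElementTree as ET
from html import unescape

# Hypothetical, shortened document in which the quotes were escaped twice,
# so the parser leaves literal &quot; sequences in the text.
doc = '<Notes>{&amp;quot;Phase&amp;quot;: 0, &amp;quot;Flipper&amp;quot;: 0}</Notes>'

node = ET.fromstring(doc)
raw = node.text           # still contains &quot;
clean = unescape(raw)     # plain double quotes
notes = json.loads(clean) # optional: the Notes payload looks like JSON

print(raw)
print(clean)
print(notes["Phase"])
```

If the file escapes the quotes only once, the parser should already return plain quotes in x.text and the unescape call is unnecessary.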
Answered 2019-02-20T14:25:55.307
0

For some reason I did not manage to unescape &quot; in Python 2.7.5, but I found a workaround to get " instead of &quot; in the XML file, using the replace function:

with open(xmlfilename, 'w') as f:
     f.write(myxml.toprettyxml().replace("&quot;",'"'))
Answered 2019-08-01T10:23:48.780