
What I have: lines in an XML file that contain <xliff:g> tags, for example:

<string name="AAAAAAA" msgid="XXXXXXX">"Activity <xliff:g id="BBBBBBB">%1$s</xliff:g> isn\'t responding."\n\n"Do you want to close it?"</string>

What I need: to read the whole string so that it comes out like this:

Activity %1$s isn't responding.\n\nDo you want to close it?

Can you help?

I tried using xml.dom.minidom.

import xml.dom.minidom

dom = xml.dom.minidom.parse(xmlfile)
strings = dom.getElementsByTagName('string')
for string in strings:
    # childNodes[0] is only the first text node, i.e. the text before <xliff:g>
    rText = string.childNodes[0].nodeValue
    print(rText)

The result is "Activity


2 Answers


You can use an XML parser such as BeautifulSoup, which is very easy to use (in my opinion):

>>> myxml = "thexmlyouposted"
>>> from bs4 import BeautifulSoup as BS
>>> soup = BS(myxml, 'xml')
>>> print(soup.find('string').text)
"Activity %1$s isn't responding."

"Do you want to close it?"
answered 2013-06-17T08:23:52.943

I will assume the element is part of a larger file. For example:

<strings xmlns:xliff="some-name-space">
  <string name="AAAAAAA" msgid="XXXXXXX">"Activity <xliff:g id="BBBBBBB">%1$s</xliff:g> isn\'t responding."\n\n"Do you want to close it?"</string>
  <string name="AAAAAAA" msgid="XXXXXXX">"Another <xliff:g id="BBBBBBB">%1$s</xliff:g>message</string>
</strings>

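minidom works as well as any other framework here. Open the file and iterate over all the string elements, calling the function get_text on each one. get_text, defined below, recursively collects the content (nodeValue) of every text node under the element, including text nested inside child elements such as <xliff:g>.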

import xml.dom.minidom as md

def get_text(el):
    """get_text
    For text nodes, return the text. For element nodes, recursively call the
    function to aggregate all the text nodes into a string."""
    msg = ''
    for n in el.childNodes:
        if n.nodeType == n.TEXT_NODE:
            msg += n.nodeValue
        elif n.nodeType == n.ELEMENT_NODE:
            msg += get_text(n)
    return msg

# get_text must be defined before the loop that calls it
dom = md.parse('wu.xml')
strings = dom.getElementsByTagName('string')
for string in strings:
    print(get_text(string))

There are many other ways to do this.
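As one of those alternatives, here is a minimal sketch using the standard library's xml.etree.ElementTree, whose itertext() method walks an element and yields all nested text in document order. It also strips the double quotes and unescapes \' to get the exact string asked for in the question. That cleanup is an assumption about what is wanted, not a full implementation of Android's string-resource escaping rules. SAMPLE and clean_text are names made up for this example.

```python
import xml.etree.ElementTree as ET

# Sample document; the namespace URI is a placeholder, as in the example above.
SAMPLE = r'''<strings xmlns:xliff="some-name-space">
  <string name="AAAAAAA" msgid="XXXXXXX">"Activity <xliff:g id="BBBBBBB">%1$s</xliff:g> isn\'t responding."\n\n"Do you want to close it?"</string>
</strings>'''


def clean_text(el):
    """Join all text inside el, then drop double quotes and unescape apostrophes."""
    # itertext() yields the element's own text plus the text of nested elements
    text = ''.join(el.itertext())
    # Assumed cleanup: remove the literal quotes and turn \' back into '
    return text.replace('"', '').replace("\\'", "'")


root = ET.fromstring(SAMPLE)
results = [clean_text(s) for s in root.iter('string')]
for line in results:
    print(line)
```

The \n sequences are left untouched because in the XML they are a literal backslash followed by n, which is what the question shows in the desired output.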

answered 2013-06-17T09:42:07.790