120

I'm building a GUI front end for the Eve Online API in Python.

I have successfully pulled the XML data from their server.

I am trying to get the value from a node called "name":

from xml.dom.minidom import parse
dom = parse("C:\\eve.xml")
name = dom.getElementsByTagName('name')
print(name)

This seems to find the node, but the output is below:

[<DOM Element: name at 0x11e6d28>]

How can I get it to print the value of the node?

4

9 Answers

180

It should just be

name[0].firstChild.nodeValue
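
As a minimal, self-contained sketch of the above: the XML string here is a made-up stand-in for the contents of C:\\eve.xml, and the names SAMPLE_XML and first_name_text are only for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical stand-in for the contents of eve.xml.
SAMPLE_XML = "<eveapi><name>Some Character</name></eveapi>"

dom = parseString(SAMPLE_XML)
name = dom.getElementsByTagName("name")

# getElementsByTagName returns a list of Element nodes. The text is held
# in a separate Text node, which is the element's first child here.
first_name_text = name[0].firstChild.nodeValue
print(first_name_text)
```

Note that firstChild is None for an empty element such as <name/>, so this line raises AttributeError in that case.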
answered 2008-11-25T13:59:13.823
60

Probably something like this, if it's the text part you want...

from xml.dom.minidom import parse
dom = parse("C:\\eve.xml")
name = dom.getElementsByTagName('name')

print(" ".join(t.nodeValue for t in name[0].childNodes if t.nodeType == t.TEXT_NODE))

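The text part of a node is considered a node in itself, placed as a child node of the one you asked for. Thus you will want to go through all of its children and find all child nodes that are text nodes. A node can have several text nodes; e.g.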

<name>
  blabla
  <somestuff>asdf</somestuff>
  znylpx
</name>

You want both 'blabla' and 'znylpx'; hence the " ".join(). You might want to replace the space with a newline or so, or perhaps with nothing.
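
Here is a rough sketch of that idea run against the example above. The helper name direct_text is hypothetical, and stripping the surrounding whitespace from each text node is an extra step I added so the indentation and newlines do not end up in the result.

```python
from xml.dom.minidom import parseString

SAMPLE_XML = """<name>
  blabla
  <somestuff>asdf</somestuff>
  znylpx
</name>"""

dom = parseString(SAMPLE_XML)
name = dom.getElementsByTagName("name")


def direct_text(element, sep=" "):
    """Join the text nodes that are direct children of element."""
    parts = (
        child.nodeValue.strip()
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )
    # Skip text nodes that contained only whitespace.
    return sep.join(part for part in parts if part)


text = direct_text(name[0])
print(text)
```

The text inside <somestuff> is not included, because it belongs to a text node one level further down.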

answered 2008-11-25T14:21:08.693
12

You can use something like this. It worked for me:

from xml.dom.minidom import parse

doc = parse('C:\\eve.xml')
my_node_list = doc.getElementsByTagName("name")
my_n_node = my_node_list[0]
my_child = my_n_node.firstChild
my_text = my_child.data
print(my_text)
answered 2011-01-29T07:28:23.907
9

I know this question is pretty old now, but I thought you might have an easier time with ElementTree

from xml.etree import ElementTree as ET
import datetime

f = ET.XML(data)  # data is assumed to hold the XML string returned by the API

for element in f:
    if element.tag == "currentTime":
        # Handle time data was pulled
        currentTime = datetime.datetime.strptime(element.text, "%Y-%m-%d %H:%M:%S")
    if element.tag == "cachedUntil":
        # Handle time until next allowed update
        cachedUntil = datetime.datetime.strptime(element.text, "%Y-%m-%d %H:%M:%S")
    if element.tag == "result":
        # Process list of skills
        pass

I know that's not super specific, but I just discovered it, and so far it's a lot easier to get my head around than the minidom (since so many nodes are essentially white space).

For instance, you have the tag name and the actual text together, just as you'd probably expect:

>>> element[0]
<Element currentTime at 40984d0>
>>> element[0].tag
'currentTime'
>>> element[0].text
'2010-04-12 02:45:45'
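
For the original question (getting the text of a "name" node), a small ElementTree sketch might look like the following. The sample document is made up for illustration and is not the real API response layout.

```python
import datetime
from xml.etree import ElementTree as ET

# Hypothetical sample; the real API response will look different.
SAMPLE_XML = """<eveapi>
  <currentTime>2010-04-12 02:45:45</currentTime>
  <result><name>Some Character</name></result>
</eveapi>"""

root = ET.fromstring(SAMPLE_XML)

# findtext returns the text of the first matching element, or None.
# ".//name" searches all descendants of the root.
name_text = root.findtext(".//name")

# "currentTime" is a direct child of the root, so a plain tag name works.
current_time = datetime.datetime.strptime(
    root.findtext("currentTime"), "%Y-%m-%d %H:%M:%S"
)

print(name_text)
print(current_time)
```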
answered 2010-04-13T00:18:27.940
8

The above answer is correct, namely:

name[0].firstChild.nodeValue

However for me, like others, my value was further down the tree:

name[0].firstChild.firstChild.nodeValue

To find this I used the following:

def scandown( elements, indent ):
    for el in elements:
        print("   " * indent + "nodeName: " + str(el.nodeName) )
        print("   " * indent + "nodeValue: " + str(el.nodeValue) )
        print("   " * indent + "childNodes: " + str(el.childNodes) )
        scandown(el.childNodes, indent + 1)

scandown( doc.getElementsByTagName('text'), 0 )

Running this on my simple SVG file, created with Inkscape, gave me:

nodeName: text
nodeValue: None
childNodes: [<DOM Element: tspan at 0x10392c6d0>]
   nodeName: tspan
   nodeValue: None
   childNodes: [<DOM Text node "'MY STRING'">]
      nodeName: #text
      nodeValue: MY STRING
      childNodes: ()
nodeName: text
nodeValue: None
childNodes: [<DOM Element: tspan at 0x10392c800>]
   nodeName: tspan
   nodeValue: None
   childNodes: [<DOM Text node "'MY WORDS'">]
      nodeName: #text
      nodeValue: MY WORDS
      childNodes: ()

I used xml.dom.minidom; the various fields are explained on the MiniDom Python page.

answered 2016-07-20T12:15:18.943
3

Here is a slightly modified version of Henrik's answer for multiple nodes (i.e. when getElementsByTagName returns more than one instance):

images = xml.getElementsByTagName("imageUrl")
for i in images:
    print(" ".join(t.nodeValue for t in i.childNodes if t.nodeType == t.TEXT_NODE))
answered 2011-10-24T18:51:35.223
2

I had a similar case; what worked for me was:

name.firstChild.childNodes[0].data

XML is supposed to be simple, and it really is. I don't know why Python's minidom made it so complicated, but that's how it's made.

answered 2011-10-06T03:10:30.693
2

The question has been answered; my contribution is to clarify one thing that may confuse beginners:

Some of the suggested, correct answers used firstChild.data and others used firstChild.nodeValue. In case you are wondering what the difference between them is: for text nodes they do the same thing, because nodeValue is just an alias for data.

The reference for my statement can be found as a comment in the minidom source code:

#nodeValue is an alias for data
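
A quick way to check this yourself is sketched below. The sample XML and variable names are made up for illustration.

```python
from xml.dom.minidom import parseString

dom = parseString("<name>Some Character</name>")
element = dom.documentElement
text_node = element.firstChild

# On a text node, both attributes read the same underlying string.
same_value = text_node.data == text_node.nodeValue

# Writing through one name is visible through the other.
text_node.data = "Changed"
after_write = text_node.nodeValue

# On the element itself, nodeValue is None; the text is in the child.
element_value = element.nodeValue

print(same_value, after_write, element_value)
```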

answered 2018-04-18T08:52:08.877
1

It's a tree, and there may be nested elements. Try:

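from xml.dom import Node

def innerText(node, sep=''):
    # Collect the text of node and all of its descendants, in document order.
    t = ""
    for curNode in node.childNodes:
        if curNode.nodeType == Node.TEXT_NODE:
            t += sep + curNode.nodeValue
        elif curNode.nodeType == Node.ELEMENT_NODE:
            # minidom nodes have no innerText method, so recurse via the function.
            t += sep + innerText(curNode, sep=sep)
    return t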
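
As a usage sketch, here is a standalone variant of the same recursive idea. The name inner_text and the sample documents are hypothetical, and unlike the function above it joins the pieces with sep rather than prefixing each piece with it.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def inner_text(node, sep=""):
    """Return the text of node and all of its descendants, in order."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            parts.append(child.nodeValue)
        elif child.nodeType == Node.ELEMENT_NODE:
            # Descend into nested elements such as <tspan>.
            parts.append(inner_text(child, sep=sep))
    return sep.join(parts)


# Text nested one level down, as in the SVG case from another answer.
nested = parseString("<text><tspan>MY STRING</tspan></text>")
nested_text = inner_text(nested.documentElement)

# Text mixed with a child element.
mixed = parseString("<name>a<b>b</b>c</name>")
mixed_text = inner_text(mixed.documentElement)

print(nested_text)
print(mixed_text)
```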
answered 2020-04-03T22:37:54.927