python - All nodeValue fields are None when parsing XML

Question

I'm building a simple web-based RSS reader in Python, but I'm having trouble parsing the XML. I started by trying a few things in the Python command line.

>>> from xml.dom import minidom
>>> import urllib2
>>> url = 'http://www.digg.com/rss/index.xml'
>>> xmldoc = minidom.parse(urllib2.urlopen(url))
>>> channelnode = xmldoc.getElementsByTagName("channel")
>>> titlenode = channelnode[0].getElementsByTagName("title")
>>> print titlenode[0]
<DOM Element: title at 0xb37440> 
>>> print titlenode[0].nodeValue 
None

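I played around with it for a while, but the nodeValue of everything seems to be None. Yet if you look at the XML, there are definitely values in there. What am I doing wrong?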

score 17 · Accepted Answer

For RSS feeds you should try the Universal Feed Parser library (feedparser). It simplifies the handling of RSS feeds immensely.

import feedparser
d = feedparser.parse('http://www.digg.com/rss/index.xml')
title = d.channel.title

score 10 · Accepted Answer

This is the syntax you're looking for:

>>> print titlenode[0].firstChild.nodeValue
digg.com: Stories / Popular

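Note that the node value is a logical descendant of the node itself: the text sits in a child Text node of the title element, and in the DOM an Element's own nodeValue is always None.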
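
To make the element/text distinction concrete, here is a minimal sketch that runs without network access. The inline SAMPLE_RSS string and the get_text helper are assumptions made for illustration (they are not part of the question or of minidom); the sample only stands in for the real Digg feed.

```python
from xml.dom import minidom

# Hypothetical inline feed standing in for the Digg RSS document.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>digg.com: Stories / Popular</title>
    <link>http://www.digg.com/</link>
  </channel>
</rss>"""


def get_text(element):
    """Concatenate the data of all text and CDATA children of element."""
    parts = []
    for child in element.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
    return "".join(parts)


xmldoc = minidom.parseString(SAMPLE_RSS)
channel = xmldoc.getElementsByTagName("channel")[0]
title = channel.getElementsByTagName("title")[0]

print(title.nodeValue)             # an Element node has no value of its own
print(title.firstChild.nodeValue)  # the text lives in the child Text node
print(get_text(title))             # also copes with zero or several text children
```

One reason to prefer a helper like get_text over firstChild.nodeValue: for an empty element firstChild is None, so firstChild.nodeValue raises AttributeError, whereas the helper simply returns an empty string.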
