python - How to parse XML with Python on Google App Engine

Question

Given the following XML, how do I fetch it and then parse it to get the value of <age>?

<boardgames>
  <boardgame objectid="13">
  <yearpublished>1995</yearpublished>
  <minplayers>3</minplayers>
  <maxplayers>4</maxplayers>
  <playingtime>90</playingtime>
  <age>10</age>
  <name sortindex="1">Catan</name>
  ...

I'm currently trying:

from google.appengine.api import urlfetch
from xml.etree import ElementTree
result = urlfetch.fetch(url=game_url)
xml = ElementTree.fromstring(result.content)

But I'm not sure I'm on the right track. I get an error when I try to parse it (I think because the XML isn't valid).

score 7 · Accepted Answer

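xml.findtext('boardgame/age') or xml.findtext('.//age') would normally give you the 10 inside <age>10</age> (the path is relative to the root <boardgames> element), but parsing seems to fail because of the invalid XML. In my experience, ElementTree does a poor job of parsing invalid XML.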

Use BeautifulSoup instead; it copes well with invalid XML.

import urllib2
from BeautifulSoup import BeautifulSoup  # assumed: BeautifulSoup 3 on Python 2

content = urllib2.urlopen('http://boardgamegeek.com/xmlapi/boardgame/13').read()
soup = BeautifulSoup(content)
print soup.find('age').string
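
To check whether the failure really comes from malformed XML, one option is to catch the parser's exception and look at where it stopped. Below is a minimal Python 3 sketch using only the standard library; the helper name `try_parse` and both sample strings are hypothetical, made up for illustration, and are not the actual API response.

```python
from xml.etree import ElementTree


def try_parse(text):
    """Return (root, None) on success, or (None, (line, column)) on a parse error."""
    try:
        return ElementTree.fromstring(text), None
    except ElementTree.ParseError as err:
        # err.position is a (line, column) tuple pointing at the failure
        return None, err.position


# Hypothetical inputs: one well-formed document, one with a mismatched closing tag.
good = "<boardgames><boardgame objectid='13'><age>10</age></boardgame></boardgames>"
bad = "<boardgames><boardgame objectid='13'><age>10</boardgame></boardgames>"

root, error = try_parse(good)
bad_root, bad_error = try_parse(bad)
```

If the reported position is near the start of the response, it may be worth printing the raw content first, since an error page or an unexpected encoding would also make the parser fail.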

score 2 · Accepted Answer

The following worked for me:

import urllib2
from xml.etree import ElementTree

result = urllib2.urlopen('http://boardgamegeek.com/xmlapi/boardgame/13').read()
xml = ElementTree.fromstring(result)
print xml.findtext(".//age")
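
How the path passed to findtext relates to the root element can be checked without a network call. This is a small Python 3 sketch, assuming a trimmed, well-formed copy of the sample from the question; the `SAMPLE` string is an illustration, not the real response.

```python
from xml.etree import ElementTree

# Trimmed, well-formed stand-in for the XML shown in the question (assumed).
SAMPLE = """<boardgames>
  <boardgame objectid="13">
    <yearpublished>1995</yearpublished>
    <age>10</age>
    <name sortindex="1">Catan</name>
  </boardgame>
</boardgames>"""

root = ElementTree.fromstring(SAMPLE)  # root is the <boardgames> element

# A bare tag name only matches direct children of the root, so this finds nothing.
direct = root.findtext("age")
# Spelling out the intermediate element reaches the nested <age>.
nested = root.findtext("boardgame/age")
# ".//" searches descendants at any depth.
anywhere = root.findtext(".//age")
# Attributes are read with get() on the element.
object_id = root.find("boardgame").get("objectid")
```

Here `direct` is None while `nested` and `anywhere` are both the string "10", which is why the ".//age" form is the more forgiving choice when the nesting is not known in advance.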
