python - Parsing XML with ElementTree

Question

I wrote a small function that uses ElementTree to parse an XML file, but it throws the following error: "xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0". Please find the code below.

tree = ElementTree.parse(urllib2.urlopen('http://api.ean.com/ean-services/rs/hotel/v3/list?type=xml&apiKey=czztdaxrhfbusyp685ut6g6v&cid=8123&locale=en_US&city=Dallas%20&stateProvinceCode=TX&countryCode=US&minorRev=12'))

rootElem = tree.getroot()

hotel_list = rootElem.findall("HotelList")

score 6 · Accepted Answer

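There are several issues with the site you are using:

The site does not honor the type=xml you send as a GET argument. Instead, you need to send an Accept header telling the site that you accept XML; otherwise it returns JSON data
The site does not accept the content type text/xml, so you need to send application/xml
Your parse call is correct. Other answers wrongly claim that it should be given the data; in fact parse takes a filename or a file-like object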
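To make the first and third points concrete, here is a minimal offline sketch. It assumes two made-up payloads (SAMPLE_XML and SAMPLE_JSON are illustrative and are not the real API response) and uses io.BytesIO as a stand-in for the file-like object returned by urlopen. The helper name parse_file_like is hypothetical.

```python
import io
from xml.etree import ElementTree

# Made-up payloads for illustration only; the real API response is not reproduced here.
SAMPLE_XML = (
    b"<HotelListResponse>"
    b"<HotelList size='1'><HotelSummary><name>Example Hotel</name></HotelSummary></HotelList>"
    b"</HotelListResponse>"
)
SAMPLE_JSON = b'{"HotelListResponse": {}}'


def parse_file_like(data):
    """Parse bytes through a file-like object, mirroring parse(urlopen(...))."""
    # parse() accepts a filename or any object with a read() method.
    return ElementTree.parse(io.BytesIO(data)).getroot()


# XML wrapped in a file-like object parses without trouble.
root = parse_file_like(SAMPLE_XML)
hotel_list = root.findall("HotelList")

# Feeding JSON to the XML parser fails on the very first character,
# which is the situation described in the question.
try:
    parse_file_like(SAMPLE_JSON)
    json_error = None
except ElementTree.ParseError as exc:
    json_error = exc
```

If the sketch holds, the parse call itself is fine and the error comes from the body being JSON rather than XML.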

So here is the working code:

import urllib2
from xml.etree import ElementTree

url = 'http://api.ean.com/ean-services/rs/hotel/v3/list?type=xml&apiKey=czztdaxrhfbusyp685ut6g6v&cid=8123&locale=en_US&city=Dallas%20&stateProvinceCode=TX&countryCode=US&minorRev=12'
request = urllib2.Request(url, headers={"Accept" : "application/xml"})
u = urllib2.urlopen(request)
tree = ElementTree.parse(u)
rootElem = tree.getroot()
hotel_list = rootElem.findall("HotelList")  
print hotel_list

Output:

[<Element 'HotelList' at 0x248cd90>]

Note that I am creating a Request object and passing the Accept header.
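As a small check that needs no network access, the header can be inspected on the Request object before it is sent. This is only a sketch: the URL below is a placeholder rather than the real endpoint, and the import fallback is an assumption added so the snippet runs on both Python 2 (urllib2, as above) and Python 3 (urllib.request).

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2, as used in the answer

# Placeholder URL for illustration; nothing is actually requested here.
url = "http://example.com/hotels?type=xml"

# Build the request with an explicit Accept header.
request = Request(url, headers={"Accept": "application/xml"})

# get_header() reads back what will be sent with the request.
accept_value = request.get_header("Accept")
```

Passing this request object to urlopen would then send the Accept header along with the GET request.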

By the way, if the site returns JSON, why do you need to parse XML at all? Parsing JSON would be simpler, and you would get a ready-made Python object.
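A rough sketch of the JSON route, assuming a made-up payload: the key names below (HotelListResponse, HotelList, HotelSummary, name) are illustrative guesses and may not match what the site actually returns.

```python
import json

# Made-up JSON payload for illustration; the real response structure may differ.
SAMPLE_JSON = (
    '{"HotelListResponse": {"HotelList": {"@size": "1", '
    '"HotelSummary": [{"name": "Example Hotel"}]}}}'
)

# json.loads() turns the text straight into dicts and lists.
data = json.loads(SAMPLE_JSON)

# Plain dictionary and list access replaces findall() on an element tree.
hotels = data["HotelListResponse"]["HotelList"]["HotelSummary"]
names = [hotel["name"] for hotel in hotels]
```

With a live response, the same json.loads call would be applied to the body read from urlopen.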
