xml-parsing - Extracting elements from XML using Python 3?

Question

I'm trying to write a Python 3 script that queries a web API and receives an XML response. The response looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<ipinfo>
   <ip_address>4.2.2.2</ip_address>
   <ip_type>Mapped</ip_type>
   <anonymizer_status/>
   <Network>
      <organization>level 3 communications  inc.</organization>
      <OrganizationData>
         <home>false</home>
         <organization_type>Telecommunications</organization_type>
         <naics_code>518219</naics_code>
         <isic_code>J6311</isic_code>
      </OrganizationData>
      <carrier>level 3 communications</carrier>
      <asn>3356</asn>
      <connection_type>tx</connection_type>
      <line_speed>high</line_speed>
      <ip_routing_type>fixed</ip_routing_type>
      <Domain>
         <tld>net</tld>
         <sld>bbnplanet</sld>
      </Domain>
   </Network>
   <Location>
      <continent>north america</continent>
      <CountryData>
         <country>united states</country>
         <country_code>us</country_code>
         <country_cf>99</country_cf>
      </CountryData>
      <region>southwest</region>
      <StateData>
         <state>california</state>
         <state_code>ca</state_code>
         <state_cf>88</state_cf>
      </StateData>
      <dma>803</dma>
      <msa>31100</msa>
      <CityData>
         <city>san juan capistrano</city>
         <postal_code>92675</postal_code>
         <time_zone>-8</time_zone>
         <area_code>949</area_code>
         <city_cf>77</city_cf>
      </CityData>
      <latitude>33.499</latitude>
      <longitude>-117.662</longitude>
   </Location>
</ipinfo>

Here is my code so far:

import urllib.request
import urllib.error 
import sys
import xml.etree.ElementTree as etree

…

try:
    xml = urllib.request.urlopen(targetURL, data=None)
except urllib.error.HTTPError as e:
    print("HTTP error: " + str(e) + " URL: " + targetURL)
    sys.exit()

tree = etree.parse(xml)
root = tree.getroot()

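The API query works, and in the debugger I can see all of the information in the "root" variable. My problem is that I can't work out how to extract something like the ASN (<asn></asn>) from the returned XML. I've spent a day trying all sorts of finds, findalls and every other kind of method, but I can't crack it. I think I've reached the point where I can't see the wood for the trees, and none of the examples I've found on the internet seem to help. Could someone show me a code snippet that extracts the content of an XML element from the tree structure?

Many thanks

Tim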

score -1 · Accepted Answer

I would recommend using Beautiful Soup.

It's very powerful when it comes to extracting data from XML.

Example:

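from bs4 import BeautifulSoup

# Parse the response body, not the URL string; the "xml" parser needs lxml installed.
soup = BeautifulSoup(xml.read(), "xml")  # xml is the urlopen() response from the question

soup.find_all("asn")  # returns a list of all <asn></asn> tags found
soup.find("asn").get_text()  # text of the first <asn> tag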
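
The same lookup may also be possible with the standard-library ElementTree module the question already imports, without installing anything extra. The following is a minimal sketch, not a definitive solution: SAMPLE is a trimmed copy of the response shown in the question, and the variable names are only illustrative. One likely cause of the trouble is that find() and findtext() take paths relative to the element they are called on, so a bare "asn" only matches direct children of <ipinfo>.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response from the question (an assumption for
# illustration; in the real script the root comes from etree.parse(xml)).
SAMPLE = """<ipinfo>
   <ip_address>4.2.2.2</ip_address>
   <anonymizer_status/>
   <Network>
      <asn>3356</asn>
      <Domain>
         <tld>net</tld>
         <sld>bbnplanet</sld>
      </Domain>
   </Network>
</ipinfo>"""

root = ET.fromstring(SAMPLE)

# <asn> is a grandchild of the root, so a bare tag name finds nothing.
direct_child = root.find("asn")

# Give the path relative to the root element (do not include "ipinfo").
asn = root.findtext("Network/asn")

# Or search at any depth with the ".//" prefix.
asn_anywhere = root.findtext(".//asn")

# Deeper elements work the same way.
tld = root.findtext("Network/Domain/tld")

# A default can be supplied for elements that may be absent.
missing = root.findtext("Network/no_such_tag", default="unknown")

print(asn, asn_anywhere, tld, missing)
```

In the question's own script, the root returned by tree.getroot() should accept the same calls, for example root.findtext("Network/asn").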
