python - Python feedparser

Question

How would you use Python feedparser to parse XML data like the following?

<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>

score 4 · Accepted Answer

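This doesn't look like any kind of RSS/Atom feed. I wouldn't use feedparser for this at all; I'd use lxml. In fact, feedparser can't make sense of it and drops the "Jason" contributor in your example.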

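from lxml import etree

# data: a file name or file-like object holding the XML (fetch it however you like)
data = "books.xml"  # placeholder path, not from the question
tree = etree.parse(data)
root = tree.getroot()  # parse() returns an ElementTree; getroot() gives <Book_API>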

Now you have a tree of XML objects. How to do this more specifically in lxml is impossible to say until you actually provide valid XML data. ;)
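For what it's worth, the sample in the question does parse as well-formed XML, so here is a minimal sketch of the step after parsing. It uses the standard library's xml.etree.ElementTree as a stand-in so it runs without extra installs (lxml.etree offers the same parse, getroot and iter calls). The io.StringIO wrapper is only an assumption standing in for whatever file or response object you really have.

```python
import io
import xml.etree.ElementTree as etree  # assumption: stdlib stand-in for lxml.etree

# Stand-in for the real data source: any file name or file-like object works.
data = io.StringIO(
    "<Book_API>"
    "<Contributor_List><Display_Name>Jason</Display_Name></Contributor_List>"
    "<Contributor_List><Display_Name>John Smith</Display_Name></Contributor_List>"
    "</Book_API>"
)

tree = etree.parse(data)  # parse() returns an ElementTree wrapper
root = tree.getroot()     # the <Book_API> element itself

# iter() walks the whole tree in document order, filtered by tag name.
names = [el.text for el in root.iter("Display_Name")]
print(names)
```

Both contributors come back here, including the "Jason" entry mentioned above.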

score 2 · Accepted Answer

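As Lennart Regebro mentioned, this doesn't seem to be an RSS/Atom feed, just an XML document. The Python standard library has several XML parsing facilities (SAX and DOM). I recommend ElementTree. lxml is also the best of the third-party libraries (it is a drop-in replacement for ElementTree).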

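try:
    # Prefer lxml if it is installed
    from lxml import etree
except ImportError:
    try:
        # C-accelerated ElementTree (only exists on Python versions before 3.9)
        import xml.etree.cElementTree as etree
    except ImportError:
        # Plain standard-library fallback
        import xml.etree.ElementTree as etree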

doc = """<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>"""
xml_doc = etree.fromstring(doc)
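From there, pulling out the contributor names might look something like the sketch below. It assumes the structure shown in the question (one Display_Name per Contributor_List) and uses only the standard-library ElementTree, though findall and findtext behave the same way in lxml. The `contributors` variable name is just for illustration.

```python
import xml.etree.ElementTree as etree

doc = """<Book_API>
<Contributor_List>
<Display_Name>Jason</Display_Name>
</Contributor_List>
<Contributor_List>
<Display_Name>John Smith</Display_Name>
</Contributor_List>
</Book_API>"""
xml_doc = etree.fromstring(doc)  # fromstring() returns the root element directly

# One entry per <Contributor_List>; findtext() returns None if the child is missing.
contributors = [
    entry.findtext("Display_Name")
    for entry in xml_doc.findall("Contributor_List")
]
print(contributors)
```

Looping over the Contributor_List elements, rather than grabbing every Display_Name directly, keeps each name tied to its parent record in case more fields show up later.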
