
Documentation is always poorly written; examples are more helpful.

Here is my XML file:

<wordbook>
  <item>
    <name>engrossment</name>
    <phonetic><![CDATA[ɪn'grəʊsmənt]]></phonetic>
    <meaning><![CDATA[n. 正式缮写的文件,专注]]></meaning>
  </item>
  <item>
    <name>graffiti</name>
    <phonetic><![CDATA[ɡrəˈfi:ti:]]></phonetic>
    <meaning><![CDATA[n.在墙上的乱涂乱写(复数形式)]]></meaning>
  </item>
  <item>
    <name>pathology</name>
    <phonetic><![CDATA[pæˈθɔlədʒi:]]></phonetic>
    <meaning><![CDATA[n. 病理(学);〈比喻〉异常状态]]></meaning>
  </item>
</wordbook>

Here is my Python class:

class Item(Base):
    name = Column(String(50), primary_key=True)
    phonetic = Column(String(50), default='')
    meaning = Column(UnicodeText, nullable=False)

Pick whichever XML parser you like.


In the end, I used xmltodict to parse and lxml to write:

from lxml import etree

# Build <wordbook> with one <item> child per Item instance.
wordbook = etree.Element('wordbook')
for one in items:
    item = etree.SubElement(wordbook, 'item')
    name = etree.SubElement(item, 'name')
    name.text = one.name
    # Wrap the free-text fields in CDATA, matching the source file.
    phonetic = etree.SubElement(item, 'phonetic')
    phonetic.text = etree.CDATA(one.phonetic)
    meaning = etree.SubElement(item, 'meaning')
    meaning.text = etree.CDATA(one.meaning)
# tostring() returns bytes when an encoding is given, so decode before printing.
s = etree.tostring(wordbook, pretty_print=True, encoding='utf8')
print(s.decode('utf8'))

1 Answer


I would use xmltodict:

# -*- coding: utf-8 -*-
import xmltodict

data = """<wordbook>
  <item>
    <name>engrossment</name>
    <phonetic><![CDATA[ɪn'grəʊsmənt]]></phonetic>
    <meaning><![CDATA[n. 正式缮写的文件,专注]]></meaning>
  </item>
  <item>
    <name>graffiti</name>
    <phonetic><![CDATA[ɡrəˈfi:ti:]]></phonetic>
    <meaning><![CDATA[n.在墙上的乱涂乱写(复数形式)]]></meaning>
  </item>
  <item>
    <name>pathology</name>
    <phonetic><![CDATA[pæˈθɔlədʒi:]]></phonetic>
    <meaning><![CDATA[n. 病理(学);〈比喻〉异常状态]]></meaning>
  </item>
</wordbook>"""

data = xmltodict.parse(data, encoding='utf-8')

for item in data['wordbook']['item']:
    print(item['name'])

Prints:

engrossment
graffiti
pathology

You could also use BeautifulSoup or lxml; it is a matter of taste. The idea is much the same: iterate over the item tags and instantiate Item inside the loop.
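A minimal sketch of that loop, under a few assumptions: it uses the standard library's xml.etree.ElementTree instead of BeautifulSoup or lxml (lxml.etree exposes a very similar API), and the Item here is a plain dataclass standing in for the SQLAlchemy model from the question. The load_items helper is a hypothetical name, not part of any library.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

XML = """<wordbook>
  <item>
    <name>engrossment</name>
    <phonetic><![CDATA[ɪn'grəʊsmənt]]></phonetic>
    <meaning><![CDATA[n. 正式缮写的文件,专注]]></meaning>
  </item>
  <item>
    <name>graffiti</name>
    <phonetic><![CDATA[ɡrəˈfi:ti:]]></phonetic>
    <meaning><![CDATA[n.在墙上的乱涂乱写(复数形式)]]></meaning>
  </item>
</wordbook>"""


@dataclass
class Item:
    # Stand-in for the SQLAlchemy model (assumption); same three fields.
    name: str
    phonetic: str = ''
    meaning: str = ''


def load_items(xml_text):
    """Parse the wordbook XML and return one Item per <item> tag."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter('item'):
        # findtext() returns the element text; CDATA is unwrapped by the parser.
        items.append(Item(
            name=node.findtext('name'),
            phonetic=node.findtext('phonetic', default=''),
            meaning=node.findtext('meaning', default=''),
        ))
    return items


items = load_items(XML)
for item in items:
    print(item.name)
```

With the real model, the same keyword arguments should work for Item(...), and each instance could then be added to a session.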

Hope that helps.

Answered 2013-08-23T09:53:30.807