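I'm using dexml, a Python XML serialization library, and I can't figure out how to put attributes on some of the tags in the XML I generate from my objects. I've read through the documentation and, unless I'm misunderstanding it, I can't find a good explanation.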

Here is the code involved.

import dexml
import urllib2
from dexml import fields
from bs4 import BeautifulSoup

class Section(dexml.Model):
    section = fields.String()
    entries = fields.List(fields.String(tagname="Entry"))
    # Add something for href here, maybe?

class AtoZ(dexml.Model):
    list = fields.List(Section)

def makeSoup(url):
    return BeautifulSoup(urllib2.urlopen(url).read())

def main():
    soup = makeSoup("http://www.somewebsite.com")

    sectionList = []

    # You might wonder about the length of this; I *could* split it up
    # into variables to make it shorter. Also, the chaining is because
    # the 'li' I want are only inside of a <ul class="Nav_fm">.
    for li in soup.find('ul', {'class':"Nav_fm"}).find_all('li', {'class':"MenuLevel_0"}):
        atzSection = Section()
        atzSection.section = li.a.string

        for innerLi in li.find_all('li', {'class':"MenuLevel_1"}):
            atzSection.entries.append(innerLi.a.string)
            # Somehow store innerLi.a['href'] in atzSection

        sectionList.append(atzSection)

    atzList = AtoZ(list=sectionList)

    f = open("C:\\atoz.xml", "w")
    f.write(atzList.render(pretty=True))
    f.close()

if __name__ == '__main__':
    main()

Here is the generated XML.

<?xml version="1.0" ?>
<AtoZ>
    <Section section="#">
        <Entry>...</Entry>
        <Entry>...</Entry>
        <Entry>...</Entry>
        <Entry>...</Entry>
    </Section>
    ...
    <Section section="Z">
        <Entry>...</Entry>
        <Entry>...</Entry>
        <Entry>...</Entry>
        <Entry>...</Entry>
    </Section>
</AtoZ>

I want each <Entry> to be rendered as <Entry href="...">...</Entry>.

1 Answer

Try redefining Section.entries as a list of Entry models, like this:

class Entry(dexml.Model):
    href = fields.String() 
    ...

class Section(dexml.Model):
    section = fields.String()
    entries = fields.List(fields.Model(Entry), tagname='Entry')

Take a look at the dexml test code. Beyond what the documentation describes, it contains a lot of good examples of how to use the library.
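
To make the target output concrete, here is a minimal sketch that builds the same <Entry href="...">...</Entry> shape with the standard library's xml.etree.ElementTree instead of dexml. It is not dexml code and says nothing about how dexml declares the fields. The function name build_atoz and the sample data are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def build_atoz(sections):
    """Build an <AtoZ> tree from a list of (section_name, entries) pairs.

    Each entry is a (text, href) tuple. This mirrors the XML layout from
    the question, but uses ElementTree as a stand-in for dexml.
    """
    root = ET.Element("AtoZ")
    for name, entries in sections:
        # The section name is stored as an attribute, as in the question's output.
        section_el = ET.SubElement(root, "Section", {"section": name})
        for text, href in entries:
            # href becomes an attribute; the link text becomes the element text.
            entry_el = ET.SubElement(section_el, "Entry", {"href": href})
            entry_el.text = text
    return root


# Hypothetical sample data, standing in for the scraped links.
sample = [("A", [("Apples", "/a/apples"), ("Avocados", "/a/avocados")])]
xml_text = ET.tostring(build_atoz(sample), encoding="unicode")
print(xml_text)
```

In the scraping loop from the question, each (text, href) pair would correspond to innerLi.a.string and innerLi.a['href'].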

Answered 2013-11-09T04:42:52.543