
I have an XML file that I am parsing through BeautifulSoup. A small portion of my file is:

<document>
    <ad>
        <date>21-Apr-2013</date>
    </ad>
    <ad>
        <date></date>
    </ad>
</document>

What is the fastest way to count the number of date elements that are not empty? Will this be faster if I convert date to an attribute of ad?


1 Answer


This counts the non-empty <date> tags:

sum(1 for s in soup.find_all('date') if s.text)

However, if you are really after speed, consider using a different parser, such as SAX.
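A minimal sketch of the SAX approach, using only the standard-library xml.sax module. The DateCounter class and the count_nonempty_dates helper are hypothetical names made up for this illustration, and no timing was measured here, so treat it as a starting point to benchmark rather than a guaranteed speedup. Like the one-liner above, it counts a <date> element as non-empty if it contains any text at all, including whitespace.

```python
import xml.sax


class DateCounter(xml.sax.ContentHandler):
    """Counts <date> elements that contain at least one character of text."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self._in_date = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "date":
            self._in_date = True
            self._chunks = []

    def characters(self, content):
        # SAX may deliver one text node in several pieces, so collect them.
        if self._in_date:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "date":
            if "".join(self._chunks):
                self.count += 1
            self._in_date = False


def count_nonempty_dates(xml_text):
    """Hypothetical helper: parse an XML string and return the count."""
    handler = DateCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.count


sample = """<document>
    <ad>
        <date>21-Apr-2013</date>
    </ad>
    <ad>
        <date></date>
    </ad>
</document>"""

print(count_nonempty_dates(sample))
```

For a large file on disk, xml.sax.parse(path, handler) streams the document instead of loading it all into memory, which is where an event-based parser tends to help most.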

To match on an attribute instead, use find_all('ad', attrs={'date': ''}), which targets ad elements with an empty date attribute.

Answered 2013-04-30T20:19:37.687