I used the script below to extract a list of URLs:

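import urllib2
# Assumption: BeautifulSoup 4. The original post does not show its imports;
# with BeautifulSoup 3 this would be "from BeautifulSoup import BeautifulSoup".
from bs4 import BeautifulSoup

request = urllib2.Request("http://www.dummyurl.com")
pub_lv1 = urllib2.urlopen(request)
pub_lv1_parse = BeautifulSoup(pub_lv1)
pub_lv1_parse = pub_lv1_parse.body.find('table', attrs={"class":"proxy-archive-content-year-list"})
pub_lv1_parse = pub_lv1_parse.findAll('a')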

The output is:

[<a href="/content/by/year/2011">2011</a>,
 <a href="/content/by/year/2012">2012</a>,
 <a href="/content/by/year/2013">2013</a>,
 <a href="/content/by/year/2000">2000</a>,
 <a href="/content/by/year/2001">2001</a>,
 <a href="/content/by/year/2002">2002</a>,
 <a href="/content/by/year/2003">2003</a>,
 <a href="/content/by/year/2004">2004</a>,
 <a href="/content/by/year/2005">2005</a>]

As you can see, the years are not sorted, and I would like to sort them. I know how to sort a list of strings with sort, but how do I sort BeautifulSoup output?

1 Answer

Sort by element text:

sorted(pub_lv1_parse, key=lambda elem: elem.text)
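
Note that sorting on elem.text compares strings, which gives the right order for four-digit years but would not if the labels had different lengths. A minimal sketch of a numeric variant follows, assuming Python 3 with the bs4 package installed and using a shortened copy of the question's output as sample markup (the names html, links, and links_sorted are illustrative, not from the original post):

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Sample markup adapted from the output shown in the question (shortened).
html = """
<table class="proxy-archive-content-year-list">
  <tr><td>
    <a href="/content/by/year/2011">2011</a>
    <a href="/content/by/year/2013">2013</a>
    <a href="/content/by/year/2000">2000</a>
    <a href="/content/by/year/2005">2005</a>
  </td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", attrs={"class": "proxy-archive-content-year-list"})
links = table.find_all("a")

# Sort by the integer value of each link's text rather than by the raw string.
links_sorted = sorted(links, key=lambda elem: int(elem.text))

years = [elem.text for elem in links_sorted]
hrefs = [elem["href"] for elem in links_sorted]
print(years)
```

sorted() returns a new list and leaves the original result untouched; if the links should be reordered in place instead, the same key can be passed to the list's sort method.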
answered 2013-06-20T06:30:06.053