I can't get this script to process all the URLs in the list in one run.
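Put your code in a function that takes a single parameter, *args (or any name you like, just don't forget the *). In the function definition, the * collects every positional argument into a tuple; at the call site, a * in front of your list unpacks it so that each URL is passed as a separate argument. The * has no official name, but some people (myself included) like to call it the splat operator.

    import urllib2

    def start_download(*args):
        for value in args:
            ## for debugging purposes
            ## print value
            response = urllib2.urlopen(value).read()
            ## put the rest of your code here

    if __name__ == '__main__':
        links = ['http://guardsmanbob.com/media/playlist.php?char=' + chr(i)
                 for i in range(97, 123)]
        ## the * unpacks the list so each URL is passed as its own argument
        start_download(*links)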
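To see what the * does on each side of the call, here is a small sketch that needs no network access. The `collect` function is a hypothetical helper I made up for the demonstration; it only returns whatever ends up in `args`.

```python
def collect(*args):
    # Inside the function, args is always a tuple of the positional arguments.
    return args

links = ['http://guardsmanbob.com/media/playlist.php?char=' + chr(i)
         for i in range(97, 123)]

packed = collect(links)     # one argument: the whole list
unpacked = collect(*links)  # 26 arguments: one per URL

print(len(packed))    # 1
print(len(unpacked))  # 26
```

Without the * at the call site, the loop inside the function would see a single value (the list itself) instead of one URL at a time.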
Edit:
Or you can simply loop over your list of links and download each one directly.

    import urllib2

    links = ['http://guardsmanbob.com/media/playlist.php?char=' + chr(i)
             for i in range(97, 123)]

    for link in links:
        response = urllib2.urlopen(link).read()
        ## put the rest of your code here
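If one of the 26 requests fails, a plain loop stops at that URL. Below is a minimal sketch of a more forgiving loop, assuming you pass the download step in as a function. `build_links`, `download_all` and the `fetch` parameter are names I made up for illustration; they are not part of urllib2.

```python
import string

BASE_URL = 'http://guardsmanbob.com/media/playlist.php?char='

def build_links(base=BASE_URL, letters=string.ascii_lowercase):
    # Same 26 URLs as chr(i) for i in range(97, 123), one per lowercase letter.
    return [base + letter for letter in letters]

def download_all(links, fetch):
    # fetch is any callable that takes a URL and returns the page content.
    pages, failed = {}, []
    for link in links:
        try:
            pages[link] = fetch(link)
        except IOError as error:
            # urllib2.URLError is a subclass of IOError, so failed
            # requests land here and the loop moves on to the next link.
            failed.append((link, error))
    return pages, failed
```

With urllib2 you would call it as `download_all(build_links(), lambda url: urllib2.urlopen(url).read())` and then inspect `failed` to see which letters need a retry.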
Edit 2:
To fetch all the links and then save them to a file, here is the complete code with comments:

    import urllib2
    from bs4 import BeautifulSoup, SoupStrainer

    links = ['http://guardsmanbob.com/media/playlist.php?char=' + chr(i)
             for i in range(97, 123)]

    for link in links:
        response = urllib2.urlopen(link).read()
        ## gets all <a> tags
        soup = BeautifulSoup(response, parse_only=SoupStrainer('a'))
        ## unnecessary link texts to be removed
        not_included = ['News', 'FAQ', 'Stream', 'Chat', 'Media',
                        'League of Legends', 'Forum', 'Latest', 'Wallpapers',
                        'Links', 'Playlist', 'Sessions', 'BobRadio', 'All',
                        'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
                        'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
                        'U', 'V', 'W', 'X', 'Y', 'Z', 'Misc', 'Play',
                        'Learn more about me', 'Chat info', 'Boblights',
                        'Music Playlist', 'Official Facebook',
                        'Latest Music Played', 'Muppets - Closing Theme',
                        'Billy Joel - The River Of Dreams',
                        'Manic Street Preachers - If You Tolerate This '
                        'Your Children Will Be Next',
                        'The Bravery - An Honest Mistake',
                        'The Black Keys - Strange Times',
                        'View whole playlist', 'View latest sessions',
                        'Referral Link', 'Donate to BoB',
                        'Guardsman Bob', 'Website template',
                        'Arcsin']
        ## create a file named "test.txt"
        ## write to file and close afterwards
        with open("test.txt", 'w') as output:
            for hyperlink in soup:
                ## skip empty link texts and the unwanted ones listed above
                if hyperlink.text and hyperlink.text not in not_included:
                    ## print hyperlink.text
                    output.write("%s\n" % hyperlink.text.encode('utf-8'))
The output is saved in test.txt.
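I suggest changing test.txt to a different file name (for example, "S song titles") on each pass through the list of links, because otherwise each iteration overwrites the file written by the previous one.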
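One way to get a separate file per letter is to derive the name from the URL. This is only a sketch under my own assumptions: `filename_for`, `save_titles` and the `_songs.txt` suffix are hypothetical, and the titles in the usage part are placeholders rather than real output from the site.

```python
import os
import tempfile

def filename_for(link, suffix='_songs.txt'):
    # Hypothetical naming scheme: use the letter after "char=" in the URL,
    # so ...?char=s becomes "S_songs.txt".
    letter = link.rsplit('char=', 1)[-1]
    return letter.upper() + suffix

def save_titles(directory, link, titles):
    # Write one title per line to a file that is unique to this link.
    path = os.path.join(directory, filename_for(link))
    with open(path, 'w') as output:
        for title in titles:
            output.write('%s\n' % title)
    return path

if __name__ == '__main__':
    # Placeholder data, written to a temporary directory for the demo.
    target = tempfile.mkdtemp()
    link = 'http://guardsmanbob.com/media/playlist.php?char=s'
    print(save_titles(target, link, ['placeholder title 1',
                                     'placeholder title 2']))
```

In the loop from the answer, you would replace `open("test.txt", 'w')` with a path built this way, so the 26 pages end up in 26 files instead of one file that keeps getting overwritten.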