I'm using Python 2.7 and Beautiful Soup 3.2, and I have written the following scraper to get stream URLs:
# Import the classes that are needed
import urllib2
from BeautifulSoup import BeautifulSoup

# URL to scrape, opened with urllib2
url = 'http://www.wiziwig.tv/broadcast.php?matchid=219751&part=sports'
source = urllib2.urlopen(url)

# Turn the saved source into a BeautifulSoup object
soup = BeautifulSoup(source)

# Each row with class "broadcast" names a station
for tr in soup.findAll('tr', {'class': ['broadcast']}):
    stationName = tr.findAll('td')[1].text
    # Walk the rows that follow until the next station row
    for trBelow in tr.findAllNext('tr'):
        curClass = trBelow['class']
        if curClass == 'broadcast':
            break
        kindStream = trBelow.findAll('td')[0].text
        streamUrl = trBelow.find('a', {'class': 'broadcast go'})['href']
        streamQuality = trBelow.findAll('td')[2].text
        streamRating = trBelow.find('div', {'class': 'rating'})['rel']
        print stationName, kindStream, streamQuality, streamRating, streamUrl
This works perfectly and gives the following output:
BWIN Flash 650 Kbps 100 http://forum.wiziwig.eu/threads/1847-BWIN-Info
BWIN Flash 675 Kbps 100 https://sports.bwin.com/en/sports?wm=3448325&zoneId=1068792
Bet365 Flash 650 Kbps 100 http://forum.wiziwig.eu/threads/6258-Bet365
Bet365 Flash 675 Kbps 100 http://www.bet365.com/?affiliate=365_014110
TRK Ukraine+ AceStream 1250 Kbps 100 acestream://94879770520f2e9db2146d0eca59204bfbd72cbe
TRK Ukraine+ AceStream 1251 Kbps 75 http://aviatortv.org/football_ua_plus/
Arenavision1 Sopcast 2000 Kbps 75 sop://broker.sopcast.com:3912/143876
Arenavision3 AceStream 2000 Kbps 75 acestream://a53a380706846bfc6667e21a1485dedb78b9674b
Arenavision3 AceStream 2001 Kbps 75 http://avod.me/play/a53a380706846bfc6667e21a1485dedb78b9674b
Dazsports Ace2 AceStream 850 Kbps 100 acestream://d293c82146aa6c2904e45ff305ae0f38dc5b329d
Dazsports Ace2 AceStream 851 Kbps 75 http://dazsports.org/ace2.html
Digi Sport1 [RO] Sopcast 1500 Kbps 100 sop://broker.sopcast.com:3912/146141
Digi Sport1 [RO] Sopcast 1500 Kbps 100 sop://broker.sopcast.com:3912/124992
Digi Sport1 [RO] Sopcast 1501 Kbps 100 sop://broker.sopcast.com:3912/139777
Digi Sport1 [RO] Sopcast 1501 Kbps 100 sop://broker.sopcast.com:3912/110152
Pole Position1 [NL] AceStream 1000 Kbps 100 acestream://86fd521d30e9319198b75121761eccf260fef0cb
Pole Position1 [NL] AceStream 1001 Kbps 75 http://polepositionweb.org/?page_id=6 popup
Solodeportes Veetle Veetle 850 Kbps 100 http://veetle.com/index.php/widget/index/E47CFF6CB6A770852515B8B30C2E30F6/0/true/default/false
Livesports4u4 Flash 225 Kbps 75 http://livesport4u.com/stream4.html
Cricfree Flash2 Flash 175 Kbps 75 http://cricfree.tv/live-golf-streaming-ch2.php
Njtvx9 Flash 175 Kbps 75 http://nutjob.eu/njtvx9.html
Igoal C+ Liga Flash 175 Kbps 75 http://ana1.me/liga+.html
Soccertoall2 [PT] Flash 175 Kbps 75 http://soccertoall.net/index.php?channel=2
Tugalive1 Flash 175 Kbps 75 http://www.tugalive.eu/p/live-1.html
Diresport1 Flash 175 Kbps 75 http://diresportt.blogspot.com.es/
Footstream11 Flash 175 Kbps 75 http://www.footstream.tv/channel11.html
Lag10 (8) Flash 150 Kbps 50 http://lag10.com/channel8
ANA STV2 Flash 400 Kbps 75 http://ana1.me/STV2.html
ANA STV2 Flash 400 Kbps 75 http://bliner.tv/sporttv2pt.html
Livesoccerhd4 Flash 225 Kbps 75 http://livesoccerhd.tv/l4.html
Stvstreams Ace HD1 AceStream 1500 Kbps 100 acestream://750acfc788e12220dbd57188505eae08f566281e
Stvstreams Ace HD1 AceStream 1500 Kbps 100 http://stvstreams.com/acestreams/stv-hd/
Btsportshd12 Flash 200 Kbps 75 http://www.btsportshd.com/stream12.php
Ana Stream1 Flash 175 Kbps 75 http://ana3.me/STREAM1.html
Onlinesoccer2all (13) Flash 175 Kbps 75 http://online--soccer.eu/channel13.html
Hdfoots6 Flash 175 Kbps 75 http://hdfoots.com/stream6.html
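But I'm wondering whether this is the way I should be doing it, or whether there is a better approach that avoids the inner loop for trBelow in tr.findAllNext('tr'): and then breaking out of it when it reaches a row with a specific class?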
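
One alternative that might be worth considering, as a rough sketch only: walk all the rows once and start a new group every time a "broadcast" row shows up, instead of calling findAllNext for each station and breaking out. The helper name group_rows and the is_header callback are made up for illustration, and the demo uses plain dicts (with a hypothetical 'streamrow' class) as stand-ins for the real tr tags so that it runs without the page.

```python
def group_rows(rows, is_header):
    """Yield (header, details) pairs from one flat pass over rows.

    A row for which is_header(row) is true starts a new group; every
    following row belongs to that group until the next header row.
    Rows that appear before the first header are ignored.
    """
    header, details = None, []
    for row in rows:
        if is_header(row):
            # Emit the group collected so far before starting a new one
            if header is not None:
                yield header, details
            header, details = row, []
        elif header is not None:
            details.append(row)
    # Emit the last group, which has no header row after it
    if header is not None:
        yield header, details


# Stand-in data: plain dicts instead of real <tr> tags (assumption for the demo);
# 'streamrow' is a hypothetical class name for the non-station rows.
rows = [
    {'class': 'broadcast', 'name': 'BWIN'},
    {'class': 'streamrow', 'kind': 'Flash'},
    {'class': 'streamrow', 'kind': 'Flash'},
    {'class': 'broadcast', 'name': 'Arenavision1'},
    {'class': 'streamrow', 'kind': 'Sopcast'},
]

groups = [(head['name'], [d['kind'] for d in details])
          for head, details in group_rows(rows, lambda r: r.get('class') == 'broadcast')]
```

With the scraper above, the call would presumably look like group_rows(soup.findAll('tr'), lambda tr: tr.get('class') == 'broadcast'), with the td/a/div lookups done on each detail row. Using tr.get('class') rather than tr['class'] would also avoid a KeyError on any row that has no class attribute, if the page contains such rows.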