I am new to BeautifulSoup and am trying to use it to grab some test data from NHL.com. I have some code so far, but I am pretty lost.
Here is a snippet of the HTML code I want to extract data from:
<tr>
<td rowspan="1" colspan="1"> … </td>
<td style="text-align: left;" rowspan="1" colspan="1">
<a href="/ice/player.htm?id=8474564">
Steven Stamkos
</a>
</td>
<td style="text-align: center;" rowspan="1" colspan="1">
<a href="javascript:void(0);" rel="TBL" onclick="loadTeamSpotlight(jQuery(this));" style="border-bottom:1px dotted;">
TBL
</a>
</td>
<td style="text-align: center;" rowspan="1" colspan="1">
C
</td>
<td style="center" rowspan="1" colspan="1">
16
</td>
<td style="center" rowspan="1" colspan="1">
14
</td>
<td style="center" rowspan="1" colspan="1">
9
</td>
I would like to extract the data from these fields for every row on the page, which has about 30 table rows. Here is my Python code so far; I'm not really sure where to go from here.
from bs4 import BeautifulSoup
import requests

r = requests.get("http://www.nhl.com/ice/playerstats.htm?fetchKey=20142ALLSASAll&viewName=summary&sort=points&pg=1")
data = r.text
t_data = []
# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(data, "html.parser")
table = soup.find('table', {'class': 'data stats'})
I know it isn't much, but I have no idea how to go about this. Thanks for the help, everyone.
EDIT: I solved the problem, and hopefully this will help anyone in the future. Here is my code:
from bs4 import BeautifulSoup
import requests

r = requests.get("http://www.nhl.com/ice/playerstats.htm?fetchKey=20142ALLSASAll&viewName=summary&sort=points&pg=1")
data = r.text

# One list per column; index i refers to the same player in every list.
player = []
team = []
goals = []
assists = []
points = []
i = 0

# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(data, "html.parser")
table = soup.find('table', {'class': 'data stats'})

for row in table.find_all('tr'):
    cells = row.find_all('td')
    # Only rows with exactly 19 cells are treated as player rows.
    if len(cells) == 19:
        player.append(cells[1].find(text=True))
        team.append(cells[2].find(text=True))
        goals.append(cells[5].find(text=True))
        assists.append(cells[6].find(text=True))
        points.append(cells[7].find(text=True))
        print(player[i], team[i], goals[i], assists[i], points[i])
        i = i + 1
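For anyone adapting this, here is a rough sketch of the same row-filtering idea that stores each player as one dictionary instead of using parallel lists and an index counter. It runs offline against a small inline sample, so it does not depend on the NHL.com page still existing. The names `SAMPLE_HTML`, `COLUMNS`, and `parse_rows` are made up for illustration, and the sample row is a trimmed copy of the snippet in the question with only 7 cells (the real rows have 19), so the points column is left out here. It uses `get_text(strip=True)` rather than `find(text=True)`, which is a swap on my part to drop surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Hypothetical sample modelled on the HTML snippet in the question.
# It has 7 cells per row; the live page had 19.
SAMPLE_HTML = """
<table class="data stats">
  <tr><th>header row, contains no td cells</th></tr>
  <tr>
    <td> … </td>
    <td><a href="/ice/player.htm?id=8474564">
      Steven Stamkos
    </a></td>
    <td><a href="javascript:void(0);" rel="TBL">
      TBL
    </a></td>
    <td> C </td>
    <td> 16 </td>
    <td> 14 </td>
    <td> 9 </td>
  </tr>
</table>
"""

# Column name -> cell index, using the same indices as the solution above.
COLUMNS = {"player": 1, "team": 2, "goals": 5, "assists": 6}


def parse_rows(html, columns, expected_cells):
    """Return one dict per table row that has exactly `expected_cells` cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "data stats"})
    if table is None:
        # The table was not found, e.g. the page layout changed.
        return []

    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) != expected_cells:
            continue  # skip header rows and anything else that is not a data row
        rows.append(
            {name: cells[idx].get_text(strip=True) for name, idx in columns.items()}
        )
    return rows


if __name__ == "__main__":
    for row in parse_rows(SAMPLE_HTML, COLUMNS, expected_cells=7):
        print(row["player"], row["team"], row["goals"], row["assists"])
```

For the real page you would pass `r.text` and `expected_cells=19`, and add `"points": 7` to the column map. Note that every value comes back as a string, so convert with `int(...)` before doing any arithmetic.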