I have the following code that scrapes data from a table on https://www.nba.com/stats/:
import pandas as pd
import requests
headers = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Firefox/55.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate',
    'Referer': 'https://stats.nba.com/',
    'x-nba-stats-origin': 'stats',
    'x-nba-stats-token': 'true',
    'DNT': '1',
}
url = 'https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2020-21&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&TwoWay=0&VsConference=&VsDivision=&Weight='
payload = requests.get(url, headers=headers).json()  # the line which the code never gets past
data = payload['resultSets'][0]['rowSet']        # one list per player row
columns = payload['resultSets'][0]['headers']    # column names for those rows
df = pd.DataFrame.from_records(data, columns=columns)
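This code used to build a DataFrame correctly from the site's data, but it no longer does, and it raises no error either. I run it in JupyterLab, and the cell just runs indefinitely.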
My guess is that the request headers are somehow out of date, but I'm not sure how to go about updating them.
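For reference, a minimal sketch of how the silent hang could be turned into a visible error. `requests` applies no timeout by default, so a server that accepts the connection but never answers will block the call forever. The helper name `fetch_json`, its `session` parameter, and the timeout values are assumptions made for illustration; this only surfaces the stall and does not by itself tell which headers the site currently expects.

```python
import requests


def fetch_json(url, headers, timeout=(5, 30), session=None):
    """Fetch a JSON document, raising instead of hanging if the server stalls.

    timeout is (connect seconds, read seconds); the values are arbitrary choices.
    session is a hypothetical hook: any object with a requests-style get() method.
    """
    http = session if session is not None else requests
    try:
        response = http.get(url, headers=headers, timeout=timeout)
    except requests.exceptions.Timeout as exc:
        # Covers both connect and read timeouts.
        raise RuntimeError(f"No response within {timeout} seconds: {url}") from exc
    response.raise_for_status()  # turn HTTP 4xx/5xx into an exception
    return response.json()
```

Calling `fetch_json(url, headers)` with the `url` and `headers` from the code above would then either return the parsed payload or raise once the timeout expires, which at least separates "the server never responds" from "the response is not what the code expects".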