I'm trying to scrape a table with pd.read_html, but the last 3 columns come back as "nan". This is the code I'm using:

import pandas as pd

url = 'https://www.actionnetwork.com/mlb/public-betting'

todays_games = pd.read_html(url)[0]

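The table has 7 columns in total. The call picks up every header, but not the data in the last 3 columns. I also tried parsing the page with BeautifulSoup and got the same result.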

print(todays_games)

                                                Scheduled      Open  ... Diff Bets
    0                5:05 PM 951MarlinsMIA952NationalsWSH  -118+100  ...  NaN  NaN
    1                   5:10 PM 979BrewersMIL980TigersDET  -227+188  ...  NaN  NaN
    2                    7:07 PM 965RaysTB966Blue JaysTOR  +150-175  ...  NaN  NaN
    3                 8:10 PM 967Red SoxBOS968MarinersSEA  -125+105  ...  NaN  NaN
    4                    10:35 PM 953RedsCIN954PiratesPIT  -154+135  ...  NaN  NaN
    5                   11:05 PM 955CubsCHC956PhilliesPHI  +170-200  ...  NaN  NaN
    6                 11:05 PM 969YankeesNYY970OriolesBAL  -227+188  ...  NaN  NaN
    7                  11:10 PM 957CardinalsSTL958MetsNYM  +135-154  ...  NaN  NaN
    8                  11:20 PM 959RockiesCOL960BravesATL  +170-200  ...  NaN  NaN
    9                   11:40 PM 971IndiansCLE972TwinsMIN  +100-118  ...  NaN  NaN
    10       Thu 9/16, 12:05 AM 973AstrosHOU974RangersTEX  -213+175  ...  NaN  NaN
    11     Thu 9/16, 12:10 AM 975AngelsLAA976White SoxCWS  +160-189  ...  NaN  NaN
    12      Thu 9/16, 12:10 AM 977AthleticsOAK978RoyalsKC  -149+125  ...  NaN  NaN
    13           Thu 9/16, 1:45 AM 961PadresSD962GiantsSF  +103-120  ...  NaN  NaN
    14  Thu 9/16, 2:10 AM 963DiamondbacksARI964DodgersLAD  -185+155  ...  NaN  NaN

I assume the problem has something to do with the page's HTML. Can anyone help me solve this?

1 Answer

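The last three columns are probably filled in by JavaScript after the page loads, which would explain why neither pd.read_html nor BeautifulSoup sees them in the raw HTML. Instead of parsing the page, send an HTTP GET request to https://api.actionnetwork.com/web/v1/scoreboard/mlb?bookIds=15,30,68,75,69,76,71,79,247,123,263&date=20210915 and read the data you are looking for from the JSON response.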

import requests

r = requests.get(
    'https://api.actionnetwork.com/web/v1/scoreboard/mlb?bookIds=15,30,68,75,69,76,71,79,247,123,263&date=20210915')
print(r.json())
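
Once you have the JSON, you will probably want it in tabular form again. The following is only a minimal sketch of one way to do that: `flatten` is a helper written for this illustration, and the `sample` payload, including its `games` key and field names, is a made-up shape, not the real response from the endpoint. Inspect `r.json()` first and adjust the keys to what you actually see.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested dicts and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars and lists are kept as they are.
            flat[full_key] = value
    return flat


# Hypothetical payload shape, for illustration only; the real API may differ.
sample = {"games": [{"id": 1, "teams": {"home": "WSH", "away": "MIA"}}]}

rows = [flatten(game) for game in sample["games"]]
print(rows)
```

With real data, a list of flat dicts like `rows` can be passed straight to `pd.DataFrame(rows)` to get one row per game and one column per flattened key.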
Answered 2021-09-15T15:37:46.297