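I am trying to parse the information in the table at the bottom right of the following link, the one that shows Current schedule submissions:
dnedesign.us.to/tables/
I was able to parse it into: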
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:5:"14:30";s:7:"endTime";s:5:"16:30";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:5:"14:30";s:7:"endTime";s:5:"15:30";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:5:"16:30";s:7:"endTime";s:5:"18:30";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:0:"";s:7:"endTime";s:0:"";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:0:"";s:7:"endTime";s:0:"";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:5:"12:30";s:7:"endTime";s:5:"16:30";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:5:"12:30";s:7:"endTime";s:5:"16:30";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";s:9:"startTime";s:5:"12:30";s:7:"endTime";s:5:"14:30";}
{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:7:"Tuesday";s:9:"startTime";s:5:"14:30";s:7:"endTime";s:5:"16:30";}
Here is the code that does the parsing to get the above:
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2
from bs4 import BeautifulSoup

url = 'http://dnedesign.us.to/tables/'
page = urlopen(url)
soup = BeautifulSoup(page, "html.parser")
for rows in soup.find_all('tr'):
    for td in rows.find_all('td'):
        # Only the cells holding serialized data contain 'a:'
        if 'a:' in td.text:
            # Skip the first four characters of the cell text
            print(td.text[4:])
I am trying to parse it into the following:
Day:Tuesday Starttime:14:30 Endtime:16:30
Day:Sunday Starttime:12:30 Endtime:14:30
Day:Sunday Starttime:12:30 Endtime:16:30
Day:Sunday Starttime:12:30 Endtime:16:30
....
....
and so on for the rest of the table.
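To make the target format concrete, here is a rough sketch of the transformation I have in mind, not a tested solution. It assumes every row looks like the serialized strings shown earlier (alternating s:N:"..."; key and value tokens) and that no value contains the sequence "; itself. The names STRING_TOKEN, parse_row and format_row are made up for illustration. The rows are reversed only because the desired output above lists the last row first.

```python
import re

# Matches one serialized string token such as s:3:"Day";
# and captures the text between the quotes.
# Assumption: values never contain the sequence '";'.
STRING_TOKEN = re.compile(r's:\d+:"(.*?)";')


def parse_row(serialized):
    """Turn one serialized row into a dict of field name -> value."""
    tokens = STRING_TOKEN.findall(serialized)
    # Tokens alternate key, value, key, value, ...
    return dict(zip(tokens[0::2], tokens[1::2]))


def format_row(fields):
    """Render a parsed row as 'Day:... Starttime:... Endtime:...'."""
    return "Day:{} Starttime:{} Endtime:{}".format(
        fields.get("Day", ""),
        fields.get("startTime", ""),
        fields.get("endTime", ""),
    )


# Two rows copied from the output shown earlier, in page order.
rows = [
    '{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:6:"Sunday";'
    's:9:"startTime";s:5:"12:30";s:7:"endTime";s:5:"14:30";}',
    '{s:12:"cfdb7_status";s:6:"unread";s:3:"Day";s:7:"Tuesday";'
    's:9:"startTime";s:5:"14:30";s:7:"endTime";s:5:"16:30";}',
]

# Print newest first, matching the order of the desired output.
for row in reversed(rows):
    print(format_row(parse_row(row)))
# Day:Tuesday Starttime:14:30 Endtime:16:30
# Day:Sunday Starttime:12:30 Endtime:14:30
```

In the scraping loop, parse_row would be applied to each cell's text in place of the print call. Rows with empty times would come out as "Starttime: Endtime:", so those may need to be filtered separately.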
I am using Python 3.6.9 and Httpie 0.9.8 on Linux Mint Cinnamon 19.1.
This is for my graduation project, and any help would be greatly appreciated. Thanks, Neil M.