I'm very new to Python, and I'm stuck on moving from one page to the next. I am able to scrape the details of a single page. Below is the code I'm using:
import re
from bs4 import BeautifulSoup  # assumed import; the original post did not show it

# `driver` is assumed to be a Selenium WebDriver instance created elsewhere.

def getURLinfo(url):
    url = "https://apps1.coned.com/cemyaccount/MemberPages/MyAccounts.aspx?lang=eng"
    driver.get(url)
    html = driver.page_source
    nextpage = "ctl00$Main$DataPager1$ctl01$ctl01"
    soup = BeautifulSoup(html)
    while soup.find(id=re.compile(nextpage)):
        for table in soup.findAll('table', {'id': 'ctl00_Main_lvMyAccount_itemPlaceholderContainer'}):
            for link in table.findAll('a'):
                link.findAll('a')
                print(link['href'])
        driver.find_element_by_link_text(nextpage).click()
        html = html + driver.page_source
        soup = BeautifulSoup(driver.page_source)
    soup = BeautifulSoup(html)
    driver.close()
I'm not sure whether I'm even on the right track.
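For what it's worth, here is a rough sketch of one way the paging could be driven, under several assumptions. In the pager markup, the string `ctl00$Main$DataPager1$ctl01$ctl01` is neither an `id` nor link text; it only appears as the `__doPostBack` argument inside an `href`, so the sketch locates the link by its position in the `ctl00_Main_DataPager1` span instead. The names `current_page`, `iter_pages`, `CURRENT_XPATH` and `NEXT_XPATH` are hypothetical, the timeout values are arbitrary, and it is only a guess that the `...` link advances to the next block of page numbers. `driver` is assumed to be a Selenium WebDriver whose `find_elements(by, value)` accepts the locator strategy `"xpath"`.

```python
import time

# XPath of the pager's current-page marker: the one <span> (not a link)
# inside <span id="ctl00_Main_DataPager1"> in the posted HTML.
CURRENT_XPATH = "//span[@id='ctl00_Main_DataPager1']/span"
# First link with an href after that marker: the next page number, or the
# "..." link once the visible block of numbers is exhausted (assumption).
NEXT_XPATH = CURRENT_XPATH + "/following-sibling::a[@href][1]"


def current_page(driver):
    """Return the pager's current-page text, or None if it cannot be read yet."""
    try:
        spans = driver.find_elements("xpath", CURRENT_XPATH)
        return spans[0].text.strip() if spans else None
    except Exception:
        # Assumption: an error here means the page is mid-postback; retry later.
        return None


def iter_pages(driver, timeout=10.0, poll=0.25):
    """Yield driver.page_source once per page, following the pager links."""
    while True:
        yield driver.page_source
        before = current_page(driver)
        links = driver.find_elements("xpath", NEXT_XPATH)
        if before is None or not links:
            return  # no pager, or no enabled link after the current page
        links[0].click()  # runs the link's __doPostBack(...) in the browser
        deadline = time.time() + timeout
        # Poll until the pager shows a different current page.
        while current_page(driver) in (None, before):
            if time.time() > deadline:
                raise RuntimeError("pager did not advance past page " + before)
            time.sleep(poll)
```

Each yielded string could then be passed to `BeautifulSoup(...)` and searched for the `ctl00_Main_lvMyAccount_itemPlaceholderContainer` table, as the original loop does. A production script would probably use Selenium's explicit waits rather than this hand-rolled polling loop.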
Below is the HTML code. (A rendered row looks like: View 211538138800143 43-38 39 PLAC 35 JUAN MENDOZA Active Delete)
<tr style="background-color:#EFEFEF">
<td>
<a id="ctl00_Main_lvMyAccount_ctrl17_lnkSelect" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl17$lnkSelect','')">View </a>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl17_lblAcctNumber">211558100500042</span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl17_LblServiceAddress">41-12 41 ST ENTM </span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl17_LblCustName">41-12 41ST MGMT CORP.</span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl17_LblAcctStatus">Active</span>
</td>
<td>
<a onclick="return confirm('Are you sure you want to delete this account number?');" id="ctl00_Main_lvMyAccount_ctrl17_lnkDelete" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl17$lnkDelete','')">Delete </a>
</td>
</tr>
<tr>
<td>
<a id="ctl00_Main_lvMyAccount_ctrl18_lnkSelect" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl18$lnkSelect','')">View </a>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl18_lblAcctNumber">211558102300045</span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl18_LblServiceAddress">41-12 41 ST 1D </span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl18_LblCustName">41-12 MGMT CORP </span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl18_LblAcctStatus">Active</span>
</td>
<td valign="top">
<a onclick="return confirm('Are you sure you want to delete this account number?');" id="ctl00_Main_lvMyAccount_ctrl18_lnkDelete" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl18$lnkDelete','')">Delete </a>
</td>
</tr>
<tr style="background-color:#EFEFEF">
<td>
<a id="ctl00_Main_lvMyAccount_ctrl19_lnkSelect" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl19$lnkSelect','')">View </a>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl19_lblAcctNumber">211564295000053</span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl19_LblServiceAddress">47-07 39 ST HLSM </span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl19_LblCustName">QPII-47-07 39 ST.,LLC</span>
</td>
<td>
<span id="ctl00_Main_lvMyAccount_ctrl19_LblAcctStatus">Active</span>
</td>
<td>
<a onclick="return confirm('Are you sure you want to delete this account number?');" id="ctl00_Main_lvMyAccount_ctrl19_lnkDelete" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl19$lnkDelete','')">Delete </a>
</td>
</tr>
</table>
</td>
</tr>
</td>
</tr>
<tr align="center"><td>
<span id="ctl00_Main_DataPager1"><a disabled="disabled">&lt;&lt; </a> <span>1</span> <a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl01','')">2</a> <a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl02','')">3</a> <a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl03','')">4</a> <a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl04','')">5</a> <a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl05','')">...</a> <a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl02$ctl00','')"> &gt;&gt;</a> </span>
</td></tr>
</table>
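To make the structure of this markup concrete, here is a small sketch that pulls the account fields and the pager's `__doPostBack` targets out of a page's HTML. It uses only the standard library so it runs without extra installs; the same ids could just as well be matched with BeautifulSoup. The function names `parse_accounts` and `pager_targets` are hypothetical, and the regular expressions assume the attribute order and quoting shown above, which real `page_source` output may not preserve.

```python
import re
from html import unescape

# Matches <span id="ctl00_Main_lvMyAccount_ctrlNN_lblField">value</span>
# (the posted ids use both "lbl" and "Lbl" prefixes).
FIELD_RE = re.compile(
    r'<span id="ctl00_Main_lvMyAccount_ctrl(\d+)_[lL]bl(\w+)">([^<]*)</span>')
# Captures the first argument of __doPostBack('...','').
POSTBACK_RE = re.compile(r"__doPostBack\('([^']+)',\s*''\)")


def parse_accounts(page_html):
    """Group the AcctNumber / ServiceAddress / ... spans by their row index."""
    rows = {}
    for idx, field, value in FIELD_RE.findall(page_html):
        rows.setdefault(int(idx), {})[field] = unescape(value).strip()
    return [rows[i] for i in sorted(rows)]


def pager_targets(page_html):
    """Return the __doPostBack targets that belong to the DataPager control."""
    return [t for t in POSTBACK_RE.findall(page_html) if "$DataPager1$" in t]


# A few lines copied from the markup above, used as sample input.
sample = '''<span id="ctl00_Main_lvMyAccount_ctrl17_lblAcctNumber">211558100500042</span>
<span id="ctl00_Main_lvMyAccount_ctrl17_LblAcctStatus">Active</span>
<a href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl17$lnkSelect','')">View </a>
<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl01','')">2</a>'''

print(parse_accounts(sample))
print(pager_targets(sample))
```

Note that the pager targets are postback arguments, not URLs, so they cannot be fetched directly; they only take effect when the page's own JavaScript submits the form.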