I'm new to Python, and I'm stuck on moving from one page to the next. I can scrape the details from a single page. Below is the code I'm using:

def getURLinfo(url):
    url = "https://apps1.coned.com/cemyaccount/MemberPages/MyAccounts.aspx?lang=eng"
    driver.get(url)
    html = driver.page_source
    nextpage = "ctl00$Main$DataPager1$ctl01$ctl01"
    soup = BeautifulSoup(html)

    while soup.find(id=re.compile(nextpage)):
        for table in soup.findAll('table', {'id':'ctl00_Main_lvMyAccount_itemPlaceholderContainer'} ):
            for link in table.findAll('a'):
                link.findAll('a')
                print link['href']
        driver.find_element_by_link_text(nextpage).click()
        html = html + driver.page_source
        soup = BeautifulSoup(driver.page_source)

        soup = BeautifulSoup(html)

    driver.close()

I'm not sure whether I'm even on the right track.

Below is the HTML code. The first row renders as: View 211538138800143 43-38 39 PLAC 35 JUAN MENDOZA Active Delete

          <tr style="background-color:#EFEFEF">
                <td>
                    <a id="ctl00_Main_lvMyAccount_ctrl17_lnkSelect" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl17$lnkSelect','')">View </a>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl17_lblAcctNumber">211558100500042</span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl17_LblServiceAddress">41-12 41 ST ENTM         </span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl17_LblCustName">41-12 41ST MGMT CORP.</span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl17_LblAcctStatus">Active</span>
                </td>
                <td>
                    <a onclick="return confirm('Are you sure you want to delete this account number?');" id="ctl00_Main_lvMyAccount_ctrl17_lnkDelete" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl17$lnkDelete','')">Delete </a>
                </td>
            </tr>



            <tr>
                <td>
                    <a id="ctl00_Main_lvMyAccount_ctrl18_lnkSelect" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl18$lnkSelect','')">View </a>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl18_lblAcctNumber">211558102300045</span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl18_LblServiceAddress">41-12 41 ST 1D           </span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl18_LblCustName">41-12 MGMT CORP      </span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl18_LblAcctStatus">Active</span>
                </td>
                <td valign="top">
                    <a onclick="return confirm('Are you sure you want to delete this account number?');" id="ctl00_Main_lvMyAccount_ctrl18_lnkDelete" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl18$lnkDelete','')">Delete </a>
                </td>
            </tr>



          <tr style="background-color:#EFEFEF">
                <td>
                    <a id="ctl00_Main_lvMyAccount_ctrl19_lnkSelect" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl19$lnkSelect','')">View </a>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl19_lblAcctNumber">211564295000053</span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl19_LblServiceAddress">47-07 39 ST HLSM         </span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl19_LblCustName">QPII-47-07 39 ST.,LLC</span>
                </td>
                <td>
                    <span id="ctl00_Main_lvMyAccount_ctrl19_LblAcctStatus">Active</span>
                </td>
                <td>
                    <a onclick="return confirm('Are you sure you want to delete this account number?');" id="ctl00_Main_lvMyAccount_ctrl19_lnkDelete" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl19$lnkDelete','')">Delete </a>
                </td>
            </tr>

        </table>

                    </td>
</tr>

    </td>
</tr> 

<tr align="center"><td>
    <span id="ctl00_Main_DataPager1"><a disabled="disabled"><< </a>&nbsp;<span>1</span>&nbsp;<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl01','')">2</a>&nbsp;<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl02','')">3</a>&nbsp;<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl03','')">4</a>&nbsp;<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl04','')">5</a>&nbsp;&nbsp;<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl05','')">...</a>&nbsp;<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl02$ctl00','')"> >></a>&nbsp;</span> 

</td></tr>

</table>
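For reference, in the markup above the string `ctl00$Main$DataPager1$ctl01$ctl01` appears only as the argument to `__doPostBack` inside an `href`; it is neither an element `id` nor the visible link text (the link text is `2`). A minimal sketch of one possible way to collect the pager targets is shown below. It uses only the standard `re` module on an abbreviated copy of the pager markup; `SAMPLE`, `pager_targets`, and `go_to_page` are hypothetical names introduced for illustration, and `go_to_page` assumes a Selenium WebDriver is passed in and that the page defines `__doPostBack`, as ASP.NET WebForms pages normally do.

```python
import re

# Captures the first argument of __doPostBack('<target>','...').
POSTBACK_RE = re.compile(r"__doPostBack\('([^']+)',\s*'[^']*'\)")

# Abbreviated copy of the markup from the question (one row link, two pager links).
SAMPLE = """
<a id="ctl00_Main_lvMyAccount_ctrl17_lnkSelect"
   href="javascript:__doPostBack('ctl00$Main$lvMyAccount$ctrl17$lnkSelect','')">View </a>
<span id="ctl00_Main_DataPager1"><span>1</span>&nbsp;
<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl01','')">2</a>&nbsp;
<a href="javascript:__doPostBack('ctl00$Main$DataPager1$ctl01$ctl02','')">3</a></span>
"""


def pager_targets(html):
    """Return the postback targets that belong to the DataPager, in document order."""
    # Row links (lnkSelect / lnkDelete) use the same href pattern, so keep only
    # targets whose name contains the pager control.
    return [t for t in POSTBACK_RE.findall(html) if "$DataPager1$" in t]


def go_to_page(driver, target):
    """Hypothetical helper: fire the postback for `target` in the browser.

    Assumes `driver` is a Selenium WebDriver with the page already loaded.
    """
    driver.execute_script("__doPostBack(arguments[0], '');", target)


if __name__ == "__main__":
    for target in pager_targets(SAMPLE):
        print(target)
```

With a live session, one could call `pager_targets(driver.page_source)` after each page load and pass the next unvisited target to `go_to_page`. Note that on the full page the `...` and `>>` links would also match the filter, so they may need to be handled separately.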
