python - How to have Python log in to a website, load a pre-saved data search, and export the data page by page

Question

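I am trying to write a Python script that loads a pre-saved data search on a web page (an Orbis dataset accessed through a library) and then exports the data to Excel or CSV. This involves:

Starting from here: link, I need to click the "View list of results" tab (I can do this)
The results display 25 companies at a time, and each page can be exported to Excel by clicking the "Export to Excel" button. I then need a loop that changes the results page entry (from 1 to 40,000) and exports one page at a time. (This I can partially do)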
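The loop described above could be driven by a small generator of record ranges. This is only a sketch: `export_ranges` is a hypothetical helper, and it assumes that 40,000 is the total number of records and that the export form accepts a first and last record number per export. `RANGEFROM` is the field name used in my script further down; the name of any matching "to" field is not known.

```python
def export_ranges(total_records, page_size=25):
    """Yield (first, last) 1-based record ranges, one per results page."""
    for first in range(1, total_records + 1, page_size):
        # The final page may hold fewer than page_size records.
        last = min(first + page_size - 1, total_records)
        yield first, last


# Each pair would be typed into the export form, for example:
#   browser.fill('RANGEFROM', str(first))
for first, last in export_ranges(60):
    print(first, last)
```

With 25 records per page, `export_ranges(40000)` would produce 1,600 ranges, so the export step would run 1,600 times.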

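Edit 2: To narrow the question down, I can automate the login, set up the search, and reach the export page. I am using splinter. However, the export window is a popup, and as far as I know splinter cannot navigate to the popup, fill in the export criteria, and click export.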

Question: Is there a way (using splinter or otherwise) to navigate to the popup so that I can interact with it? Here is my code:

from splinter import Browser

browser = Browser('firefox')
browser.visit('https://weblogin.umich.edu/?cosign-www.lib&https://www.lib.umich.edu/cgi/l/login/proxy-session-init-qurl?qurl=https%3a%2f%2forbis2.bvdep.com%2fip')
browser.fill('login', 'username')
browser.fill('password', 'psswd')
browser.find_by_value('Log In').click()
browser.find_by_id('ContentContainer1_ctl00_Content_QuickSearch1_ctl02_TabSavedSearchesTd').click()

# Problem, here firefox doesn't save the searches
test_link= browser.find_link_by_text("My Search 1")
test_link.click()

#test entry into text field

# browser.fill('ContentContainer1$ctl00$Header$ctl00$ctl07$SearchText2008','xyz')

test_link= browser.find_link_by_text("Export")
test_link.click()

# Problem -- here the export comes out as a popup, then the scraper can't follow it
# browser.visit('newlink-popup')
# browser.fill('RANGEFROM', '1')  # Therefore can't use this command
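One direction that might work for the popup, untested against Orbis: splinter's Firefox browser wraps a Selenium WebDriver, reachable as `browser.driver`, and Selenium can move focus between open windows through `window_handles` and `switch_to.window`. The helper below, `switch_to_popup`, is a hypothetical name, and it assumes the export popup is the only window other than the main one.

```python
def switch_to_popup(driver, main_handle):
    """Focus the first window that is not the main one; return its handle.

    `driver` is expected to be a Selenium WebDriver (for splinter,
    `browser.driver`). Returns None if no other window is open.
    """
    for handle in driver.window_handles:
        if handle != main_handle:
            driver.switch_to.window(handle)
            return handle
    return None


# Possible use with the script above (assumption: clicking "Export"
# opens exactly one new window):
#   main = browser.driver.current_window_handle
#   browser.find_link_by_text("Export").click()
#   if switch_to_popup(browser.driver, main):
#       browser.fill('RANGEFROM', '1')
#   browser.driver.switch_to.window(main)  # return to the results page
```

Recording the main window's handle before the click makes it possible to tell the popup apart afterwards and to switch back once the export is submitted. The popup may take a moment to open, so a short wait or retry before switching may be needed.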

Any help would be greatly appreciated. Thanks.
