我刚刚安装了Ghost.py,以便抓取一些需要我拥有 javascript 的网站。无论如何,在当前页面上是否有一个可迭代的表单列表,就像 mechanize 模块一样mechanize.Browser().forms()
?或者,如果不是,我可以将页面(在所有 javascript 内容加载后)传递给 mechanize 库并让它填写/提交表格吗?
问问题
654 次
1 回答
0
如果您不介意在屏幕上弹出浏览器,Selenium 可以为您做到这一点。它也可以无头运行,但这更棘手。一个简单的解决方案:
from selenium import webdriver
driver = webdriver.Firefox()
url = "http://www.w3schools.com/html/html_forms.asp"
driver.get(url)
# get a list of the page's forms as Selenium WebElements
# (webdriver API ref: http://selenium-python.readthedocs.org/en/latest/api.html)
forms = driver.find_elements_by_xpath(".//form")
for i, form in enumerate(forms):
print i, form.text
# the last form, index number 5, has input tags of type "text" and "submit"
"""
<form name="input0" target="_blank" action="html_form_action.asp" method="get">
"
Username: "
<input type="text" name="user" size="20">
<input type="submit" value="Submit">
</form>
"""
# get the input WebElements from this form WebElement
inputs = forms[5].find_elements_by_xpath(".//input")
# write text to the text input, then submit the form
inputs[0].send_keys('hihi frds!')
inputs[1].submit()
于 2013-06-10T18:43:32.217 回答