I'm a beginner trying to build a web scraper with Python 3 and MechanicalSoup. The website I'm trying to scrape requires a login, and I can't seem to submit my login details.
import mechanicalsoup

browser = mechanicalsoup.Browser()
# Request page
login_page = browser.get("https://www.website.com/login")
login_form = login_page.soup.find("form", {"id":"login"})
# specify username and password
login_form.find("input", {"id": "username"})["value"] = 'myUsername'
login_form.find("input", {"id": "password"})["value"] = 'myPassword'
# submit form
response = browser.submit(login_form, login_page.url)
The error I get comes from the last line (response = browser.submit ...).
Traceback (most recent call last):
  File "C:\blahblahblah\webscraper.py", line 144, in <module>
    main()
  File "C:\blahblahblah\webscraper.py", line 69, in main
    response = browser.submit(login_form, login_page.url)
  File "C:\Program Files (x86)\Python\lib\site-packages\mechanicalsoup\browser.py", line 117, in submit
    request = self._prepare_request(form, url, **kwargs)
  File "C:\Program Files (x86)\Python\lib\site-packages\mechanicalsoup\browser.py", line 111, in _prepare_request
    request = self._build_request(form, url, **kwargs)
  File "C:\Program Files (x86)\Python\lib\site-packages\mechanicalsoup\browser.py", line 54, in _build_request
    for input in form.select("input"):
TypeError: 'NoneType' object is not callable
Printing login_form gives:
<form accept-charset="UTF-8" action="https://www.website.com/login" class="content-container blockform" id="login" method="POST"><input name="_token" type="hidden" value="h8Sg8QdB08xyLQ4pUPcdnOHb90h0mHJownN7E7V0" />
<h1>Login</h1>
<fieldset>
<legend>Login Information</legend>
<dl>
<dt><label for="username">Username</label></dt>
<dd>
<input autofocus="autofocus" id="username" name="username" required="required" type="text" />
</dd>
<dt><label for="password">Password</label></dt>
<dd>
<input id="password" name="password" required="required" type="password" value="" />
</dd>
</dl>
</fieldset>
<p><strong>Note:</strong> You will remain logged in until you press the logout button.</p>
<a class="button" href="https://www.website.com/recovery"><i class="fa fa-envelope"></i> Forgot password?</a>
<button class="button green" type="submit"><i class="fa fa-sign-in"></i> Login</button>
</form>
and printing login_page.url gives:
https://www.website.com/login
I'm trying to follow this tutorial: http://piratefache.ch/python-3-mechanize-and-beautifulsoup/
Does anyone know what is going on, and can you help?
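One thing that may be worth checking is the installed BeautifulSoup itself. BeautifulSoup treats an unknown attribute on a tag as a search for a child tag of that name and returns None when no such child exists, so form.select evaluating to None would be consistent with the tag class having no select() method at all (for example, an outdated install). This is an assumption, not something the traceback proves. The sketch below is a hypothetical diagnostic, not part of MechanicalSoup; the helper names has_callable_select and check_installed_bs4 are made up for illustration.

```python
import importlib


def has_callable_select(tag_class):
    """Return True if tag_class defines a real, callable select() method."""
    return callable(getattr(tag_class, "select", None))


def check_installed_bs4():
    """Describe the bs4 package visible to this interpreter (hypothetical helper)."""
    try:
        bs4 = importlib.import_module("bs4")
    except ImportError:
        return "bs4 is not importable in this interpreter"

    version = getattr(bs4, "__version__", "unknown")
    tag_class = getattr(bs4, "Tag", None)

    # Looking up select on the class (not an instance) avoids the
    # "unknown attribute means find a child tag" fallback on tag objects.
    if tag_class is not None and has_callable_select(tag_class):
        return "bs4 {}: Tag.select() exists, so the cause is probably elsewhere".format(version)
    return "bs4 {}: Tag has no select() method; upgrading beautifulsoup4 may help".format(version)


if __name__ == "__main__":
    print(check_installed_bs4())
```

If the check reports a missing select() method, upgrading the package (for instance with pip install --upgrade beautifulsoup4) and re-running the login script would be a reasonable next step.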