python - How do I download a generated CAPTCHA with mechanize?

Translated from: https://stackoverflow.com/questions/13391034 2012-11-15T03:13:09.157

Viewed 1696 times

I am trying to write a script for a blogging platform in my country, but the platform has an in-house CAPTCHA generator.

The problem is that the CAPTCHA is built so that a new image is generated on every GET request. Say the CAPTCHA image URL looks like this: http://example.com/randomcaptcha.aspx?someparams-that-are-always-the-same

Even if I open that link in Firefox and hit refresh (it displays only the JPG image), I see a different image on every refresh.

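The problem arises because when mechanize downloads the whole page, it also downloads the image during that request (or rather, it follows the randomcaptcha.aspx link). So when I try to download the image afterwards, I have to issue another GET request to fetch it, and by then the image has changed.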

How can I solve this?

Thank you.

Edit: the current code looks like this:

import mechanize

browser = mechanize.Browser()
# The registration page contains the randomcaptcha.aspx URL in an <img src>.
browser.open("http://www.example.com/registration.aspx")
# A regex then finds the image URL; assume it ends up in the variable `url`.
with open("captcha.jpg", "wb") as f:
    f.write(browser.open_novisit(url).read())

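At this point the downloaded captcha.jpg already differs from the one shown on the registration page. I used a tool called Fiddler to check: there are definitely 2 GET requests made to the randomcaptcha.aspx URL.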
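When a downloaded CAPTCHA does not match expectations, one quick check is whether the response is an image at all, since a wrong URL often returns an HTML page instead. The following is a minimal sketch, assuming the CAPTCHA is served as a JPEG (as the Firefox observation above suggests); the helper names `looks_like_jpeg` and `save_if_jpeg` are hypothetical and not part of mechanize.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Return True if data starts with the JPEG start-of-image marker."""
    # JPEG files begin with FF D8 (start of image) followed by FF,
    # the first byte of the next marker.
    return data[:3] == b"\xff\xd8\xff"


def save_if_jpeg(data: bytes, path: str) -> bool:
    """Write data to path only when it looks like a JPEG; report success."""
    if not looks_like_jpeg(data):
        return False
    with open(path, "wb") as f:
        f.write(data)
    return True
```

With the code from the question, this could be called as `save_if_jpeg(browser.open_novisit(url).read(), "captcha.jpg")`; a return value of `False` would hint that `url` does not point at the image.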

Edit #2, solved: my mistake. The CAPTCHA URL was incorrect.
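The post does not say what exactly was wrong with the URL. One plausible cause, offered here only as an assumption, is that a regex picks up a relative path or HTML-escaped `&amp;` sequences from the `img src`. The sketch below uses the standard library's `html.parser` and `urllib.parse.urljoin` instead of a regex; `ImgSrcCollector`, `find_captcha_url`, and the `marker` parameter are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive with character references already
        # unescaped, so "&amp;" in the markup becomes "&" here.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_captcha_url(page_url, html, marker="randomcaptcha.aspx"):
    """Return the absolute URL of the first <img> whose src contains marker."""
    collector = ImgSrcCollector()
    collector.feed(html)
    for src in collector.sources:
        if marker in src:
            # Resolve relative paths against the page that contained the tag.
            return urljoin(page_url, src)
    return None
```

Assuming the page body has been read into a string `html`, the result of `find_captcha_url("http://www.example.com/registration.aspx", html)` could then be passed to `browser.open_novisit`, so that the image is requested within the same mechanize session as the registration page.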
