I am using mechanize to create a virtual browser:

        import cookielib
        import mechanize

        br = mechanize.Browser()

        # set cookies
        cookies = cookielib.LWPCookieJar()
        br.set_cookiejar(cookies)

        # browser settings (used to emulate a browser)
        br.set_handle_equiv(True)
        br.set_handle_redirect(True)
        br.set_handle_referer(True)
        br.set_handle_robots(False)
        br.set_debug_http(False)
        br.set_debug_responses(False)
        br.set_debug_redirects(False)
        br.set_handle_refresh(mechanize.HTTPRefreshProcessor(), max_time=1)
        br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
        br.open("http://www.whatsmyip.org/")

The site does return my IP, but it also shows this notice:

Please DO NOT program a bot to use this site to grab your IPs. It kills my server and thats not nice. Just get some cheap or free web hosting and make your own IP-only page to power your bot. Then you won't even have to parse any html, just load the IP directly - better for everyone!!

How does the site know? Am I missing something in my code?

1 Answer

I tested your code and everything works fine.

Do you mean this part?

<!--
    Please DO NOT program a bot to use this site to grab your IPs. It kills my server and thats not nice.
    Just get some cheap or free web hosting and make your own IP-only page to power your bot.
    Then you won't even have to parse any html, just load the IP directly - better for everyone!!                               
-->

If so, that is just an HTML comment. It is only there to ask people not to point bots at the site.

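It does not "catch" you or anything like that. If you go to the whatsmyip page and view its source, you will see the same comment starting at line 24, even when you open the page in a regular browser.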
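If you want to confirm this yourself, here is a minimal sketch (Python 3, standard library only) that collects the HTML comments in a page and checks whether the notice appears inside one. The `CommentCollector` and `find_in_comments` names and the `sample_html` string are made up for illustration. With mechanize you would pass in the page source you got back from `br.open(...)`, decoded to text.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect the text of every <!-- ... --> comment in a page."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called once per HTML comment, with the text between <!-- and -->.
        self.comments.append(data)


def find_in_comments(html, phrase):
    """Return the (stripped) comments in `html` that contain `phrase`."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return [c.strip() for c in collector.comments if phrase in c]


# Hypothetical stand-in for the real page source.
sample_html = """<html><body>
<!--
    Please DO NOT program a bot to use this site to grab your IPs.
-->
<p>Your IP is 203.0.113.7</p>
</body></html>"""

matches = find_in_comments(sample_html, "DO NOT program a bot")
print(len(matches))
```

If the notice only ever shows up in the list returned by `find_in_comments`, it is static text in the page source rather than a response to your request.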

So, in short, it is just a warning that the web developer put into the HTML.

Answered 2013-04-10T14:15:46.850