0

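Hi everyone! With a Firefox plugin I can capture headers. I'd like to do the same in Python: I would change the browser's proxy settings to localhost:8080 (or any port), and every request the browser makes should then pass through a Python script running on my machine. The script should be able to capture headers, capture the links in a web page, and so on. I know web application scanners do this, but how can I do it in Python? Can you suggest a place to start, or something to read? I just want to understand how it works and implement one.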

4

1 Answer

3

Have a look at python-proxy. Googling for "python proxy" also yields tons of results.

If you want to write one from scratch, it's not too hard either. You can use BaseHTTPServer (http.server in Python 3) to listen for new connections, make it multithreaded via SocketServer.ThreadingMixIn (socketserver.ThreadingMixIn in Python 3), and then implement do_GET and do_CONNECT (possibly also do_POST and do_HEAD). In those methods you extract the URL from self.path, send an HTTP request to that URL (preferably using the requests package, which is much more comfortable than urllib), and send the response back to the client.
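As a rough illustration of that approach, here is a minimal sketch for Python 3. It is not a complete proxy. It assumes plain HTTP only (do_CONNECT, which HTTPS needs, is left out), it uses the standard library's urllib instead of requests so it runs without extra packages, and the names CaptureHandler, ThreadingProxy, captured and run are made up for this example. Note also that urllib follows redirects on its own, which a stricter proxy would leave to the browser.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn

# Headers that describe a single connection and should not be passed along.
HOP_BY_HOP = {"connection", "keep-alive", "proxy-authenticate",
              "proxy-authorization", "proxy-connection", "te", "trailers",
              "transfer-encoding", "upgrade"}

# Hypothetical in-memory log of (method, url, request headers).
captured = []

# Opener with an empty proxy map, so the proxy never routes through itself.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class CaptureHandler(BaseHTTPRequestHandler):
    def _forward(self):
        # For a proxied request, self.path holds the absolute URL.
        url = self.path
        if not url.startswith("http://"):
            self.send_error(400, "Only absolute http:// URLs are supported")
            return
        captured.append((self.command, url, dict(self.headers)))

        length = int(self.headers.get("Content-Length", 0))
        data = self.rfile.read(length) if length else None
        headers = {k: v for k, v in self.headers.items()
                   if k.lower() not in HOP_BY_HOP}
        try:
            req = urllib.request.Request(url, data=data, headers=headers,
                                         method=self.command)
            upstream = _opener.open(req, timeout=10)
        except urllib.error.HTTPError as err:
            upstream = err  # 4xx/5xx replies still carry status, headers, body
        except (urllib.error.URLError, OSError, ValueError):
            self.send_error(502, "Upstream request failed")
            return

        body = upstream.read()
        status = upstream.getcode()
        reply_headers = list(upstream.headers.items())
        upstream.close()

        # Relay the upstream response to the browser.
        self.send_response(status)
        for name, value in reply_headers:
            if name.lower() not in HOP_BY_HOP and name.lower() != "content-length":
                self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_GET = _forward
    do_POST = _forward


class ThreadingProxy(ThreadingMixIn, HTTPServer):
    daemon_threads = True  # one thread per connection, dropped on exit


def run(host="localhost", port=8080):
    """Serve until interrupted; point the browser's HTTP proxy at host:port."""
    ThreadingProxy((host, port), CaptureHandler).serve_forever()
```

Calling run() and setting Firefox's HTTP proxy to localhost:8080 should route plain-HTTP page loads through _forward, where the request headers end up in captured. Extracting links would mean parsing body before it is written back.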

Answered 2012-05-18T09:32:03.660