I am trying to write a program that iterates over a list of proxies, using each one in turn from the first to the last and then starting over again. The way to use proxies with requests seems to be the following.

proxyDict = { 
              "http"  : "http://177.86.8.166:3128", 
              "https" : "http://177.223.187.126:3128" 
            }

r = requests.get(url, headers=headers, proxies=proxyDict)

I have a big list of proxies like below.

177.86.8.166:3128
177.69.237.53:3128
177.223.187.126:3128
177.101.172.14:3128
177.185.114.89:53281
177.128.192.125:8089
177.128.210.250:8080

I have thought about using a loop to add all of these proxies to a proxyDict variable in memory and then running my program. Is this the best way to do it? I also want to retry a request with another proxy whenever a proxy fails, and to keep doing so until the request succeeds. I am thinking of using a try/except block for this. Is that the best way, or is there a better one?

1 Answer

I just did something similar, although I used grequests. To give you some ideas: I would add a timeout to your request, otherwise your code will hang:

>>> r = requests.get(url, headers=headers, proxies=my_proxy, timeout=5)

Every response has a status_code, so use it to check whether the request succeeded. I usually retry a few times in case there was a timeout, for example:

>>> import requests
>>> r = requests.get('http://notarealsiteatall.org/status/404')
>>> r.status_code
404

Then, if the request fails 5 times, you can move on to the next proxy.

if tries >= 5:
    my_proxy = new_proxy_server
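
To make the "try a few times, then give up on this proxy" idea concrete, here is a minimal sketch. The function name try_proxy and its parameters are my own, not part of requests. The get argument is assumed to behave like requests.get, and I am assuming the proxy is a plain HTTP proxy used for both http and https URLs. It catches OSError because requests' RequestException derives from IOError, which is the same class in Python 3.

```python
def try_proxy(get, url, proxy, max_tries=5, timeout=5):
    """Call get() through one proxy, up to max_tries times.

    `get` is assumed to have the same call shape as requests.get.
    `proxy` is an "ip:port" string.
    Returns the first response with status code 200, or None if every
    attempt failed.
    """
    # Assumption: the same HTTP proxy handles both URL schemes.
    proxies = {"http": "http://" + proxy, "https": "http://" + proxy}
    for _ in range(max_tries):
        try:
            r = get(url, proxies=proxies, timeout=timeout)
        except OSError:
            # Timeouts and connection errors count as a failed try.
            continue
        if r.status_code == 200:
            return r
        # Any other status code also counts as a failed try.
    return None
```

With the real library you would call it as try_proxy(requests.get, url, "177.86.8.166:3128") and switch to the next proxy when it returns None.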

I just created a list and used a for loop to go through them.
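
Roughly, the list-plus-loop approach could look like the sketch below. This is not my exact code. The names make_proxies and fetch_with_rotation are made up for illustration, get is assumed to behave like requests.get, and itertools.cycle is just one way to start over from the first proxy once the list is exhausted. Note that the proxies dict holds one proxy per URL scheme, so you build a fresh dict for each proxy instead of putting the whole list into one dict.

```python
from itertools import cycle, islice


def make_proxies(address):
    """Turn an "ip:port" string into a proxies dict (one entry per scheme)."""
    # Assumption: the same HTTP proxy handles both URL schemes.
    return {"http": "http://" + address, "https": "http://" + address}


def fetch_with_rotation(get, url, addresses, max_attempts=None, timeout=5):
    """Try the proxies in order, wrapping around, until a request succeeds.

    `get` is assumed to have the same call shape as requests.get.
    Returns (response, address), or (None, None) once max_attempts
    requests have failed.
    """
    if max_attempts is None:
        max_attempts = 2 * len(addresses)  # two full passes by default
    # cycle() restarts from the first proxy; islice() caps the total tries.
    for address in islice(cycle(addresses), max_attempts):
        try:
            r = get(url, proxies=make_proxies(address), timeout=timeout)
        except OSError:
            # requests' exceptions derive from IOError (OSError in Python 3).
            continue
        if r.status_code == 200:
            return r, address
    return None, None
```

With the real library you would pass requests.get as get and your list of "ip:port" strings as addresses. The cap on attempts is there so the loop cannot run forever when every proxy is dead.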

answered 2017-07-15T08:16:39.827