I'm new to Python and am trying to build a Google Scholar scraper using scholar.py and Tor. When I run the code below:
import scholar
import csv
import socks
import socket
import urllib2
import urllib
import httplib
from TorCtl import TorCtl

socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050, True)
socket.socket = socks.socksocket
proxy_support = urllib2.ProxyHandler({"http" : "127.0.0.1.8118"})
opener = urllib2.build_opener(proxy_support)

def connectTor():
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9050, True)
    socket.socket = socks.socksocket
    print "Connected to Tor"

def newId():
    socks.setdefaultproxy()
    conn = TorCtl.connect(controlAddr="127.0.0.1", controlPort=9051, passphrase="123")
    TorCtl.Connection.send_signal(conn, "NEWNYM")
    conn.close()
    connectTor()

connectTor()
for i in range(0, 10):
    print "case "+str(i+1)
    newId()
    conn = httplib.HTTPConnection("my-ip.heroku.com")
    conn.request("GET", "/")
    response = conn.getresponse()
    print(response.read())
everything works fine and the IP address is returned. However, if I remove:
conn = httplib.HTTPConnection("my-ip.heroku.com")
conn.request("GET", "/")
response = conn.getresponse()
print(response.read())
and replace it with
urllib2.install_opener(opener)
print(urllib2.urlopen("http://my-ip.heroku.com/").read())
then I get the error: "URLError: urlopen error [Errno 11004] getaddrinfo failed".
scholar.py uses urllib2, so I need this to work. Any ideas are appreciated.
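One thing I have not ruled out is the proxy address string passed to ProxyHandler, which is written with a dot before 8118. As a rough check, here is a sketch in Python 3 syntax (using urllib.parse, not the urllib2 code above) of how a dotted versus a colon-separated address splits into host and port. split_proxy is a hypothetical helper of my own, not part of urllib2 or scholar.py, and I am only assuming that urllib2 treats the string in a similar way:

```python
from urllib.parse import urlsplit


def split_proxy(address):
    """Split a bare 'host:port' proxy address into (hostname, port).

    Hypothetical helper for illustration only; port is None when the
    address contains no ':' separator.
    """
    # Prefix with '//' so urlsplit treats the whole string as the network location.
    parts = urlsplit("//" + address)
    return parts.hostname, parts.port


if __name__ == "__main__":
    for candidate in ("127.0.0.1.8118", "127.0.0.1:8118"):
        host, port = split_proxy(candidate)
        print("%-16s host=%r port=%r" % (candidate, host, port))
```

If the dotted form is really read as a single hostname with no port, that name would have to be resolved through DNS, which might be related to the getaddrinfo failure, but I am not sure.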