I'm trying to run Scrapy with Python, Tor, and Privoxy. I'm using the scraper from khpeek/privoxy-tor-scraper at https://github.com/khpeek/privoxy-tor-scraper. This is my directory structure:
- docker-compose.yml
- privoxy
    - config
    - Dockerfile
- scraper
    - Dockerfile
    - newnym.py
    - requirements.txt
- tor
    - Dockerfile
I'm trying to run the following docker-compose.yml:
version: '3'
services:
  privoxy:
    build: ./privoxy
    ports:
      - "8118:8118"
    links:
      - tor
  tor:
    build:
      context: ./tor
      args:
        password: "1234"
    ports:
      - "9050:9050"
      - "9051:9051"
  scraper:
    build: ./scraper
    links:
      - tor
      - privoxy
where the Dockerfile for tor is:
FROM alpine:3.7
EXPOSE 9050 9051
ARG password
RUN apk --update add tor
RUN echo "ControlPort 9051" >> /etc/tor/torrc
RUN echo "CookieAuthentication 1" >> /etc/tor/torrc
RUN echo "HashedControlPassword $(tor --quiet --hash-password $password)" >> /etc/tor/torrc
CMD ["tor"]
the one for privoxy is:
FROM alpine:latest
EXPOSE 8118
RUN apk --update add privoxy
COPY config /etc/privoxy/
#CMD ["privoxy", "--no-daemon"]
CMD ["privoxy", "--no-daemon", "/etc/privoxy/config"]
where config consists of two lines:
listen-address 0.0.0.0:8118
forward-socks5 / tor:9050 .
The Dockerfile for the scraper is:
FROM python:3.6-alpine
ADD . /scraper
WORKDIR /scraper
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
CMD ["python", "newnym.py"]
where requirements.txt contains the single line requests. Finally, the program newnym.py is designed simply to test whether changing the IP address through Tor works:
from time import sleep, time

import requests as req
import telnetlib


def get_ip():
    # Ask an external service for our public IP, routed through Privoxy.
    IPECHO_ENDPOINT = 'http://ipecho.net/plain'
    HTTP_PROXY = 'http://privoxy:8118'
    return req.get(IPECHO_ENDPOINT, proxies={'http': HTTP_PROXY}).text


def request_ip_change():
    # telnetlib on Python 3 expects bytes, hence the b'' literals.
    #tn = telnetlib.Telnet('privoxy',8118)
    tn = telnetlib.Telnet('tor', 9051)
    tn.read_until(b"Escape character is '^]'.", 2)
    tn.write(b'AUTHENTICATE ""\r\n')
    tn.read_until(b"250 OK", 2)
    tn.write(b"signal NEWNYM\r\n")
    tn.read_until(b"250 OK", 2)


if __name__ == '__main__':
    dts = []  # seconds taken by each observed IP change
    #isOpen('tor',9051)
    #isOpen('privoxy',8118)
    try:
        while True:
            ip = get_ip()
            t0 = time()
            request_ip_change()
            # Poll once per second until the reported IP differs.
            while True:
                new_ip = get_ip()
                if new_ip == ip:
                    sleep(1)
                else:
                    break
            dt = time() - t0
            dts.append(dt)
            print("{} -> {} in ~{}s".format(ip, new_ip, int(dt)))
    except KeyboardInterrupt:
        print("Stopping...")
        if dts:  # avoid dividing by zero when no change was observed
            print("Average: {}".format(sum(dts) / len(dts)))
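For comparison, the control-port exchange in request_ip_change can also be sketched with a plain socket and explicit bytes instead of telnetlib. This is only a sketch under assumptions: the function name signal_newnym, its parameters, and the returned list of reply lines are my own and not part of the repository, and it assumes the password contains no quotes or backslashes that would need escaping.

```python
import socket


def signal_newnym(host, port, password="", timeout=5.0):
    """Authenticate on a Tor control port, then send SIGNAL NEWNYM.

    Hypothetical helper (not from the repository). Returns the first
    reply line received for each command that was sent.
    """
    replies = []
    commands = ('AUTHENTICATE "{}"'.format(password), "SIGNAL NEWNYM")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rb") as reader:
            for command in commands:
                # Control-port commands are CRLF-terminated lines.
                sock.sendall(command.encode("ascii") + b"\r\n")
                line = reader.readline().decode("ascii").strip()
                replies.append(line)
                # Any status other than 250 means the command was rejected,
                # so stop instead of sending the next one.
                if not line.startswith("250"):
                    break
    return replies
```

Called as signal_newnym('tor', 9051), it would still need the control port to be reachable from the scraper container, so it does not by itself get around the connection error below.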
docker-compose build succeeds, but if I try docker-compose up, I get the following error message:
scraper_1_651fd6690a2d | Traceback (most recent call last):
scraper_1_651fd6690a2d |   File "newnym.py", line 45, in <module>
scraper_1_651fd6690a2d |     request_ip_change()
scraper_1_651fd6690a2d |   File "newnym.py", line 27, in request_ip_change
scraper_1_651fd6690a2d |     tn = telnetlib.Telnet('tor',9051)
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/telnetlib.py", line 218, in __init__
scraper_1_651fd6690a2d |     self.open(host, port, timeout)
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/telnetlib.py", line 234, in open
scraper_1_651fd6690a2d |     self.sock = socket.create_connection((host, port), timeout)
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/socket.py", line 724, in create_connection
scraper_1_651fd6690a2d |     raise err
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/socket.py", line 713, in create_connection
scraper_1_651fd6690a2d |     sock.connect(sa)
scraper_1_651fd6690a2d | ConnectionRefusedError: [Errno 111] Connection refused
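
The commented-out isOpen calls in newnym.py were presumably meant as a quick reachability check on the two services. A minimal sketch of such a helper, assuming it only needs to attempt a TCP connection (the name is_open and the timeout parameter are my own, not from the repository):

```python
import socket


def is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical diagnostic helper: it only checks that something accepts
    connections on the port, not that the service behind it is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and name-resolution errors.
        return False
```

Run from inside the scraper container, is_open('tor', 9051) and is_open('privoxy', 8118) would show which of the two ports actually accepts connections.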