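Is it possible to make Selenium use the Tor Browser? Does anyone have code that can be copied and pasted?
10 Answers
Don't use the TBB (Tor Browser Bundle); just set the correct proxy settings in whatever browser you are using. For example, in Firefox: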
import subprocess
from selenium import webdriver
ff_prof = webdriver.FirefoxProfile()
#set some privacy settings
ff_prof.set_preference( "places.history.enabled", False )
ff_prof.set_preference( "privacy.clearOnShutdown.offlineApps", True )
ff_prof.set_preference( "privacy.clearOnShutdown.passwords", True )
ff_prof.set_preference( "privacy.clearOnShutdown.siteSettings", True )
ff_prof.set_preference( "privacy.sanitize.sanitizeOnShutdown", True )
ff_prof.set_preference( "signon.rememberSignons", False )
ff_prof.set_preference( "network.cookie.lifetimePolicy", 2 )
ff_prof.set_preference( "network.dns.disablePrefetch", True )
ff_prof.set_preference( "network.http.sendRefererHeader", 0 )
#set socks proxy
ff_prof.set_preference( "network.proxy.type", 1 )
ff_prof.set_preference( "network.proxy.socks_version", 5 )
ff_prof.set_preference( "network.proxy.socks", '127.0.0.1' )
ff_prof.set_preference( "network.proxy.socks_port", 9050 )
ff_prof.set_preference( "network.proxy.socks_remote_dns", True )
#if you're really hardcore about your security
#js can be used to reveal your true i.p.
ff_prof.set_preference( "javascript.enabled", False )
#get a huge speed increase by not downloading images
ff_prof.set_preference( "permissions.default.image", 2 )
##
# programmatically start tor (in windows environment)
##
tor_path = "C:\\this\\is\\the\\location\\of\\" #tor.exe
torrc_path = "C:\\you\\need\\to\\create\\this\\file\\torrc"
DETACHED_PROCESS = 0x00000008
#calling as a detached_process means the program will not die with your python program - you will need to manually kill it
##
# somebody please let me know if there's a way to make this a child process that automatically dies (in windows)
##
tor_process = subprocess.Popen( '"' + tor_path+'tor.exe" --nt-service "-f" "' + torrc_path + '"', creationflags=DETACHED_PROCESS )
#attach to tor controller
## imports ##
import stem.socket
import stem.connection
from stem import Signal
##
tor_controller = stem.socket.ControlPort( port=9051 )
control_password = 'password'
#in your torrc, you need to store the hashed version of 'password' which you can get with: subprocess.call( '"' + tor_path+'tor.exe" --hash-password %s' %control_password )
stem.connection.authenticate( tor_controller, password=control_password )
#check that everything is good with your tor_process by checking bootstrap status
tor_controller.send( 'GETINFO status/bootstrap-phase' )
response = tor_controller.recv()
response = response.content()
#I will leave handling of response status to you
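The last step above leaves the handling of the bootstrap status open. As a rough sketch, assuming the reply text has already been pulled out of the response as a plain string and looks like the usual `NOTICE BOOTSTRAP PROGRESS=... TAG=... SUMMARY="..."` line, it could be parsed as below. The function names `parse_bootstrap_phase` and `is_bootstrapped` are hypothetical helpers, not part of stem.

```python
import shlex


def parse_bootstrap_phase(reply):
    """Parse a 'status/bootstrap-phase' value into a dict.

    `reply` is assumed to be the text of the reply line, with or without
    the leading 'status/bootstrap-phase=' key.
    """
    prefix = "status/bootstrap-phase="
    if reply.startswith(prefix):
        reply = reply[len(prefix):]

    fields = {}
    # shlex keeps quoted values such as SUMMARY="Loading relay descriptors"
    # together as a single token and strips the quotes.
    for token in shlex.split(reply):
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value

    # PROGRESS is a percentage; convert it to an int when present.
    if "PROGRESS" in fields:
        fields["PROGRESS"] = int(fields["PROGRESS"])
    return fields


def is_bootstrapped(reply):
    """Return True once Tor reports 100% bootstrap progress."""
    return parse_bootstrap_phase(reply).get("PROGRESS") == 100


sample = 'status/bootstrap-phase=NOTICE BOOTSTRAP PROGRESS=100 TAG=done SUMMARY="Done"'
print(parse_bootstrap_phase(sample))
print(is_bootstrapped(sample))
```

A caller would typically poll until `is_bootstrapped(...)` returns True before starting the browser.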
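Yes, it is possible to make Selenium use the Tor Browser.
I was able to do so on both Ubuntu and Mac OS X.
Two things have to happen:
1. Set the binary path to the firefox binary that Tor uses. On a Mac this path is typically /Applications/TorBrowser.app/Contents/MacOS/firefox. On my Ubuntu machine it is /usr/bin/tor-browser/Browser/firefox.
2. The Tor Browser uses a SOCKS host at 127.0.0.1:9150, provided by Vidalia or the Tor installation. Launch Tor once from Finder and leave it open so that Vidalia is running. The instances launched by Selenium will then use the SOCKS host that Vidalia started as well.
Here is the code that accomplishes both things. I ran it on Mac OS X Yosemite: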
import os
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium import webdriver

# path to the firefox binary inside the Tor package
binary = '/Applications/TorBrowser.app/Contents/MacOS/firefox'
if not os.path.exists(binary):
    raise ValueError("The binary path to Tor firefox does not exist.")
firefox_binary = FirefoxBinary(binary)

browser = None

def get_browser(binary=None):
    global browser
    # only one instance of a browser opens, remove global for multiple instances
    if not browser:
        browser = webdriver.Firefox(firefox_binary=binary)
    return browser

if __name__ == "__main__":
    browser = get_browser(binary=firefox_binary)
    urls = (
        ('tor browser check', 'https://check.torproject.org/'),
        ('ip checker', 'http://icanhazip.com')
    )
    for url_name, url in urls:
        print("getting", url_name, "at", url)
        browser.get(url)
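Because this approach relies on Tor already listening on the SOCKS port before Selenium starts, a quick pre-flight check may save some confusion. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper name, and a successful TCP connection only shows that something is listening, not that it is Tor.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9150 is the Tor Browser SOCKS port mentioned in this answer; 9050 is
    # the port used for a standalone tor process in other answers.
    for port in (9150, 9050):
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print("127.0.0.1:%d is %s" % (port, state))
```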
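On the Ubuntu system I was able to run the Tor Browser through Selenium. This machine runs Tor on port 9051 and the privoxy HTTP proxy, which uses Tor, on port 8118. To get the Tor Browser to pass the Tor check page, I had to set the HTTP proxy to privoxy.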
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium import webdriver

browser = None
proxy_address = "127.0.0.1:8118"
proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': proxy_address,
})
tor = '/usr/bin/tor-browser/Browser/firefox'
firefox_binary = FirefoxBinary(tor)
urls = (
    ('tor_browser_check', 'https://check.torproject.org/'),
    ('icanhazip', 'http://icanhazip.com'),
)
keys, _ = zip(*urls)
urls_map = dict(urls)

def get_browser(binary=None, proxy=None):
    global browser
    if not browser:
        browser = webdriver.Firefox(firefox_binary=binary, proxy=proxy)
    return browser

if __name__ == "__main__":
    browser = get_browser(binary=firefox_binary, proxy=proxy)
    for resource in keys:
        browser.get(urls_map.get(resource))
# Just check the port number of your Tor Browser and change it in the code accordingly
from selenium import webdriver
profile=webdriver.FirefoxProfile()
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9150)
browser=webdriver.Firefox(profile)
browser.get("http://yahoo.com")
browser.save_screenshot("screenshot.png")
browser.close()
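Several answers in this thread set the same handful of proxy preferences and differ mainly in the port (9050 or 9150). One way to keep that in a single place is sketched below. The helper names `tor_proxy_prefs` and `apply_prefs` are made up for illustration; the preference names are the ones used in the answers above, and turning on `network.proxy.socks_remote_dns` by default is a choice made here so that DNS lookups also go through the proxy.

```python
def tor_proxy_prefs(host="127.0.0.1", port=9150, remote_dns=True):
    """Build the Firefox preferences for a manual SOCKS5 proxy."""
    return {
        "network.proxy.type": 1,             # 1 = manual proxy configuration
        "network.proxy.socks": host,
        "network.proxy.socks_port": port,
        "network.proxy.socks_version": 5,
        "network.proxy.socks_remote_dns": remote_dns,
    }


def apply_prefs(profile, prefs):
    """Copy each preference onto any object exposing set_preference()."""
    for name, value in prefs.items():
        profile.set_preference(name, value)
    return profile


# Intended use with Selenium (not executed here):
#   profile = apply_prefs(webdriver.FirefoxProfile(), tor_proxy_prefs(port=9050))
print(tor_proxy_prefs(port=9050))
```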
To open tor browser with Selenium driven GeckoDriver you need to:
Download and install the TOR Browser
Download the latest GeckoDriver v0.26.0 and place it in your system.
Install the recent Mozilla Firefox v77.0.1 browser.
You can use the following code block to open the TOR enabled browser:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os

torexe = os.popen(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
profile = FirefoxProfile(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()
firefox_options = webdriver.FirefoxOptions()
firefox_options.binary_location = r'C:\Program Files\Mozilla Firefox\firefox.exe'
driver = webdriver.Firefox(firefox_profile=profile, options=firefox_options, executable_path=r'C:\WebDrivers\geckodriver.exe')
driver.get("http://check.torproject.org")
Alternative using Firefox Nightly
As an alternative you can also download, install and use the recent Firefox Nightly v79.0a1 browser.
Code Block:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os

torexe = os.popen(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
profile = FirefoxProfile(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()
firefox_options = webdriver.FirefoxOptions()
firefox_options.binary_location = r'C:\Program Files\Firefox Nightly\firefox.exe'
driver = webdriver.Firefox(firefox_profile=profile, options=firefox_options, executable_path=r'C:\WebDrivers\geckodriver.exe')
driver.get("http://check.torproject.org")
Alternative using Chrome
As an alternative you can also download, install and use the recent Chrome v84 browser.
Code Block:
from selenium import webdriver
import os

torexe = os.popen(r'C:\Users\username\Desktop\Tor Browser\Browser\TorBrowser\Tor\tor.exe')
PROXY = "socks5://localhost:9050"  # IP:PORT or HOST:PORT
options = webdriver.ChromeOptions()
options.add_argument('--proxy-server=%s' % PROXY)
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("http://check.torproject.org")
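Many of the answers point in the right direction, but this is exactly what worked for me:
On Ubuntu:
You need to install Tor using the apt command or some other method, not the binary version.
Installation guide:
https://linuxconfig.org/how-to-install-tor-browser-in-ubuntu-18-04-bionic-beaver-linux
In sample.py you may need to:
- Set the Firefox profile to the location of torrc, which most of the time is /etc/tor/.
- Set the binary to Tor's Firefox binary, since Tor is just a series of configurations built on top of Firefox.
You also need geckodriver to automate Firefox with Selenium:
- https://github.com/mozilla/geckodriver/releases (it worked with 0.21.0)
- Extract it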
chmod +x geckodriver
export PATH=$PATH:/path-to-extracted-file/geckodriver
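To confirm from Python that the driver can actually be found after the PATH change, something like the following sketch may help. It assumes the executable is named `geckodriver` and uses the standard library's `shutil.which`; `find_geckodriver` and its `extra_dirs` parameter are hypothetical names. Note that PATH entries are directories, so the entry should be the folder that contains the extracted file.

```python
import os
import shutil


def find_geckodriver(extra_dirs=()):
    """Return the full path to geckodriver, or None if it is not found.

    `extra_dirs` are searched before the directories already on PATH.
    """
    search_path = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    return shutil.which("geckodriver", path=search_path)


if __name__ == "__main__":
    location = find_geckodriver()
    print(location or "geckodriver not found on PATH")
```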
Note:
- "network.proxy.socks_port" = 9150
- Inside torrc: ControlPort 9050, CookieAuthentication 1
- Open the Tor Browser
sudo lsof -i -P -n | grep LISTEN
- The LISTEN port of the Tor network must be the same as the one used in the script
- Run the Python script while the Tor Browser is open
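Since the ports in torrc and in the script have to agree, it can help to generate both from the same values. The sketch below renders a minimal torrc as text. `render_torrc` is a hypothetical helper, and its defaults of 9050 and 9151-style numbering are not taken from the note above: 9050 (SOCKS) and 9051 (control) are assumed here as the values commonly used for a standalone tor process, so pass whatever ports your own setup listens on. The option names `SocksPort`, `ControlPort`, `CookieAuthentication` and `HashedControlPassword` are standard torrc options.

```python
def render_torrc(socks_port=9050, control_port=9051,
                 cookie_auth=True, hashed_password=None):
    """Return the text of a minimal torrc for the given ports."""
    lines = [
        "SocksPort %d" % socks_port,
        "ControlPort %d" % control_port,
        "CookieAuthentication %d" % (1 if cookie_auth else 0),
    ]
    # The hashed value is the output of `tor --hash-password <password>`.
    if hashed_password:
        lines.append("HashedControlPassword %s" % hashed_password)
    return "\n".join(lines) + "\n"


print(render_torrc())
```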
Thanks to user2426679 (https://stackoverflow.com/a/21836296/3816638) for the settings.
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.firefox.options import Options
import subprocess
import os
profileTor = '/etc/tor/' # torrc
binary = os.path.expanduser("~/.local/share/torbrowser/tbb/x86_64/tor-browser_en-US/Browser/firefox")
firefox_binary = FirefoxBinary(binary)
firefox_profile = FirefoxProfile(profileTor)
#set some privacy settings
firefox_profile.set_preference( "places.history.enabled", False )
firefox_profile.set_preference( "privacy.clearOnShutdown.offlineApps", True )
firefox_profile.set_preference( "privacy.clearOnShutdown.passwords", True )
firefox_profile.set_preference( "privacy.clearOnShutdown.siteSettings", True )
firefox_profile.set_preference( "privacy.sanitize.sanitizeOnShutdown", True )
firefox_profile.set_preference( "signon.rememberSignons", False )
firefox_profile.set_preference( "network.cookie.lifetimePolicy", 2 )
firefox_profile.set_preference( "network.dns.disablePrefetch", True )
firefox_profile.set_preference( "network.http.sendRefererHeader", 0 )
#set socks proxy
firefox_profile.set_preference( "network.proxy.type", 1 )
firefox_profile.set_preference( "network.proxy.socks_version", 5 )
firefox_profile.set_preference( "network.proxy.socks", '127.0.0.1' )
firefox_profile.set_preference( "network.proxy.socks_port", 9150 )
firefox_profile.set_preference( "network.proxy.socks_remote_dns", True )
#if you're really hardcore about your security
#js can be used to reveal your true i.p.
firefox_profile.set_preference( "javascript.enabled", False )
#get a huge speed increase by not downloading images
firefox_profile.set_preference( "permissions.default.image", 2 )
options = Options()
options.set_headless(headless=False)
driver = webdriver.Firefox(firefox_profile=firefox_profile, firefox_binary=firefox_binary, firefox_options=options)
print(driver)
driver.get("https://check.torproject.org/")
driver.save_screenshot("screenshot.png")
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
#path to TOR binary
binary = FirefoxBinary(r'...\Tor Browser\Browser\firefox.exe')
#path to TOR profile
profile = FirefoxProfile(r'...\Tor Browser\Browser\TorBrowser\Data\Browser\profile.default')
driver = webdriver.Firefox(firefox_profile= profile, firefox_binary= binary)
driver.get("http://icanhazip.com")
driver.save_screenshot("screenshot.png")
driver.quit()
Using Python 3.5.1 on Windows 10
Using Ruby:
profile = Selenium::WebDriver::Firefox::Profile.new
profile.proxy = Selenium::WebDriver::Proxy.new :socks => '127.0.0.1:9050' #port where TOR runs
browser = Watir::Browser.new :firefox, :profile => profile
To confirm that you are using Tor, use https://check.torproject.org/
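That check can also be automated. The sketch below assumes the success page carries an `<h1 class="on">` heading that begins with "Congratulations", which matches the heading text and `h1.on` selector that the Tor Project's own test in this thread asserts on; the page markup may change, so treat it as an assumption. `FirstH1` and `page_reports_tor` are hypothetical names.

```python
from html.parser import HTMLParser


class FirstH1(HTMLParser):
    """Collect the class list and text of the first <h1> in a page."""

    def __init__(self):
        super().__init__()
        self.classes = None   # stays None until an <h1> is seen
        self.text = ""
        self._inside = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._inside = True
            self.classes = (dict(attrs).get("class") or "").split()

    def handle_endtag(self, tag):
        if tag == "h1" and self._inside:
            self._inside = False
            self._done = True

    def handle_data(self, data):
        if self._inside:
            self.text += data


def page_reports_tor(html):
    """Return True if the check page's first heading signals Tor is in use."""
    parser = FirstH1()
    parser.feed(html)
    return (bool(parser.classes)
            and "on" in parser.classes
            and parser.text.strip().startswith("Congratulations"))


ok_page = '<html><body><h1 class="on">\n Congratulations. This browser is configured to use Tor.\n</h1></body></html>'
print(page_reports_tor(ok_page))
```

With Selenium, the HTML would come from the driver's `page_source` after loading the check URL.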
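I looked into this and, unless I am mistaken, on the face of it this is not possible.
The reasons it cannot be done are:
- The Tor Browser is based on the Firefox code.
- The Tor Browser has specific patches to the Firefox code to prevent external applications from communicating with the Tor Browser (including blocking the Components.Interfaces library).
- The Selenium Firefox WebDriver communicates with the browser through JavaScript libraries which, as noted above, are blocked by the Tor Browser.
This is presumably so that no one outside the Tor Browser, whether on your own box or on the internet, knows about your browsing.
Your alternatives are:
- Use a Tor proxy through Firefox instead of the Tor Browser (see the link in the comments on the question).
- Rebuild the Firefox source code with the Tor Browser patches, excluding those that prevent external communication with the Tor Browser.
I suggest the former.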
System.setProperty("webdriver.firefox.marionette", "D:\\Lib\\geckodriver.exe");
String torPath = "C:\\Users\\HP\\Desktop\\Tor Browser\\Browser\\firefox.exe";
String profilePath = "C:\\Users\\HP\\Desktop\\Tor Browser\\Browser\\TorBrowser\\Data\\Browser\\profile.default";
File torProfileDir = new File(profilePath);
FirefoxBinary binary = new FirefoxBinary(new File(torPath));
FirefoxProfile torProfile = new FirefoxProfile(torProfileDir);
FirefoxOptions options = new FirefoxOptions();
options.setBinary(binary);
options.setProfile(torProfile);
options.setCapability(FirefoxOptions.FIREFOX_OPTIONS,options);
WebDriver driver = new FirefoxDriver(options);
As a newer alternative to Selenium that controls only Firefox, have a look at Marionette. To use it with the Tor Browser, enable Marionette at startup via
Browser/firefox -marionette
(inside the bundle). Then you can connect via
from marionette import Marionette
client = Marionette('localhost', port=2828);
client.start_session()
and load a new page, for example via
url='http://mozilla.org'
client.navigate(url);
For more examples, see the tutorial.
Older answer
The Tor Project has a Selenium test for its browser. It works like this:
import os
from selenium import webdriver
ffbinary = webdriver.firefox.firefox_binary.FirefoxBinary(firefox_path=os.environ['TBB_BIN'])
ffprofile = webdriver.firefox.firefox_profile.FirefoxProfile(profile_directory=os.environ['TBB_PROFILE'])
self.driver = webdriver.Firefox(firefox_binary=ffbinary, firefox_profile=ffprofile)
self.driver.implicitly_wait(30)
self.base_url = "about:tor"
self.verificationErrors = []
self.accept_next_alert = True
self.driver.get("http://check.torproject.org/")
self.assertEqual("Congratulations. This browser is configured to use Tor.", self.driver.find_element_by_css_selector("h1.on").text)
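As you can see, this uses the environment variables TBB_BIN and TBB_PROFILE for the browser bundle and the profile. You could probably hard-code these in your code.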
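A middle ground between environment variables and hard-coding is to read the variables and fall back to defaults when they are unset. The following is a small sketch of that idea; `resolve_tbb_paths` and its parameters are hypothetical, and the example paths in the usage comment are placeholders rather than real install locations.

```python
import os


def resolve_tbb_paths(env=None, default_bin=None, default_profile=None):
    """Return (binary_path, profile_path) from TBB_BIN / TBB_PROFILE.

    Falls back to the given defaults and raises KeyError when a value is
    available from neither source.
    """
    env = os.environ if env is None else env
    tbb_bin = env.get("TBB_BIN", default_bin)
    tbb_profile = env.get("TBB_PROFILE", default_profile)

    missing = [name for name, value in
               (("TBB_BIN", tbb_bin), ("TBB_PROFILE", tbb_profile))
               if not value]
    if missing:
        raise KeyError("missing setting(s): " + ", ".join(missing))
    return tbb_bin, tbb_profile


# Example with an explicit mapping instead of the real environment
# (placeholder paths):
print(resolve_tbb_paths({"TBB_BIN": "/path/to/firefox",
                         "TBB_PROFILE": "/path/to/profile.default"}))
```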