RCurl defaults to an HTTP proxy, but Tor provides a SOCKS proxy. Tor is clever enough to recognize that the proxy client (RCurl) is trying to talk to it as an HTTP proxy, hence the HTML error message returned by Tor.
To make RCurl (and curl) use a SOCKS proxy, add a protocol prefix to the proxy address. There are two protocol prefixes for SOCKS5: "socks5" and "socks5h" (see the curl manual). The latter lets the SOCKS server handle DNS queries, which is the preferred method when using Tor (in fact, Tor will warn you if you let the proxy client resolve the hostname).
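To make the difference between the two prefixes concrete, here is a minimal sketch in base R. The helper tor_proxy_url() is hypothetical (it is not part of RCurl); it only assembles the proxy string, and it assumes Tor is listening on 127.0.0.1:9050 as in the rest of this answer.

```r
# Hypothetical helper (not part of RCurl): build the proxy string for a Tor client.
# remote_dns = TRUE  -> "socks5h": the SOCKS server (Tor) resolves hostnames.
# remote_dns = FALSE -> "socks5":  the proxy client resolves hostnames itself.
tor_proxy_url <- function(host = "127.0.0.1", port = 9050, remote_dns = TRUE) {
  scheme <- if (remote_dns) "socks5h" else "socks5"
  sprintf("%s://%s:%d", scheme, host, as.integer(port))
}

# Print both variants of the proxy string
print(tor_proxy_url())
print(tor_proxy_url(remote_dns = FALSE))
```

The resulting string is what goes into the proxy option; with Tor, the default (remote_dns = TRUE) is the variant you want.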
Here is a pure R solution that uses Tor for DNS queries:
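library(RCurl)
# Route RCurl requests through Tor's SOCKS5 proxy; "socks5h" makes Tor resolve hostnames
options(RCurlOptions = list(proxy = "socks5h://127.0.0.1:9050"))
# New handles pick up the defaults stored in RCurlOptions
my.handle <- getCurlHandle()
html <- getURL(url = 'https://www.torproject.org', curl = my.handle)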
If you want to specify additional options, put them in the same list as the proxy setting:
library(RCurl)
options(RCurlOptions = list(proxy = "socks5h://127.0.0.1:9050",
                            useragent = "Mozilla",
                            followlocation = TRUE,
                            referer = "",
                            cookiejar = "my.cookies.txt"))
my.handle <- getCurlHandle()
html <- getURL(url='https://www.torproject.org', curl=my.handle)