I am trying to fetch the source of a URL, but curl gives me an error. Could you help me, please?

    curl -v http://www.segundamano.es/anuncios-madrid/ -m 10
    * About to connect() to www.segundamano.es port 80 (#0)
    *   Trying 195.77.179.69...
    * Connected to www.segundamano.es (195.77.179.69) port 80 (#0)
    > GET /anuncios-madrid/ HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: www.segundamano.es
    > Accept: */*
    >
    * Empty reply from server
    * Connection #0 to host www.segundamano.es left intact
    curl: (52) Empty reply from server

Thank you very much, and sorry for my English!

1 Answer

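It looks like this domain is actively blocking curl (and wget) requests. Passing a browser User-Agent string appears to work around the block. Both curl and wget accept the same long option, `--user-agent`; only the short forms differ (`-A` for curl, `-U` for wget). For example: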

This does not work:

C:\>wget http://www.segundamano.es/anuncios-madrid/
  SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
  syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
  --2013-10-16 10:06:13--  http://www.segundamano.es/anuncios-madrid/
  Resolving www.segundamano.es... 195.77.179.69, 213.4.96.70
  Connecting to www.segundamano.es|195.77.179.69|:80... connected.
  HTTP request sent, awaiting response... 502 Bad Gateway
  2013-10-16 10:06:15 ERROR 502: Bad Gateway.

But this does:

C:\>wget --user-agent="Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1" http://www.segundamano.es/anuncios-madrid/
  SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
  syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
  --2013-10-16 10:06:29--  http://www.segundamano.es/anuncios-madrid/
  Resolving www.segundamano.es... 195.77.179.69, 213.4.96.70
  Connecting to www.segundamano.es|195.77.179.69|:80... connected.
  HTTP request sent, awaiting response... 200 OK
  Length: unspecified [text/html]
  Saving to: `index.html'
  [<=>] 178,588      267K/s   in 0.7s

  2013-10-16 10:06:33 (267 KB/s) - `index.html' saved [178588]
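
Since the question used curl rather than wget, the same workaround might look like the sketch below. It assumes the site filters only on the User-Agent header, which may no longer be true. The function name `fetch_page` and the output filename `index.html` are illustrative and not part of the original post; the User-Agent string and the 10-second timeout are taken from the commands above.

```shell
#!/bin/sh
# Browser User-Agent string, copied from the working wget example.
UA="Mozilla/5.0 (Windows NT 5.2; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"

# fetch_page URL OUTFILE (hypothetical helper)
# Downloads URL to OUTFILE while presenting the browser User-Agent
# instead of curl's default "curl/x.y.z" string.
fetch_page() {
    curl --silent --show-error \
         --max-time 10 \
         --user-agent "$UA" \
         --output "$2" \
         "$1"
}

# Example call (left commented out so that sourcing this file makes no request):
# fetch_page "http://www.segundamano.es/anuncios-madrid/" index.html
```

If the server also inspects other headers, this alone may not be enough; running curl with `-v` shows exactly which request headers were sent.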
answered 2013-10-16T17:10:01.840