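I'm having a problem: I run two websites (a.ryan and b.ryan) locally on my Windows machine. The problem does not occur in the live environment (running CentOS 7). A script in b.ryan makes a cURL request to a.ryan: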

* Rebuilt URL to: http://a.ryan/
* Hostname a.ryan was found in DNS cache
*   Trying 192.168.0.64...
* TCP_NODELAY set
* Connected to a.ryan (192.168.0.64) port 80 (#0)
> GET / HTTP/1.1
Host: a.ryan
User-Agent: Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)
Accept: */*

* Operation timed out after 10000 milliseconds with 0 bytes received
* Curl_http_done: called premature == 1
* Closing connection 0

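As you can see, the connection times out. I tried longer timeouts here (same result), although the request should in fact be almost instant.

I'm currently using the following function: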

function getHTML($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); 
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($ch, CURLOPT_SSLVERSION, 3);
    curl_setopt($ch, CURLOPT_PROXY, '');
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, false);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)');
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_STDERR, fopen('curl.txt', 'w+'));
    $tmp = curl_exec($ch);
    curl_close($ch);
    if ($tmp != false) {
        return $tmp;
    }
}

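Admittedly, many of these options probably don't need to be there, but they are the result of trying multiple solutions found online. To clarify, I get exactly the same response as posted above when I use: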

function getHTML($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); 
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_STDERR, fopen('curl.txt', 'w+'));
    $tmp = curl_exec($ch);
    curl_close($ch);
    if ($tmp != false) {
        return $tmp;
    }
}

Hopefully this gives an idea of the settings I have tried while attempting to solve this with PHP cURL.

When I run curl on the command line, it works fine:
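Both functions above return nothing on failure, so a timeout looks the same as an empty page to the caller. A minimal sketch of a diagnostic variant follows, assuming a hypothetical function name `getHTMLWithDiagnostics` (not part of the original code). It only adds `curl_errno()`, `curl_error()`, and `curl_getinfo()` so that the reason for a failure is returned alongside the body:

```php
<?php
// Hypothetical helper (not from the original post): performs the same GET as
// getHTML(), but reports why the transfer failed instead of returning nothing.
function getHTMLWithDiagnostics($url, $timeoutSeconds = 10) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $body  = curl_exec($ch);   // false on failure
    $errno = curl_errno($ch);  // 0 when the transfer succeeded
    $error = curl_error($ch);  // human-readable message, '' on success
    $info  = curl_getinfo($ch);
    curl_close($ch);

    return array(
        'body'       => ($body === false) ? null : $body,
        'errno'      => $errno,
        'error'      => $error,
        'http_code'  => $info['http_code'],
        'total_time' => $info['total_time'],
    );
}
```

Calling `getHTMLWithDiagnostics('http://a.ryan/')` from the b.ryan script and logging `errno` and `error` would show whether the failure is a timeout (libcurl error 28, `CURLE_OPERATION_TIMEDOUT`) or something else.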

* Rebuilt URL to: a.ryan/
*   Trying 192.168.0.64...
* TCP_NODELAY set
* Connected to a.ryan (192.168.0.64) port 80 (#0)
> GET / HTTP/1.1
> Host: a.ryan
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
< Server: nginx/1.12.0
< Date: Wed, 01 May 2019 11:34:12 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Powered-By: PHP/5.6.30
< Set-Cookie: PHPSESSID=9898j4cia9s888jn24gr4be8m5; path=/
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< location: /home
<
* Connection #0 to host a.ryan left intact

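I have also disabled all IPv6 configuration on this machine's network interfaces, because my initial impression was that the problem was caused by the hostname resolving to IPv6 rather than IPv4, but it made no difference.

In case it helps, here is a copy of my hosts file.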

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#102.54.94.97   rhino.acme.com  # source server
#38.25.63.10    x.acme.com  # x client host
# localhost name resolution is handled within DNS itself.
#127.0.0.1  localhost
#::1    localhost
127.0.0.1   localhost.localdomain localhost MyPCName
127.0.0.1   a.ryan
127.0.0.1   b.ryan

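Edit

Forgot to mention: if I run the script from the CLI, it also works fine. So the problem is specific to running the script through the browser. (The sites are served with Winginx.)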

1 Answer

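I suppose it's possible that the web server (or perhaps some silly firewall heuristic configured to block malicious web vulnerability scanners?) has been configured to block requests that are obviously lying about their user agent. The request that doesn't work claims to be Internet Explorer 10, which it clearly isn't. A real Internet Explorer GET request looks like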

GET / HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: nb-NO
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: 127.0.0.1:9999
Connection: Keep-Alive

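which differs from your spoofed request in many ways, whereas the request that actually works truthfully identifies itself as curl/7.55.1.

What happens if you change the user agent to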

curl_setopt($ch, CURLOPT_USERAGENT, 'libcurl/'.(curl_version()['version']).' PHP/'.PHP_VERSION);

? Or even just

curl_setopt($ch, CURLOPT_USERAGENT, 'curl/7.55.1');

?
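One way to check this hypothesis is to send the same request once per candidate User-Agent and compare the outcomes. A minimal sketch, assuming hypothetical helper names (`probeUserAgent`, `userAgentCandidates`, `compareUserAgents`) that are not part of the original code:

```php
<?php
// Hypothetical helper: request $url with the given User-Agent and report
// whether the transfer completed before the timeout.
function probeUserAgent($url, $userAgent, $timeoutSeconds = 10) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $ok    = curl_exec($ch) !== false;
    $error = curl_error($ch);
    curl_close($ch);
    return array('ok' => $ok, 'error' => $error);
}

// The three User-Agent strings discussed in this thread.
function userAgentCandidates() {
    $version = curl_version();
    return array(
        'spoofed IE10'   => 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)',
        'honest libcurl' => 'libcurl/' . $version['version'] . ' PHP/' . PHP_VERSION,
        'curl CLI'       => 'curl/7.55.1',
    );
}

// Probe $url once per candidate and collect the results keyed by label.
function compareUserAgents($url, $timeoutSeconds = 10) {
    $results = array();
    foreach (userAgentCandidates() as $label => $userAgent) {
        $results[$label] = probeUserAgent($url, $userAgent, $timeoutSeconds);
    }
    return $results;
}
```

Running `print_r(compareUserAgents('http://a.ryan/'));` from the b.ryan script served through the browser would show whether only the spoofed string times out. If all three fail the same way, the User-Agent is probably not the cause.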

Answered 2019-05-01T19:39:25.200