php - failed to open stream: Connection timed out when fetching content from another website

Question

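For more than 2 years I have had an agreement with another website that lets my script fetch their content using Simple_html_DOM. Now, suddenly and without any warning, while I am still under contract with them, I get failed to open stream: Connection timed out no matter what I use: simple_html_DOM, cURL, or file_get_contents. I even tried the Snoopy library to emulate a web browser, but the connection still times out. They are blocking the connection somehow. It is not IP blocking, because I have tried from several different servers with the same result. Their site loads fine in my web browser, so there is no problem there. Is there any other way to fetch content from that site? I am paying for this, and after taking my money they are blatantly ignoring me.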

score 3 · Accepted Answer

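The server may be blocking requests based on the User-Agent header (User-Agent:), or on the absence of a valid one. This header is how the client identifies itself to the server: as a browser, bot, spider, application, and so on.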

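You can try using cURL to send the same kind of headers the server expects from a typical browser, using curl_setopt with the CURLOPT_USERAGENT option (see the PHP documentation for curl_setopt).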

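$url = "https://example.com";
// Impersonate Chrome 74 on macOS in this example.
$user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36";
$ch = curl_init();
// This is where we set the option to send the User-Agent header.
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
// Skips certificate verification; acceptable for debugging only.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Log the request and response details to STDERR.
curl_setopt($ch, CURLOPT_VERBOSE, true);
// Return the response body as a string instead of printing it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_URL, $url);
$result = curl_exec($ch);
// curl_exec returns false on failure; curl_error reports why (for example, a timeout).
if ($result === false) {
    echo curl_error($ch);
}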

If this still doesn't work, check whether the site requires cookies or login credentials.
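If cookies turn out to be the issue, a cookie jar can be added to the same kind of cURL request. The following is only a minimal sketch: the helper names build_browser_curl_options and fetch_with_cookies, the cookie file path, and the timeout values are assumptions made for illustration, and nothing here is known about what the remote site actually checks.

```php
<?php
// Hypothetical helper: builds cURL options for a browser-like request
// that stores and re-sends cookies between calls.
function build_browser_curl_options(string $url, string $userAgent, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here when the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are read from here on later requests
        CURLOPT_CONNECTTIMEOUT => 10,          // assumed value: seconds to wait for the connection
        CURLOPT_TIMEOUT        => 30,          // assumed value: seconds allowed for the whole request
    ];
}

// Hypothetical helper: performs the request and throws if cURL reports a failure.
function fetch_with_cookies(string $url, string $userAgent, string $cookieFile): string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_browser_curl_options($url, $userAgent, $cookieFile));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}
```

Calling fetch_with_cookies("https://example.com", $user_agent, "/tmp/cookies.txt") twice would send any cookies set by the first response along with the second request. Keeping the option array in its own function makes it easy to inspect or adjust without touching the network.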

score 1 · Accepted Answer

If you want to use file_get_contents() instead of cURL, you can do it like this:

$options  = array('http' => array('user_agent' => 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36'));
$context  = stream_context_create($options);
$response = file_get_contents('http://domain/path/to/uri', false, $context);
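The stream context can carry more than the user agent. As a rough sketch, the variant below also sets a read timeout and a couple of browser-like headers, and turns a failed call into an exception. The helper names build_http_context_options and fetch_with_context, the header values, and the 15-second default are assumptions for illustration, not something the remote site is known to require.

```php
<?php
// Hypothetical helper: builds stream-context options for a browser-like GET request.
function build_http_context_options(string $userAgent, float $timeoutSeconds = 15.0): array
{
    return [
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $userAgent,
            'header'        => "Accept: text/html\r\nAccept-Language: en-US,en;q=0.9\r\n",
            'timeout'       => $timeoutSeconds, // read timeout in seconds
            'ignore_errors' => true,            // still return the body on 4xx/5xx responses
        ],
    ];
}

// Hypothetical helper: fetches the URL and throws if the stream cannot be opened.
function fetch_with_context(string $url, string $userAgent): string
{
    $context = stream_context_create(build_http_context_options($userAgent));
    // The @ hides the warning; the message is still available via error_get_last().
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        $error = error_get_last();
        throw new RuntimeException($error['message'] ?? 'request failed');
    }
    return $body;
}
```

With ignore_errors enabled, an HTTP error status no longer makes file_get_contents() return false, so a remaining false result points to a connection-level problem such as the timeout described in the question.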
