When I try to read a website's source code, I sometimes get the following message (example URL shown):

Warning: file_get_contents(http://www.iwantoneofthose.com/gift-novelty/golf-ball-finding-glasses/10602617.html)
[function.file-get-contents]: failed to open stream: HTTP request failed!
HTTP/1.1 500 Internal Server Error in /home/public_html/pages/scrape.html on line 165

However, the URL itself works fine in a browser. Why does this happen?

I tried the following suggested workaround, but the result is the same:

$opts = array('http'=>array('header' => "User-Agent:MyAgent/1.0\r\n"));
$context = stream_context_create($opts);
$header = file_get_contents('https://www.example.com',false,$context);

This has me quite confused now...

2 Answers

The problem is your User-Agent header: the server appears to reject the value you send. This works for me:

$opts = array('http'=>array('header' => "User-Agent:Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.75 Safari/537.1\r\n"));
$context = stream_context_create($opts);
$header = file_get_contents('http://www.iwantoneofthose.com/gift-novelty/golf-ball-finding-glasses/10602617.html',false,$context);
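To check whether the User-Agent really is what changes the outcome, it can help to read the status line and body instead of letting file_get_contents() stop with a warning. This is a minimal sketch, assuming the `ignore_errors` HTTP context option and PHP's `$http_response_header` variable; `status_from_headers` and `fetch_with_status` are hypothetical helper names and not part of the code above:

```php
<?php
// Hypothetical helper: extract the numeric status code from the header
// lines PHP collects (e.g. "HTTP/1.1 500 Internal Server Error").
function status_from_headers(array $headers): int
{
    $status = 0;
    foreach ($headers as $line) {
        // After redirects there may be several status lines; keep the last one.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: fetch a URL with a given User-Agent and return
// array(status code, body), even when the server answers with 4xx/5xx.
function fetch_with_status(string $url, string $userAgent): array
{
    $context = stream_context_create(array('http' => array(
        'header'        => "User-Agent: $userAgent\r\n",
        'ignore_errors' => true, // return the error page instead of failing
    )));
    $body = file_get_contents($url, false, $context);
    // The HTTP wrapper fills $http_response_header in the local scope.
    $headers = isset($http_response_header) ? $http_response_header : array();
    return array(status_from_headers($headers), $body);
}
```

Calling `fetch_with_status()` once with `MyAgent/1.0` and once with the browser-like string should show whether the status code differs between the two.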
answered 2012-08-14T12:34:02.607
I don't know the exact reason, but file_get_contents fails with some servers. You do have an alternative, though:

$fp = fsockopen("www.iwantoneofthose.com", 80, $errn, $errs);
if (!$fp) {
    die("Connection failed: $errs ($errn)");
}

// Build a minimal HTTP/1.1 request by hand.
$out  = "GET /gift-novelty/golf-ball-finding-glasses/10602617.html HTTP/1.1\r\n";
$out .= "Host: www.iwantoneofthose.com\r\n";
$out .= "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0\r\n";
$out .= "Connection: close\r\n";
$out .= "\r\n";
fwrite($fp, $out);

// Read until EOF; "Connection: close" makes the server close the socket when done.
$response = "";
while (!feof($fp)) {
    $response .= fread($fp, 4096);
}
fclose($fp);


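// The headers end at the first blank line; everything after it is the body.
$response_body = substr($response, strpos($response, "\r\n\r\n") + 4);
// or, to keep the headers as well:
list($response_headers, $response_body) = explode("\r\n\r\n", $response, 2);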

print $response_body;
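
One caveat with the approach above: because the request is HTTP/1.1, the server is allowed to reply with `Transfer-Encoding: chunked`, in which case the extracted body still contains the chunk-size lines. A minimal sketch of handling that, assuming trailers can be ignored; `decode_chunked` and `split_http_response` are hypothetical helper names:

```php
<?php
// Hypothetical helper: decode a body sent with "Transfer-Encoding: chunked".
function decode_chunked(string $body): string
{
    $decoded = '';
    $pos = 0;
    $len = strlen($body);
    while ($pos < $len && ($eol = strpos($body, "\r\n", $pos)) !== false) {
        // Chunk-size line: hex length, optionally followed by ";extension".
        $sizeLine = substr($body, $pos, $eol - $pos);
        $size = (int) hexdec(trim(explode(';', $sizeLine, 2)[0]));
        if ($size === 0) {
            break; // a zero-length chunk marks the end of the body
        }
        $decoded .= substr($body, $eol + 2, $size);
        $pos = $eol + 2 + $size + 2; // skip the chunk data and its trailing CRLF
    }
    return $decoded;
}

// Hypothetical helper: split a raw response into
// array(status line, headers keyed by lowercase name, decoded body).
function split_http_response(string $raw): array
{
    $parts = explode("\r\n\r\n", $raw, 2);
    $body = isset($parts[1]) ? $parts[1] : '';
    $lines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($lines);

    $headers = array();
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // not a "Name: value" header line
        }
        list($name, $value) = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }

    if (isset($headers['transfer-encoding'])
        && stripos($headers['transfer-encoding'], 'chunked') !== false) {
        $body = decode_chunked($body);
    }
    return array($statusLine, $headers, $body);
}
```

With these helpers, `list($status, $headers, $response_body) = split_http_response($response);` could stand in for the `substr`/`explode` step. Sending `HTTP/1.0` in the request line is another way to avoid chunked replies, since chunked encoding is not sent to HTTP/1.0 clients.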
answered 2012-08-14T14:36:38.737