I am trying to fetch an RSS feed, but for some reason I get the wrong data:

$url = "http://rss.news.yahoo.com/rss/oddlyenough";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
$xml = curl_exec($ch);      
curl_close($ch);
echo htmlentities($xml, ENT_QUOTES, "UTF-8");

Output:

<!-- rc2.ops.ch1.yahoo.com uncompressed/chunked Sun Nov 25 15:57:06 UTC 2012 --> 

If I load the same feed another way, I get the correct data. For example, this works:

$xml = simplexml_load_file('http://rss.news.yahoo.com/rss/oddlyenough');
print "<ul>\n";
foreach ($xml->channel->item as $item){
  print "<li>$item->title</li>\n";
}
print "</ul>";

Can you tell me what is wrong with the code that uses curl?

1 Answer

You are running into a Location redirect: the server answers with a 301 and curl does not follow it by default.

Add this option:

  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

so that you have:

$url = "http://rss.news.yahoo.com/rss/oddlyenough";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow the redirect given in the Location header
curl_setopt($ch, CURLOPT_HEADER, 0);          // keep response headers out of $xml
$xml = curl_exec($ch);
curl_close($ch);
echo htmlentities($xml, ENT_QUOTES, "UTF-8");
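
If you want a bit more safety around the redirect, something along these lines might help. It is only a sketch: the function name fetch_feed, the cap of 5 redirects, and the timeout values are my own assumptions, not anything Yahoo! or curl requires. It caps the number of redirects, reports transport errors, and returns the final URL so you can see where the feed moved to.

```php
<?php
// Hypothetical helper (name, redirect cap and timeouts are assumptions).
// Fetches a URL with curl, following redirects up to a limit.
function fetch_feed(string $url, int $maxRedirects = 5): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,          // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,          // follow Location redirects
        CURLOPT_MAXREDIRS      => $maxRedirects, // stop on redirect loops
        CURLOPT_HEADER         => false,         // keep headers out of the body
        CURLOPT_CONNECTTIMEOUT => 5,             // seconds, illustrative value
        CURLOPT_TIMEOUT        => 15,            // seconds, illustrative value
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure: DNS, connection, too many redirects, ...
        throw new RuntimeException('curl request failed: ' . curl_error($ch));
    }

    // Since PHP 8.0 the handle is released automatically when $ch goes
    // out of scope, so no explicit curl_close() is needed here.
    return [
        'body'      => $body,
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'redirects' => curl_getinfo($ch, CURLINFO_REDIRECT_COUNT),
    ];
}
```

Calling fetch_feed() with the feed URL from the question and inspecting the 'final_url' and 'redirects' entries should show whether a redirect happened and which address you ended up at.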

Details

When you run the original code, the first response you receive from Yahoo! is:

HTTP/1.0 301 Moved Permanently
Date: Sun, 25 Nov 2012 16:31:36 GMT
P3P: policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"
Cache-Control: max-age=3600, public
Location: http://news.yahoo.com/rss/oddlyenough
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8
Age: 1586
Content-Length: 81
Via: HTTP/1.1 rc4.ops.ch1.yahoo.com (YahooTrafficServer/1.20.10 [cHs f ])
Server: YTS/1.20.10

<!-- rc4.ops.ch1.yahoo.com uncompressed/chunked Sun Nov 25 16:31:36 UTC 2012 -->

It tells you to use the new address http://news.yahoo.com/rss/oddlyenough
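
If you ever need to inspect such a response yourself (for example after setting CURLOPT_HEADER to 1 so that the headers are included in the output), a small parser like the one below could pull out the status code and the new address. This is just an illustration; parse_redirect is a made-up name, and the sample string is a shortened copy of the response above.

```php
<?php
// Hypothetical helper: extracts the status code and the Location header
// from the raw header text of an HTTP response.
function parse_redirect(string $rawResponse): array
{
    $lines = preg_split('/\r\n|\n|\r/', trim($rawResponse));
    $status = null;
    $location = null;

    // The status line looks like "HTTP/1.0 301 Moved Permanently".
    if (preg_match('#^HTTP/\S+\s+(\d{3})#', $lines[0], $m)) {
        $status = (int) $m[1];
    }

    foreach (array_slice($lines, 1) as $line) {
        if ($line === '') {
            break; // a blank line ends the header block
        }
        // Header names are case-insensitive.
        if (stripos($line, 'Location:') === 0) {
            $location = trim(substr($line, strlen('Location:')));
        }
    }

    return ['status' => $status, 'location' => $location];
}

// Shortened copy of the response shown above.
$sample = "HTTP/1.0 301 Moved Permanently\r\n"
        . "Location: http://news.yahoo.com/rss/oddlyenough\r\n"
        . "Content-Length: 81\r\n"
        . "\r\n";

$info = parse_redirect($sample);
echo $info['status'], ' -> ', $info['location'], "\n";
```

A 3xx status together with a non-empty location is the sign that the body you received is only the redirect stub, not the feed itself.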

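In fact, your original code works if you use the new address directly (until they change the address again, that is...), and it is faster, because it makes one request instead of two.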

answered 2012-11-25T16:55:34.163