I am trying to fetch data from Wikipedia, but unserializing the response fails every time.
The sample query below should fetch section 20 of the Honda Civic page:
<?php
exec("curl -s 'http://en.wikipedia.org/w/api.php?action=parse&format=php&page=Honda_Civic&prop=text&section=20'", $output);
$value = "";
$first = true;
foreach ($output as $line) {
    if ($first) {
        $first = false;
    } else {
        $value .= "\n";
    }
    $value .= $line;
}
print("~~~\n");
print($value);
print("\n~~~\n");
print(unserialize($value));
print("~~~\n");
The result is:
~~~
a:1:{s:5:"parse";a:2:{s:5:"title";s:11:"Honda Civic";s:4:"text";a:1:{s:1:"*";s:1476:"<h4><span class="editsection">[<a href="/w/index.php?title=Honda_Civic&action=edit&section=1" title="Edit section: WTCC">edit</a>]</span> <span class="mw-headline" id="WTCC">WTCC</span></h4>
<p>Honda announced to enter the 2012 <a href="/wiki/World_Touring_Car_Championship" title="World Touring Car Championship">World Touring Car Championship</a> (WTCC) with a racer built on the 2012 Euro Civic 5 door hatchback. The car is powered by a 1.6-liter turbocharged engine, developed by Honda R&D, and will race later in Japan, China and Macau before a two car team join the 2013 championship racing.<sup id="cite_ref-1" class="reference"><a href="#cite_note-1"><span>[</span>1<span>]</span></a></sup><sup id="cite_ref-2" class="reference"><a href="#cite_note-2"><span>[</span>2<span>]</span></a></sup><br />
<strong class="error">Cite error: There are <code><ref></code> tags on this page, but the references will not show without a <code>{{Reflist}}</code> template or a <code><references /></code> tag (see the <a href="/wiki/Help:Cite_errors/Cite_error_refs_without_references" title="Help:Cite errors/Cite error refs without references">help page</a>).</strong></p>
<!--
NewPP limit report
Preprocessor visited node count: 146/1000000
Preprocessor generated node count: 1599/1500000
Post‐expand include size: 3103/2048000 bytes
Template argument size: 1880/2048000 bytes
Highest expansion depth: 12/40
Expensive parser function count: 0/500
-->
";}}}
~~~
~~~
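Yes, there is a "Cite error" in the output, but the data should still unserialize. Any idea what is going on here?
If I run this from my real script (as opposed to the simplified one given here), I get the same output, plus the following message, which may be useful: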
unserialize(): Error at offset 1583 of 1587 bytes
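
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the PHP manual notes that exec() does not include trailing whitespace in the lines it stores in $output. If any line of the response ends in a space before its newline, the rejoined string is shorter than the length prefix (here s:1476) claims, and unserialize() would fail near the end of the data. The sketch below does not call exec() or Wikipedia; it uses a hypothetical payload and simulates the per-line trimming with rtrim() to show the effect.

```php
<?php
// Hypothetical payload: the first line ends in a space before the newline.
$text = "first line \nsecond line";
$original = serialize(['text' => $text]);

// Simulate (as an assumption) what happens to output captured by exec():
// split into lines, drop trailing whitespace from each, rejoin with "\n".
$lines = array_map('rtrim', explode("\n", $original));
$rejoined = implode("\n", $lines);

// The rejoined string has lost one byte, so the s:23 length prefix
// no longer matches the actual string data.
$bytesLost = strlen($original) - strlen($rejoined);
$broken = @unserialize($rejoined);   // @ hides the "Error at offset" message
$intact = unserialize($original);

var_dump($bytesLost);                // int(1)
var_dump($broken);                   // bool(false)
var_dump($intact['text'] === $text); // bool(true)
```

If this is indeed the cause, capturing the response as a single string, for example with shell_exec() instead of exec(), might avoid the per-line trimming; I have not verified that against the real API response.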