php - How do I get summaries of different companies from Wikipedia?

Question

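I am trying to fetch the summary paragraph from Wikipedia for a company that the user enters.

For example, if the user enters Google, I need to display the summary paragraph for Google.

The code I am currently using: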

// action=parse: get parsed text
// page=$input
// format=json: in json format
// prop=text: send the text content of the article
// section=0: top content of the page

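// Concatenate the URL-encoded title: PHP does not interpolate variables inside single-quoted strings.
$url = 'https://en.wikipedia.org/w/api.php?action=parse&page=' . urlencode($input) . '&format=json&prop=text&section=0';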
$ch = curl_init($url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_USERAGENT, "TestScript"); // required by wikipedia.org server; use YOUR user agent with YOUR contact information. (otherwise your IP might get blocked)
$c = curl_exec($ch);

$json = json_decode($c);

$content = $json->{'parse'}->{'text'}->{'*'}; // get the main text content of the query (it's parsed HTML)

// pattern for first match of a paragraph
$pattern = '#<p>(.*)</p>#Us'; // http://www.phpbuilder.com/board/showthread.php?t=10352690
if(preg_match($pattern, $content, $matches))
{
    // print $matches[0]; // content of the first paragraph (including wrapping <p> tag)
    print strip_tags($matches[1]); // Content of the first paragraph without the HTML tags.
}

This works when $input = "Zynga", but not when $input = "Google", where it returns "Reference: [4]".

score 0 · Accepted Answer

You can use action=query&prop=extracts with the exintro= option instead. For example, for Google: https://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=json&exlimit=1&exintro=&explaintext=&titles=Google

This should remove most of the unwanted formatting, such as citations.
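As a rough illustration, the request above could be wrapped in PHP along the following lines. This is only a sketch: the function names buildExtractUrl, extractIntro, and fetchIntro are made up for this example, "TestScript" is the placeholder user agent from the question (replace it with your own, including contact information), and the parsing assumes the default JSON layout in which query.pages is keyed by page ID.

```php
<?php
// Hypothetical helper: build the extracts URL for one page title.
function buildExtractUrl(string $title): string
{
    $params = [
        'action'      => 'query',
        'prop'        => 'extracts',
        'format'      => 'json',
        'exlimit'     => 1,
        'exintro'     => '',   // intro section only
        'explaintext' => '',   // plain text instead of HTML
        'titles'      => $title,
    ];
    // http_build_query() URL-encodes the title for us.
    return 'https://en.wikipedia.org/w/api.php?' . http_build_query($params);
}

// Hypothetical helper: pull the first non-empty extract out of the JSON body.
// Returns null if the body is not valid JSON or no page has an extract.
function extractIntro(string $responseBody): ?string
{
    $data = json_decode($responseBody, true);
    if (!is_array($data) || !isset($data['query']['pages'])) {
        return null;
    }
    // Assumption: "pages" is keyed by page ID, so iterate instead of indexing.
    foreach ($data['query']['pages'] as $page) {
        if (isset($page['extract']) && $page['extract'] !== '') {
            return $page['extract'];
        }
    }
    return null;
}

// Hypothetical helper: fetch the intro text for a title over HTTPS.
function fetchIntro(string $title): ?string
{
    $ch = curl_init(buildExtractUrl($title));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'TestScript'); // placeholder; use your own
    $body = curl_exec($ch);
    return is_string($body) ? extractIntro($body) : null;
}
```

Calling fetchIntro('Google') should then return the intro as plain text, or null if the request fails or the page has no extract, so the regular expression and strip_tags() step from the question would no longer be needed.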
