php - How to get page content

Question

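I am trying to build a "latest news" style feature for my website. To do that I wrote a web crawler, and so far I can collect the links from a page by doing the following: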

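$dom = new DOMDocument;
$dom->preserveWhiteSpace = false; // only takes effect if set before loading
// @ suppresses the warnings libxml emits for malformed real-world HTML
@$dom->loadHTML(file_get_contents($url));
$linksToStore = $dom->getElementsByTagName('a');

$links = [];
foreach ($linksToStore as $tag) {
    // Map each href to the link's visible text (safe even for empty <a> tags)
    $links[$tag->getAttribute('href')] = trim($tag->textContent);
}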

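How can I get the content of the pages that these links point to, but only for links related to a specific domain (in my case, "medical")?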

score 0 · Accepted Answer

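Use the Simple HTML DOM library (http://simplehtmldom.sourceforge.net/) to extract content from the page. Its selectors work the same way as jQuery's, which makes extracting content more familiar and efficient.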

Also, take a look at http://davidwalsh.name/php-notifications for more information.
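
For comparison, here is a minimal sketch of the same idea using only PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM, so it runs without extra dependencies. The function names extractLinksByKeyword and extractParagraphText are hypothetical, and two things are assumptions rather than anything stated in the question: that a "medical" link can be recognised by the keyword appearing in its URL or link text, and that the useful content of a target page lives in its <p> elements.

```php
<?php
// Hypothetical helper: parse an HTML string, silencing libxml warnings
// that malformed real-world markup would otherwise trigger.
function loadHtmlQuietly(string $html): DOMDocument
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// Hypothetical helper: return [href => link text] for every <a> whose
// URL or visible text contains $keyword (case-insensitive).
function extractLinksByKeyword(string $html, string $keyword): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects empty input
    }
    $links = [];
    foreach (loadHtmlQuietly($html)->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        $text = trim($a->textContent);
        if ($href === '') {
            continue; // skip anchors without a target
        }
        if (stripos($href, $keyword) !== false || stripos($text, $keyword) !== false) {
            $links[$href] = $text;
        }
    }
    return $links;
}

// Hypothetical helper: join the text of all non-empty <p> elements.
// Assumption: the page's main content is in its paragraphs.
function extractParagraphText(string $html): string
{
    if (trim($html) === '') {
        return '';
    }
    $xpath = new DOMXPath(loadHtmlQuietly($html));
    $parts = [];
    foreach ($xpath->query('//p') as $p) {
        $text = trim($p->textContent);
        if ($text !== '') {
            $parts[] = $text;
        }
    }
    return implode("\n\n", $parts);
}
```

To crawl, you would pass file_get_contents($url) to extractLinksByKeyword, then fetch each matching href and pass that page's HTML to extractParagraphText. Relative hrefs have to be resolved against the page URL before fetching, which this sketch leaves out.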
