php - PHP: Data from cURL, HTML scanning

Question

How can I scan an HTML page to get the text inside a particular div?

score 2 · Accepted Answer

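include 'simple_html_dom.php'; // Simple HTML DOM library

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Find all <div> elements whose id attribute is "foo" and print their text
foreach ($html->find('div[id=foo]') as $div) {
    echo $div->plaintext, "\n";
}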

score 0 · Accepted Answer

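You can use the built-in functionality as others have suggested, or you can try Simple HTML DOM Parser, which is implemented as a simple PHP class plus a few helper functions. It supports CSS-selector-style screen scraping (as in jQuery), can handle invalid HTML, and even provides a familiar interface for manipulating the DOM.

It is worth a look at http://simplehtmldom.sourceforge.net/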
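For comparison, the same kind of selection can be done with PHP's bundled DOM extension. The sketch below is not taken from the answers here: divTextsByXPath is a hypothetical helper name, and it assumes the id value contains no single quote. It shows that the CSS selector div[id=foo] corresponds to the XPath expression //div[@id='foo'].

```php
<?php
// Hypothetical helper (not from the thread): returns the text of every <div>
// whose id attribute equals $id, using only PHP's bundled DOM extension.
function divTextsByXPath(string $html, string $id): array
{
    $dom = new DOMDocument();
    // Collect parse warnings internally instead of printing them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // CSS "div[id=foo]" corresponds to XPath "//div[@id='foo']".
    // Assumption: $id does not contain a single quote.
    $texts = [];
    foreach ($xpath->query("//div[@id='" . $id . "']") as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

$sample = '<html><body><div id="foo">Hello <b>world</b></div>'
        . '<div id="bar">Other</div></body></html>';
print_r(divTextsByXPath($sample, 'foo'));
```

The trade-off is roughly this: Simple HTML DOM offers the shorter selector syntax, while DOMXPath needs no extra download.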

score 0 · Accepted Answer

Use preg_match() to match the substring you want, or use DOM/XML.

answered 2009-12-28T20:29:19.033
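A minimal sketch of the preg_match() route, under stated assumptions: divTextByRegex is a hypothetical helper name, the pattern is only one possible choice, and it assumes the target div contains no nested div (the lazy match stops at the first closing tag). For anything less regular, the DOM route is the sturdier option.

```php
<?php
// Hypothetical helper (not from the thread): extracts the text of the first
// <div> whose id equals $id, using a regular expression.
// Assumption: the target div does not contain another <div>.
function divTextByRegex(string $html, string $id): ?string
{
    // Match the opening tag with a quoted id attribute, then lazily capture
    // everything up to the first closing </div>.
    $pattern = '~<div\b[^>]*\sid\s*=\s*(["\'])' . preg_quote($id, '~')
             . '\1[^>]*>(.*?)</div>~is';
    if (preg_match($pattern, $html, $m) !== 1) {
        return null;
    }
    // $m[2] is the raw inner HTML; strip tags and decode entities to get text.
    $text = html_entity_decode(strip_tags($m[2]), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return trim($text);
}

$sample = '<div class="a" id="foo">Hello &amp; <b>world</b></div>';
var_dump(divTextByRegex($sample, 'foo'));
```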

score 0 · Accepted Answer

You can also use the DOMDocument class to do this.

Usage is quite simple:

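$dom = new DOMDocument();
// @ suppresses the warnings libxml emits for invalid real-world HTML
@$dom->loadHTML(file_get_contents($url)); // $url is the page to scan

// Example: get the text of the element with id="foo"
$node = $dom->getElementById('foo');
echo $node !== null ? $node->textContent : '';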

The documentation is here.

An example of actual usage can be found here.
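Since the question title mentions cURL, here is a hedged sketch that fetches the page with the cURL extension and then reads the div text with DOMDocument. The helper names fetchHtml and textById, the 10-second timeout, and the redirect setting are assumptions made for illustration, not part of the answers in this thread.

```php
<?php
// Hypothetical helper: downloads a page body with the cURL extension.
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,   // seconds; an arbitrary choice
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: text of the element with the given id, or null if absent.
function textById(string $html, string $id): ?string
{
    $dom = new DOMDocument();
    // Keep libxml warnings about invalid markup out of the output.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $node = $dom->getElementById($id);
    return $node === null ? null : trim($node->textContent);
}

// Usage (needs network access, so it is left commented out):
// echo textById(fetchHtml('http://www.google.com/'), 'foo');
```

Separating the download from the parsing keeps the parsing step testable against a plain string, without any network access.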
