php - Get a specific content block of an element from a URL in PHP

Question

Possible duplicate:
How to parse and process HTML with PHP?

I know about file_get_contents(url). What I want is to first use file_get_contents(url) to pull the contents of a page, and then use some method or function to extract a specific block from that content. Here is an example.

The code would look like this:

$pageContent = file_get_contents('http://www.pullcontentshere.com/');

And this would be the value of $pageContent:

<html> <body>
    <div id="myContent">
        <ul>    
            <li></li>
            <li></li>
            <li></li>
        </ul>
    </div> 
</body> </html>

Do you have any suggestions on how to extract just <div id="myContent"> together with all of its descendants?

So the call would be something like this:

$content = function_here($pageContent);

And the output would be:

        <div id="myContent">
            <ul>    
                <li></li>
                <li></li>
                <li></li>
            </ul>
        </div>

Answers are greatly appreciated!

score 3 · Accepted Answer

Another way is to use a regular expression.

<?php

$string = '<html> <body> 
    <div id="myContent"> 
        <ul>     
            <li></li> 
            <li></li> 
            <li></li> 
        </ul> 
    </div>  
</body> </html>';

if ( preg_match ( '/<div id="myContent"(.*?)<\/div>/s', $string, $matches ) )
{
    foreach ( $matches as $key => $match )
    {
        echo $key . ' => ' . htmlentities ( $match ) . '<br /><br />';
    }
}
else
{
    echo 'No match';
}

?>

Live example: http://codepad.viper-7.com/WSoWCh
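Note that the non-greedy pattern above stops at the first </div> it finds, so it would cut the result short if the target div contained another div. As an alternative, here is a minimal sketch that uses PHP's built-in DOMDocument and DOMXPath classes instead of a regular expression. The helper name extractElementById and the nested sample markup are assumptions made for illustration; they are not part of the original question or answers.

```php
<?php
// Hypothetical helper: returns the HTML of the element with the given id, or null.
// Assumes $id contains no double quotes, since it is placed inside the XPath string.
function extractElementById(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match any element whose id attribute equals $id.
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    // saveHTML($node) serializes the node together with all of its descendants.
    return $doc->saveHTML($nodes->item(0));
}

// Sample markup with a nested div, which the regex approach would truncate.
$pageContent = '<html> <body>
    <div id="myContent">
        <div class="inner"><ul><li></li></ul></div>
    </div>
</body> </html>';

$content = extractElementById($pageContent, 'myContent');
echo $content === null ? 'No match' : $content;
```

In the question's terms, $pageContent would come from file_get_contents(url) and extractElementById would take the place of function_here.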

score 3 · Accepted Answer

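You can use the built-in SimpleXMLElement as explained in nullpointr's answer, or you can use regular expressions. Another solution that I usually find simple is PHP Simple HTML DOM Parser. This library lets you use jQuery-style selectors. A simple example for your case would look like this: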

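// Load the library (adjust the path to wherever you saved it)
include 'simple_html_dom.php';
// Create DOM from url
$html = file_get_html('http://www.pullcontentshere.com');
// Select by id (#), not class (.); index 0 returns one element instead of an array
$div = $html->find('div#myContent', 0);
// outertext is the element's HTML including the <div> tag; plaintext would give text only
$myContent = $div ? $div->outertext : null;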

score 0 · Accepted Answer

You need XML parsing to solve your problem. I would recommend SimpleXML, which is already part of PHP. Here is an example:

$sitecontent = "
<html>   
   <body>
      <div>
         <ul>    
            <li></li>
            <li></li>
            <li></li>
         </ul>
      </div> 
   </body> 
 </html>";

 $xml = new SimpleXMLElement($sitecontent);
 $xpath = $xml->xpath('//div');

 print_r($xpath);
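print_r shows the structure of the matched objects rather than the markup the question asks for. The following is a minimal sketch of how the same approach could return the markup of one specific div as a string. It assumes the page is well-formed XML, and it adds an id attribute to the sample so the XPath expression can target that element.

```php
<?php
$sitecontent = '<html>
   <body>
      <div id="myContent">
         <ul>
            <li></li>
            <li></li>
            <li></li>
         </ul>
      </div>
   </body>
</html>';

// SimpleXMLElement throws an Exception when the input is not well-formed XML.
$xml = new SimpleXMLElement($sitecontent);
// Select only the div whose id attribute is "myContent".
$matches = $xml->xpath('//div[@id="myContent"]');

if (!empty($matches)) {
    // asXML() with no argument returns the element and its children as a string.
    $content = $matches[0]->asXML();
    echo $content;
} else {
    echo 'No match';
}
```

Because the result is serialized as XML, empty elements may come back in self-closing form (for example as <li/>). Many real pages are not well-formed XML, in which case an HTML-aware parser is the safer choice.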
