I want to extract the div tags nested inside a div...

The post.php file:

<body>
<div class="home">

<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
<div id="post_message_14674278">Content number 14674278</div>
<div id="post_message_14674279">Content number 14674279</div>
<div id="post_message_14674283">Content number 14674283</div>
<div id="post_message_14674290">Content number 14674290</div>
.
.
.
.
</div>
</body>

The extract.php file:

<?php
$html = file_get_contents("post.php");
$pattern = "/(<div id=\"post_message_)(.*)(<\/div>)/";
preg_match_all($pattern, $html, $matches);
print_r($matches);

?>

But it gives me an empty array:

Array ( [0] => Array ( ) [1] => Array ( ) [2] => Array ( ) [3] => Array ( ) ) 

I want output like this:

Content number 14674248
Content number 14674255
Content number 14674278
Content number 14674279
Content number 14674283
Content number 14674290

Any help?

2 Answers

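// Parse the file as HTML instead of matching it with a regular expression.
$html = new DOMDocument();
$html->loadHTMLFile("post.php");
$xpath = new DOMXPath($html);
// Select every div that is a direct child of <div class="home">.
$filtered = $xpath->query("//div[@class='home']/div");

// nodeValue is the text content of each matched div.
foreach ($filtered as $one) {
    echo $one->nodeValue . "\n";
}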
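
If the container ever holds other divs as well, the XPath query can be narrowed to the ids that begin with post_message_. The following is only a sketch under assumptions: the markup is inlined so it runs without post.php, and the `other` div is a made-up element added to show the filtering.

```php
<?php
// Sample markup inlined for illustration; the "other" div is hypothetical.
$markup = <<<'HTML'
<body>
<div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
<div id="other">Not a post</div>
</div>
</body>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

// Keep only the divs whose id attribute starts with "post_message_".
$nodes = $xpath->query("//div[starts-with(@id, 'post_message_')]");

$contents = [];
foreach ($nodes as $node) {
    $contents[] = trim($node->nodeValue);
}

echo implode("\n", $contents), "\n";
```

Matching on the id prefix rather than on position means the query keeps working if the wrapper's class name changes.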
answered 2012-09-04T18:18:09.143

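Verify that file_get_contents() is actually returning the file's contents. If I run the following code, with the HTML inlined as a string, I get results: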

<?php 
$html = '<div class="home">

<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
<div id="post_message_14674278">Content number 14674278</div>
<div id="post_message_14674279">Content number 14674279</div>
<div id="post_message_14674283">Content number 14674283</div>
<div id="post_message_14674290">Content number 14674290</div>
</div>
</body>';
$pattern = "/(<div id=\"post_message_)(.*)(<\/div>)/";
preg_match_all($pattern, $html, $matches);
print_r($matches);

?>
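
One way to check the read itself is to test the return value, since file_get_contents() returns false on failure. This is a minimal sketch: `readFileOrFail` is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: returns the file contents, or throws if the read failed.
function readFileOrFail($path)
{
    // @ silences the PHP warning so the failure is reported once, via the exception.
    $data = @file_get_contents($path);
    if ($data === false) {
        throw new RuntimeException("Could not read $path");
    }
    return $data;
}

try {
    $html = readFileOrFail("post.php");
    echo strlen($html), " bytes read\n";
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

If the byte count is zero, or the error message appears, the problem is the file path or permissions rather than the regular expression. Note that a relative path such as "post.php" is resolved against the current working directory, which may not be the directory that contains extract.php.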

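You may also want to change the regular expression to something like this, so that the quantifiers are non-greedy and only the text between the tags is captured: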

$pattern = "/<div id=\"post_message_.*?>(.*?)<\/div>/";
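
As a rough illustration of how that pattern would be used, the sketch below runs it against a shortened inline copy of the markup and reads the captured text from `$matches[1]`. The `$texts` variable is a name introduced here for the example.

```php
<?php
// Shortened inline copy of the markup from the question.
$html = '<div class="home">
<div id="post_message_14674248">Content number 14674248</div>
<div id="post_message_14674255">Content number 14674255</div>
</div>';

// Non-greedy pattern; group 1 is the text between the opening and closing tags.
$pattern = "/<div id=\"post_message_.*?>(.*?)<\/div>/";

preg_match_all($pattern, $html, $matches);

// $matches[0] holds the full matches, $matches[1] the captured text.
$texts = $matches[1];
echo implode("\n", $texts), "\n";
```

Because `.` does not match a newline by default, this only works while each post div sits on a single line; content that spans lines or contains nested divs is better handled with a DOM parser.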
answered 2012-09-04T18:08:12.360