php - PHP - Get the inner HTML of two elements

Question

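I'm in the middle of a transition: I want to build a CMS for my existing website. Until now (for a few years) I have been generating and saving complete HTML files, and I now want to store the content of those pages in a database. Luckily, I think, the two elements I want to pull out of each HTML file are unique within a file and are the same across all files. I tried this: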

if ($handle = opendir('.')) {
    while (false !== ($entry = readdir($handle))) {
        if ($entry != "." && $entry != "..") {
            $string= file_get_contents($entry);
            $pattern = "/<h1>(.*?)<\/h1>/";
            preg_match_all($pattern, $string, $uname);
            $pattern = '/<p class=\"user_info\"><strong>(.*?)<\/strong><\/p>/';
            preg_match_all($pattern, $string, $udesc);
            echo "NAME: ".$uname[1][0]."<br>";
            echo "DESC: ".$udesc[1][0]."<br>";
            //MYSQL SAVING WILL GO HERE
        }
    }
    closedir($handle);
}

The code above extracts the (h1)NAME(/h1) part (read ( as < and ) as >), but not the (p class="user_info")(strong)CONTENT(/strong)(/p) part, which just comes out blank.

I also tried a different approach:

if ($handle = opendir('.')) {
    while (false !== ($entry = readdir($handle))) {
        if ($entry != "." && $entry != "..") {
            $string= file_get_contents($entry);
            $doc = new DOMDocument();
            $doc->loadHTML($string);
            $h1 = $doc->getElementsByTagName('h1')->item(0)->textContent;
            echo "NAME: ".$h1."<br>";
            $p = $doc->saveHtml($doc->getElementsByTagName('p')->item(0)); // $p = $doc->getElementsByTagName('p')->item(0)->textContent; loads content, just without html tags, so I can not use it... :S
            echo "DESC: ".$p."<br>";
            //MYSQL SAVING WILL GO HERE
        }
    }
    closedir($handle);
}

The code above works, but not as expected. I need the paragraph's full HTML code, not just its text. I also tried $doc->savehtml(), but still nothing.

Please help, and thanks in advance!

score 0 · Accepted Answer

Remove ->textContent and pass the node to saveHtml() instead:

$h1 = $doc->saveHtml($doc->getElementsByTagName('h1')->item(0));
echo "NAME: ".$h1."<br>";
$p = $doc->saveHtml($doc->getElementsByTagName('p')->item(0));
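Note that saveHtml() with a node argument returns that node's outer HTML, so the result includes the element's own tags. If only the inner HTML is wanted, as the title suggests, one possible approach is to serialize each child node and join the results. The sketch below is an illustration, not part of the accepted answer: the innerHtml() helper, the sample markup in $string, and the choice of DOMXPath to pick the paragraph by its user_info class are all assumptions.

```php
<?php
// Hypothetical helper (not from the original answer): returns the markup
// inside $node, without the node's own opening and closing tags.
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML($child) serializes a single child node, tags included.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Sample markup standing in for one of the saved pages (an assumption).
$string = '<html><body><h1>John Doe</h1>'
        . '<p class="user_info"><strong>Some <em>bio</em> text</strong></p>'
        . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($string);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$h1 = $doc->getElementsByTagName('h1')->item(0);
// Select the paragraph by its class rather than assuming it is the first <p>.
$p = $xpath->query('//p[@class="user_info"]')->item(0);

// item(0) returns null when nothing matched, so guard before using the node.
$name = $h1 !== null ? innerHtml($h1) : '';
$desc = $p !== null ? innerHtml($p) : '';

echo "NAME: " . $name . "<br>";
echo "DESC: " . $desc . "<br>";
```

Inside the directory loop from the question, $string would come from file_get_contents($entry) as before. The exact-match XPath test only works when the class attribute is exactly user_info; a paragraph carrying several classes would need a different expression.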
