php - Finding a line of text/string in HTML using DOM

Question

I have some plain text/HTML content like this:

Title: Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Snippet: Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Category: Lorem ipsum dolor sit amet, consectetur adipiscing elit.

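I want to match only the line that says "Snippet: " and the text that follows it, on that line only and nothing else, and the search should be case-insensitive. I tried regular expressions, but now I would like to try DOMDocument instead. How can I do that?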

score 2 · Accepted Answer

If DOM is involved, see the duplicate I linked in a comment.
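For orientation only, here is a rough sketch of what a DOM-based lookup might look like. The `$html` markup is hypothetical (the question does not show the real structure), and walking every text node is just one possible approach, not necessarily what the linked duplicate recommends.

```php
<?php
// Hypothetical markup: the question does not show how the lines are wrapped.
$html = '<div><p>Title: Lorem ipsum</p><p>Snippet: Dolor sit amet</p><p>Category: Elit</p></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$found = null;

// Visit every text node and look for a line starting with "snippet:" (any case).
foreach ($xpath->query('//text()') as $node) {
    if (preg_match('~^\s*(snippet:.*)$~mi', $node->nodeValue, $m)) {
        $found = rtrim($m[1]);
        break;
    }
}

echo $found, PHP_EOL;
```

The regex is still doing the line matching here; the DOM part only isolates the text from the markup.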

Otherwise you are probably just looking for a regular expression:

$line = preg_match('~(^Snippet:.*$)~m', $text, $matches) ? $matches[1] : NULL;

Demo and explanation of the regular expression:

~  -- delimiter
 (  -- start match group 1
  ^  -- start of line
    Snippet:  -- exactly this text
    .*  -- match all but newline
  $  -- end of line
 )  -- end match group 1
~  -- delimiter
m  -- multiline modifier (^ matches begin of line, $ end of line)
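The pattern above is case-sensitive, while the question asks for a case-insensitive search. A minimal sketch of one way to handle that is to add the `i` modifier next to `m`. The sample `$text` below is made up for illustration, and the `rtrim()` is an added precaution for input with `\r\n` line endings, since `.` also matches `\r`.

```php
<?php
// Illustrative input; the lowercase "snippet:" exercises the i modifier.
$text = "Title: Lorem ipsum dolor sit amet.\n"
      . "snippet: Lorem ipsum dolor sit amet.\n"
      . "Category: Lorem ipsum dolor sit amet.";

// m: ^ and $ match at line boundaries; i: letters match regardless of case.
// $matches[0] is the whole match, so no capture group is needed.
$line = preg_match('~^snippet:.*$~mi', $text, $matches)
    ? rtrim($matches[0])
    : null;

echo $line, PHP_EOL;
```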

score 1 · Accepted Answer

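I don't know some of the details of your problem, so my answer may not fit. Depending on the size of the content you need to parse, you may decide that this is not an option. Also, it is not clear from the question where the HTML content comes in, which is why I wrote this solution without DOM parsing.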

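One possible solution is to read the lines you want to parse into an array. You can then filter the array, removing from the result every line that does not match your rule.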

An example:

//this is the content
$text = 'Title: Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Snippet: Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Category: Lorem ipsum dolor sit amet, consectetur adipiscing elit.';

//get the lines from your input as an array.. you could achieve this in a different way if, for example, you are reading from a file
$lines = explode(PHP_EOL, $text);

// apply a custom function to filter the lines (remove the ones that don't match your rule)
$results = array_filter($lines, 'test_content');

//show the results
echo '<pre>';
print_r($results);
echo '</pre>';

//custom function here:
function test_content($line)
{
    //case insensitive search, notice stripos; 
    // type strict comparison to be sure that it doesn't fail when the element is found right at the start
    if (false !== stripos($line, 'Snippet'))
    {
        return true;
    }
    return false;//these lines will be removed 
}

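This code leaves only one element in the $results array, the second line. Because array_filter preserves keys, that element keeps its original index, 1.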

You can see it in action here: http://codepad.org/220BLjEk
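A possible variation on the same idea, offered as a sketch and not as part of the original answer: `stripos` finds "Snippet" anywhere in a line, and `explode(PHP_EOL, ...)` depends on the platform's line ending. The version below uses `preg_split` with `\R` and `preg_grep` with an anchored, case-insensitive pattern. The sample `$text` is invented to show both differences.

```php
<?php
// Illustrative input: line 1 merely contains "snippet", line 2 ends in \r\n.
$text = "Title: A snippet of Lorem ipsum.\n"
      . "SNIPPET: Lorem ipsum dolor sit amet.\r\n"
      . "Category: Lorem ipsum.";

// \R matches any newline sequence (\n, \r\n, \r), independent of PHP_EOL.
$lines = preg_split('/\R/', $text);

// Keep only lines that START with "snippet:", ignoring case.
// array_values() reindexes the result from 0, since preg_grep preserves keys.
$results = array_values(preg_grep('/^snippet:/i', $lines));

print_r($results);
```

With this input only the second line is kept; the first line is rejected because the word does not appear at the start.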
