php - My regex doesn't know when to stop

Question

I'm trying to match this (specifically the name):

<tr>
    <th class="name">Name:</th>
    <td>John Smith</td>
</tr>

Like this:

preg_match('/<th class="name">Name:<\/th>.+?<td>(.+)<\/td>/s', $a, $b);

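It does match the name, but it doesn't stop at the end of the name. It keeps going for another 150 or so characters. Why is that? I only want to match the name.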

score 3 · Accepted Answer

Make the last quantifier non-greedy: preg_match('/<th class="name">Name:<\/th>.+?<td>(.+?)<\/td>/s', $a, $b);
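To see why the greedy version overruns, here is a minimal sketch comparing the two patterns. The two-row input and the variable names $greedy and $lazy are assumptions made up for illustration; the asker's real input was not posted.

```php
<?php
// Hypothetical input: two rows, so there is more than one </td> to run into.
$a = <<<HTML
<tr>
    <th class="name">Name:</th>
    <td>John Smith</td>
</tr>
<tr>
    <th class="name">Age:</th>
    <td>42</td>
</tr>
HTML;

// Greedy: with /s, (.+) swallows everything to the end of the input,
// then backtracks only as far as the LAST </td>.
preg_match('/<th class="name">Name:<\/th>.+?<td>(.+)<\/td>/s', $a, $greedy);

// Lazy: (.+?) expands one character at a time and stops at the FIRST </td>.
preg_match('/<th class="name">Name:<\/th>.+?<td>(.+?)<\/td>/s', $a, $lazy);

echo $greedy[1], "\n"; // runs on through the second row, ending in "42"
echo $lazy[1], "\n";   // John Smith
```

The only difference between the two patterns is the ? after the last .+ quantifier.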

score 0 · Accepted Answer

Don't use regular expressions to parse HTML; it's very easy with DOMDocument:

<?php 
$html = <<<HTML
<tr>
    <th class="name">Name:</th>
    <td>John Smith</td>
</tr>
<tr>
    <th class="name">Somthing:</th>
    <td>Foobar</td>
</tr>
HTML;

$dom = new DOMDocument();
@$dom->loadHTML($html);

$ret = array();
foreach($dom->getElementsByTagName('tr') as $tr) {
    $ret[trim($tr->getElementsByTagName('th')->item(0)->nodeValue,':')] = $tr->getElementsByTagName('td')->item(0)->nodeValue;
}

print_r($ret);
/*
Array
(
    [Name] => John Smith
    [Somthing] => Foobar
)
*/
?>

score 0 · Accepted Answer

preg_match('/<th class="name">Name:<\/th>\s*<td>(.+?)<\/td>/s', $line, $matches);

This matches only whitespace between </th> and <td>, and matches the name non-greedily.
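A small sketch of what the \s* buys you, assuming a hypothetical input in which the Name row has no <td> at all. The .+? version quietly skips ahead into the next row, while the \s* version refuses to match:

```php
<?php
// Hypothetical input: the Name row is missing its <td>.
$line = '<tr><th class="name">Name:</th></tr>'
      . '<tr><th class="name">Age:</th><td>42</td></tr>';

// .+? may cross any characters, including other tags, to reach a <td>.
$loose = preg_match('/<th class="name">Name:<\/th>.+?<td>(.+?)<\/td>/s', $line, $m1);

// \s* allows only whitespace between </th> and <td>.
$strict = preg_match('/<th class="name">Name:<\/th>\s*<td>(.+?)<\/td>/s', $line, $m2);

var_dump($loose);  // int(1)
echo $m1[1], "\n"; // 42 (taken from the wrong row)
var_dump($strict); // int(0)
```

Whether a failed match is preferable to a wrong one depends on the data, but it is usually easier to detect.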

score 0 · Accepted Answer

Here is your match:

preg_match('!<tr>\s*<th[^>]*>Name:</th>\s*<td>([^<]*)</td>\s*</tr>!s', $a, $b);

It will work perfectly.

score 0 · Accepted Answer

preg_match('/<th class="name">Name:<\/th>.+?<td>(?P<name>.*?)<\/td>/s', $str, $match);

echo $match['name'];
