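I am trying to extract the qualifier for every instance of part number 1AMTB00186 from the HTML below. I need it to return "4cyl 2.3L - F23A1, Balance Shaft" and "4cyl 2.3L - F23A1, CAM". I believe my regex is greedy, but I can't figure out how to make it non-greedy. It always returns the first qualifier, "2.3L L4, Engine-F23A1". I am using: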

$partno = "1AMTB00186";

$pattern_short ='{<td\s+class="qualifier"\s*>.*<div>([^<]+)</div>.*' . $partno . '}sU';
$matchcount = preg_match_all($pattern_short, $data, $matches);

<tr>
<tr id="61" class="findme">
<td class="productName">
<h3>Air and Fuel Delivery - Fuel Pumps and Related Components</h3>
<br>Electric Fuel</td>
<td class="qualifier"><div>2.3L L4, Engine-F23A1</div></td>
<td class="partNum">1AMFP00020</td>
</tr>
<tr id="62" class="odd findme">
<td class="productName">
<h3>Air and Fuel Delivery - Fuel Pumps and Related Components</h3>
<br>Electric Fuel</td>
<td class="qualifier"><div>3.0L V6, Engine-J30A1</div></td>
</tr>
<tr id="63" class="findme">
<td class="productName">
<h3>Belts - Timingbelts</h3>
<br>Timingbelt</td>
<td class="qualifier"><div>4cyl 2.3L - F23A1, Balance Shaft</div></td>
<td class="partNum">1AMTB00186</td>
</tr>
<tr id="64" class="odd findme">
<td class="productName">
<h3>Belts - Timingbelts</h3>
<br>Timingbelt</td>
<td class="qualifier"><div>4cyl 2.3L - F23A1, CAM</div></td>
<td class="partNum">1AMTB00244</td>
</tr>
</tr>
<tr id="63" class="findme">
<td class="productName">
<h3>Belts - Timingbelts</h3>
<br>Timingbelt</td>
<td class="qualifier"><div>4cyl 2.3L - F23A1, CAM</div></td>
<td class="partNum">1AMTB00186</td>
</tr>
<tr id="65" class="findme">
<td class="productName">
<h3>Belts - Timingbelts</h3>
<br>Timingbelt</td>
<td class="qualifier"><div>V6 3.0L - J30A1, CAM</div></td>
<td class="partNum">1AMTB00286</td>
</tr>
<tr id="66" class="odd findme">
<td class="productName">
<h3>Brakes - Disc Brake Pad and Hardware Kit</h3>
<br>Front; 7345-D465 Ceramic</td>
<td class="qualifier"><div>L4 2.3L</div></td>
<td class="partNum">1AMV300465</td>
</tr>

Thank you

1 Answer

Seriously, please stop trying to parse large chunks of HTML with regular expressions. It is the wrong tool for the job.
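
To see why the pattern from the question misbehaves, here is a minimal sketch. The two-row `$data` string is an assumption made for illustration: a trimmed-down stand-in for the real table, not the full markup. The `U` modifier does make `.*` lazy, but a regex engine always starts at the leftmost position where a match is possible, so the match begins at the first qualifier cell and only then stretches lazily forward to the part number:

```php
<?php
// Trimmed-down stand-in for the question's table (assumption, not the full markup).
$data = <<<HTML
<tr><td class="qualifier"><div>2.3L L4, Engine-F23A1</div></td><td class="partNum">1AMFP00020</td></tr>
<tr><td class="qualifier"><div>4cyl 2.3L - F23A1, Balance Shaft</div></td><td class="partNum">1AMTB00186</td></tr>
HTML;

$partno = '1AMTB00186';

// Same pattern as in the question: s lets . match newlines, U makes quantifiers lazy.
$pattern = '{<td\s+class="qualifier"\s*>.*<div>([^<]+)</div>.*' . $partno . '}sU';
preg_match_all($pattern, $data, $matches);

// The capture comes from the FIRST row, because that is where the match starts.
print_r($matches[1]);
```

Laziness only controls how far a match extends, not where it begins, which is why switching greediness alone does not help here.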

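Instead, PHP has a very good built-in DOM parser. There is a good explanation of how to use it here: "How to use the DOM PHP parser" (and there are plenty of other tutorials if you look).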

In short, you need something like this:

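// $html holds the markup to search (called $data in the question).
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Qualifier cells that share a row with the wanted part number.
$query = '//tr/td[@class="partNum" and text() = "1AMTB00186"]/preceding-sibling::td[@class="qualifier"]';
foreach ($xpath->query($query) as $qualifier) {
    echo $qualifier->nodeValue, PHP_EOL;
}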

The XPath in $query reads as follows:

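Match every TD element with class "qualifier" that is a preceding sibling of a TD element with class "partNum" and text content "1AMTB00186", where that TD is a direct child of a TR element.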
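
As a self-contained sketch of that query in action, the snippet below runs it against three rows modeled on the question's table. The `$html` sample is an assumption made for illustration: it is shortened and wrapped in a `<table>` element so the parser sees conventional table markup.

```php
<?php
// Shortened sample modeled on the question's rows (assumption, not the real page).
$html = <<<HTML
<table>
<tr><td class="productName">Electric Fuel</td>
    <td class="qualifier"><div>2.3L L4, Engine-F23A1</div></td>
    <td class="partNum">1AMFP00020</td></tr>
<tr><td class="productName">Timingbelt</td>
    <td class="qualifier"><div>4cyl 2.3L - F23A1, Balance Shaft</div></td>
    <td class="partNum">1AMTB00186</td></tr>
<tr><td class="productName">Timingbelt</td>
    <td class="qualifier"><div>4cyl 2.3L - F23A1, CAM</div></td>
    <td class="partNum">1AMTB00186</td></tr>
</table>
HTML;

libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Start at the matching partNum cell, then step back to its qualifier sibling.
$query = '//tr/td[@class="partNum" and text() = "1AMTB00186"]'
       . '/preceding-sibling::td[@class="qualifier"]';

$qualifiers = [];
foreach ($xpath->query($query) as $td) {
    $qualifiers[] = trim($td->nodeValue);
}
print_r($qualifiers);
```

Because the query is anchored on the part number cell and only looks at siblings within the same row, the first row's qualifier cannot leak into the result.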

Another way to write the XPath is:

//tr/td[
    @class="qualifier" and following-sibling::td[
        @class="partNum" and text() = "1AMTB00186"
    ]
]
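
If the part number varies, this second form can be wrapped in a small function. The sketch below is only one possible shape: `findQualifiers` is a hypothetical helper name, not something from the question or from PHP itself, and the sample `$html` is again a shortened assumption. Since the part number is concatenated into the expression, the helper refuses values containing a double quote, which an XPath 1.0 string literal delimited by double quotes cannot contain.

```php
<?php
// Hypothetical helper (name and signature are assumptions for illustration).
function findQualifiers(string $html, string $partNo): array
{
    // A double quote would terminate the XPath string literal early, so reject it.
    if (strpos($partNo, '"') !== false) {
        throw new InvalidArgumentException('Part number must not contain a double quote');
    }

    libxml_use_internal_errors(true); // collect parse warnings instead of printing them
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    // Qualifier cells that are followed, in the same row, by the wanted partNum cell.
    $query = '//tr/td[@class="qualifier" and following-sibling::td['
           . '@class="partNum" and text() = "' . $partNo . '"]]';

    $result = [];
    foreach ($xpath->query($query) as $td) {
        $result[] = trim($td->nodeValue);
    }
    return $result;
}

// Shortened sample rows (assumption, not the real page).
$html = '<table>'
      . '<tr><td class="qualifier"><div>4cyl 2.3L - F23A1, Balance Shaft</div></td>'
      . '<td class="partNum">1AMTB00186</td></tr>'
      . '<tr><td class="qualifier"><div>4cyl 2.3L - F23A1, CAM</div></td>'
      . '<td class="partNum">1AMTB00244</td></tr>'
      . '</table>';

print_r(findQualifiers($html, '1AMTB00186'));
```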

answered 2013-05-03T13:28:42.330