Works fine for me, e.g.
<?php
$doc = new DOMDocument;
$doc->loadHTML(data());
// Walk every <h2>, then every <a> nested inside it.
foreach ($doc->getElementsByTagName('h2') as $h2) {
    foreach ($h2->getElementsByTagName('a') as $a) {
        echo $a->getAttribute('href'), ': ', $a->nodeValue, "\n";
    }
}

// Sample HTML document used as input.
function data() {
    return <<< eoh
<html>
<head><title>...</title></head>
<body>
<h2><a href="link1">header 1</a></h2>
<p>yadda yadda</p>
<h2><a href="link2">header 2</a></h2>
<p>yadda yadda</p>
<h2><a href="link3">header 3</a></h2>
<p>yadda yadda</p>
</body>
</html>
eoh;
}
But I find it easier to use XPath for this, e.g.
<?php
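$doc = new DOMDocument;
$doc->loadHTML(data());
$xpath = new DOMXPath($doc);
// Select every <a> that is a direct child of an <h2> anywhere under <body>.
foreach ($xpath->query('/html/body//h2/a') as $a) {
    echo $a->getAttribute('href'), ': ', $a->nodeValue, "\n";
}

// Sample HTML document used as input.
function data() {
    return <<< eoh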
<html>
<head><title>...</title></head>
<body>
<h2><a href="link1">header 1</a></h2>
<p>yadda yadda</p>
<h2><a href="link2">header 2</a></h2>
<p>yadda yadda</p>
<h2><a href="link3">header 3</a></h2>
<p>yadda yadda</p>
</body>
</html>
eoh;
}
prints
link1: header 1
link2: header 2
link3: header 3
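
If you want the links as data rather than echoed output, the XPath version can be wrapped in a function. The following is only a sketch under some assumptions: `extractHeaderLinks` is a hypothetical helper name (not part of the DOM extension), the shorter expression `//h2/a[@href]` is swapped in so that only anchors carrying an `href` are matched, and `libxml_use_internal_errors()` is used to keep parse warnings from malformed real-world HTML out of the output.

```php
<?php
// Hypothetical helper, not part of PHP's DOM extension:
// returns a list of [href, text] pairs for every <a href> directly inside an <h2>.
function extractHeaderLinks(string $html): array
{
    $doc = new DOMDocument;

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    // //h2/a[@href] matches <a> children of any <h2> that have an href attribute.
    foreach ($xpath->query('//h2/a[@href]') as $a) {
        $links[] = [$a->getAttribute('href'), trim($a->nodeValue)];
    }
    return $links;
}

// Assumed sample input, shortened from the document above.
$html = '<html><body>'
      . '<h2><a href="link1">header 1</a></h2><p>yadda yadda</p>'
      . '<h2><a href="link2">header 2</a></h2><p>yadda yadda</p>'
      . '</body></html>';

foreach (extractHeaderLinks($html) as [$href, $text]) {
    echo $href, ': ', $text, "\n";
}
```

Returning pairs instead of an array keyed by `href` keeps duplicate links from overwriting each other.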