Here is a solution using PHP's DOMDocument class. I even incorporated your logic for checking required/optional attributes:
// Load up your HTML (the $html string is shown below)
$doc = new DOMDocument;
$doc->loadHTML( $html);

// Define attributes that we are looking for in name => required pairs
$attributes = array( 'href' => true, 'rel' => false, 'target' => true, 'media' => false);
$parsed_tags = array();

// Iterate over all of the <a> tags
foreach( $doc->getElementsByTagName( 'a') as $a) {
    $tag_attributes = array();
    foreach( $attributes as $name => $required) {
        if( !$a->hasAttribute( $name)) {
            if( $required) {
                echo 'Error, tag is required to have ' . $name . ' attribute and it is missing' . "\n";
                // Skip the rest of this tag and move on to the next <a>
                continue 2;
            }
        } else {
            // Has the attribute, required or not, so let's grab it
            $tag_attributes[$name] = $a->getAttribute( $name);
        }
    }
    $parsed_tags[] = $tag_attributes;
}
With this HTML string:
$html = '<a href="http://www.google.com" rel="nofollow" target="_blank">Google</a><a href="http://www.google.com" rel="follow" target="_blank">Google</a><a href="http://www.google.com" target="_blank">Google</a>';
This produces:
Array
(
    [0] => Array
        (
            [href] => http://www.google.com
            [rel] => nofollow
            [target] => _blank
        )

    [1] => Array
        (
            [href] => http://www.google.com
            [rel] => follow
            [target] => _blank
        )

    [2] => Array
        (
            [href] => http://www.google.com
            [target] => _blank
        )

)
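Note that with this solution, since I'm checking that the required attributes exist, continue 2; is executed when one of them is missing. This means that <a> tags without the required attributes are skipped, as shown in this demo, where the tag <a href="http://www.google.com">Google</a> outputs the error string I put in, but is not included in the output array.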
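
To make the skipping behavior concrete, here is a minimal, self-contained sketch of the same loop wrapped in a function. The function name parse_anchor_tags and the two-tag test string are my own illustration, not part of the original code; the logic is otherwise the same as above.

```php
<?php
// Hypothetical helper (name chosen for illustration only): wraps the loop
// from the answer so it can be run against any HTML string.
function parse_anchor_tags(string $html, array $attributes): array
{
    $doc = new DOMDocument;
    $doc->loadHTML($html);

    $parsed_tags = array();
    foreach ($doc->getElementsByTagName('a') as $a) {
        $tag_attributes = array();
        foreach ($attributes as $name => $required) {
            if ($a->hasAttribute($name)) {
                // Present, required or not: record its value
                $tag_attributes[$name] = $a->getAttribute($name);
            } elseif ($required) {
                echo 'Error, tag is required to have ' . $name . ' attribute and it is missing' . "\n";
                // Leave the inner loop AND skip the rest of the outer
                // iteration, so this tag is never appended to $parsed_tags
                continue 2;
            }
        }
        $parsed_tags[] = $tag_attributes;
    }
    return $parsed_tags;
}

// name => required pairs, as in the answer
$attributes = array('href' => true, 'rel' => false, 'target' => true, 'media' => false);

// Assumed test input: the first tag is complete, the second lacks the
// required target attribute.
$html = '<a href="http://www.google.com" target="_blank">Google</a>'
      . '<a href="http://www.google.com">Google</a>';

$result = parse_anchor_tags($html, $attributes);
print_r($result);
```

Running this should print the error line once, for the second tag, and $result should contain only the first tag's href and target. A plain continue would only move on to the next attribute and the incomplete tag would still be stored, which is why the 2 is needed.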