Possible duplicate:
How to parse and process HTML with PHP?
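I'm looking for a PHP function that demotes the HTML heading tags (h1, h2, h3, ..., h6) by one level.
- h1 becomes h2
- h2 becomes h3, and so on
- ...
- h6 is replaced by ' '
Do you know of such a function?
This is how I started, by stripping the h6 tags: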
$string = preg_replace('#<(?:/)?\s*h6\s*>#', ' ', $string);
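Here is a solution using DOM: it iterates over all the mappings and, for each one, either replaces the tag or moves its children into the parent.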
<?php
// New tag mappings:
// null => move the children into the parent container and drop the tag
// Keep the entries in this order (deepest level first); otherwise a heading
// that was just renamed would be matched again by a later query
$mapping = array(
    'h6' => null,
    'h5' => 'h6',
    'h4' => 'h5',
    'h3' => 'h4',
    'h2' => 'h3',
    'h1' => 'h2'
);

// Load document
$xml = new DOMDocument();
$xml->loadHTMLFile('http://stackoverflow.com/questions/12883009/php-code-to-decrease-html-hx-tags') or die('Failed to load');
$xPath = new DOMXPath( $xml);

foreach( $mapping as $original => $new){
    // Load nodes
    $nodes = $xPath->query( '//' . $original);

    // This is a critical error and should NEVER happen
    if( $nodes === false){
        die( 'Malformed expression: //' . $original);
    }
    echo $original . ' has nodes: ' . $nodes->length . "\n";

    // Process each node
    foreach( $nodes as $node){
        if( $new === null){
            // Move all children in front of the node, then remove the node
            while( $node->firstChild !== null){
                $node->parentNode->insertBefore( $node->firstChild, $node);
            }
            $node->parentNode->removeChild( $node);
        } else {
            // Create a new empty node and move all children into it.
            // childNodes is a live list, so a foreach over it would skip
            // children while they are being moved; drain firstChild instead.
            // Note: attributes of the old heading are not copied.
            $newNode = $xml->createElement( $new);
            while( $node->firstChild !== null){
                $newNode->appendChild( $node->firstChild);
            }
            $node->parentNode->replaceChild( $newNode, $node);
        }
    }
}
echo $xml->saveHTML();
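Since the question asks for a function that works on a string, the same loop can be wrapped roughly like this. Treat it as a minimal sketch under a few assumptions: `shiftHeadings()` is a name made up for illustration, the input is an HTML fragment (non-ASCII input may need an explicit charset declaration before loading), and attributes on the headings are dropped because `createElement()` starts from an empty element.

```php
<?php
// Hypothetical helper: shiftHeadings() is an illustrative name, not a PHP
// built-in and not part of the original answer.
function shiftHeadings(string $html): string
{
    // Deepest level first, so a renamed heading is never matched twice
    $mapping = array(
        'h6' => null,
        'h5' => 'h6',
        'h4' => 'h5',
        'h3' => 'h4',
        'h2' => 'h3',
        'h1' => 'h2',
    );

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> keeps loose text in place and gives us one node
    // whose children are exactly the fragment we were given
    $loaded = $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return $html; // assumption: hand back the input if parsing fails
    }

    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $xPath = new DOMXPath($doc);

    foreach ($mapping as $original => $new) {
        foreach ($xPath->query('//' . $original) as $node) {
            if ($new === null) {
                // Unwrap: move the children in front of the node, drop it
                while ($node->firstChild !== null) {
                    $node->parentNode->insertBefore($node->firstChild, $node);
                }
                $node->parentNode->removeChild($node);
            } else {
                // Rename: move the children into a fresh element
                $newNode = $doc->createElement($new);
                while ($node->firstChild !== null) {
                    $newNode->appendChild($node->firstChild);
                }
                $node->parentNode->replaceChild($newNode, $node);
            }
        }
    }

    // Serialize only the wrapper's children, not <html>/<body>
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo shiftHeadings('<h1>Title</h1><h2>Part</h2><h6>Note</h6>'), "\n";
```

The serializer may insert line breaks between block-level elements, so compare results with whitespace in mind rather than byte for byte.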
You can also do some xPath optimizations, such as querying //* or //h3|//h2 and checking DOMElement::tagName, but I wanted the code above to be straightforward.
<?php
// The beginning (everything up to the first foreach loop) remains the same

// Load nodes
$nodes = $xPath->query( '//*');

// This is a critical error and should NEVER happen
if( $nodes === false){
    die( 'Malformed expression: //*');
}

// Process each node
foreach( $nodes as $node){
    // Check correct $node class
    if( !($node instanceof DOMElement)){
        continue;
    }
    $tagName = $node->tagName;

    // Do we have a mapping?
    if( !array_key_exists( $tagName, $mapping)){
        continue;
    }
    $new = $mapping[$tagName];
    echo 'Has element: ' . $tagName . ' => ' . $new . "\n";

    if( $new === null){
        // Move all children in front of the node, then remove the node
        while( $node->firstChild !== null){
            $node->parentNode->insertBefore( $node->firstChild, $node);
        }
        $node->parentNode->removeChild( $node);
    } else {
        // Create a new empty node and move all children into it
        // (childNodes is live, so drain firstChild instead of using foreach)
        $newNode = $xml->createElement( $new);
        while( $node->firstChild !== null){
            $newNode->appendChild( $node->firstChild);
        }
        $node->parentNode->replaceChild( $newNode, $node);
    }
}
echo $xml->saveHTML();
The last optimization I can think of is to use:
// implode() takes the separator first; the reversed order no longer works in PHP 8
$xPathQuery = '//' . implode( '|//', array_keys($mapping));
$nodes = $xPath->query( $xPathQuery);
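To make the combined query concrete, here is a small standalone sketch. The sample markup is made up for illustration; the mapping is the same one used earlier. It only builds the union expression and lists which elements it selects, so the replacement logic is left out.

```php
<?php
// Assumed sample markup, not taken from the original answer
$html = '<h1>Title</h1><p>Intro</p><h2>Section</h2><h6>Note</h6>';

$mapping = array(
    'h6' => null,
    'h5' => 'h6',
    'h4' => 'h5',
    'h3' => 'h4',
    'h2' => 'h3',
    'h1' => 'h2',
);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xPath = new DOMXPath($doc);

// Builds "//h6|//h5|//h4|//h3|//h2|//h1": one query for every heading level
$xPathQuery = '//' . implode('|//', array_keys($mapping));
echo $xPathQuery, "\n";

// Collect the tag names of everything the union query selected;
// the <p> element is not part of the mapping and is never visited
$found = array();
foreach ($xPath->query($xPathQuery) as $node) {
    $found[] = $node->nodeName;
}
echo implode(', ', $found), "\n";
```

Because the query result is collected once, before any node is replaced, a heading that has just been renamed is not picked up a second time, so the order of the mapping entries stops mattering with this variant.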