
Possible duplicate:
How to parse and process HTML with PHP?

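I'm looking for a PHP function that demotes HTML hx (h1, h2, h3, ..., h6) tags by one level.

  • h1 becomes h2
  • h2 becomes h3, and so on
  • ...
  • h6 is replaced by ' '

Do you know of such a function?

This is how I started, by stripping the h6 tags: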

$string = preg_replace('#<(?:/)?\s*h6\s*>#', ' ', $string);

1 Answer


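Here is a solution using DOM. It iterates over all the mappings and either replaces the tag or moves its children up into the parent.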

<?php

// New tag mappings:
//     null => extract the children and push them into the parent container
// Keep the entries in this order (deepest level first); otherwise a node that
// was just renamed would be matched again by a later mapping
$mapping = array(
    'h6' => null,
    'h5' => 'h6',
    'h4' => 'h5',
    'h3' => 'h4',
    'h2' => 'h3',
    'h1' => 'h2'
);

// Load document
$xml = new DOMDocument();
$xml->loadHTMLFile('http://stackoverflow.com/questions/12883009/php-code-to-decrease-html-hx-tags') or die('Failed to load');

$xPath = new DOMXPath( $xml);

foreach( $mapping as $original => $new){
    // Load nodes
    $nodes = $xPath->query( '//' . $original);

    // This is a critical error and should NEVER happen
    if( $nodes === false){
        die( 'Malformed expression: //' . $original);
    }

    echo $original . ' has nodes: ' . $nodes->length . "\n";

    // Process each node
    foreach( $nodes as $node){
        if( $new === null){
            // Move all children in front of the node, then remove the node itself.
            // childNodes is a live list, so take firstChild until none are left.
            while( $node->firstChild !== null){
                $node->parentNode->insertBefore( $node->firstChild, $node);
            }
            $node->parentNode->removeChild( $node);

        } else {
            // Create a new empty node and move all children into it.
            // Iterating childNodes with foreach while moving would skip nodes.
            $newNode = $xml->createElement( $new);
            while( $node->firstChild !== null){
                $newNode->appendChild( $node->firstChild);
            }
            $node->parentNode->replaceChild( $newNode, $node);
        }
    }
}

echo $xml->saveHTML();

You could also do some XPath optimizations, such as querying //* or //h3|//h2 and checking DOMElement::tagName, but I wanted to keep this straightforward.
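
Note that a replacement element created with createElement starts out without the attributes (id, class, ...) of the original heading. A minimal sketch of one way to carry them over follows. The helper name renameElement and the sample markup are assumptions made for illustration; they are not part of the code above.

```php
<?php
// Hypothetical helper (illustrative name): replaces $node with a new element
// called $newName, keeping the original attributes and children.
function renameElement(DOMElement $node, string $newName): DOMElement
{
    $newNode = $node->ownerDocument->createElement($newName);

    // Copy attributes such as id or class onto the replacement.
    foreach ($node->attributes as $attribute) {
        $newNode->setAttribute($attribute->nodeName, $attribute->nodeValue);
    }

    // childNodes is a live list, so move children by taking firstChild
    // repeatedly instead of iterating the list with foreach.
    while ($node->firstChild !== null) {
        $newNode->appendChild($node->firstChild);
    }

    $node->parentNode->replaceChild($newNode, $node);
    return $newNode;
}

// Usage with made-up sample markup
$doc = new DOMDocument();
$doc->loadHTML('<h2 id="intro" class="title">Hello <em>world</em></h2>');
$renamed = renameElement($doc->getElementsByTagName('h2')->item(0), 'h3');
echo $doc->saveHTML($renamed), "\n";
```

In the loops above, a call like this could stand in for the createElement / replaceChild branch if attributes matter for your markup.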


EDIT: a solution that passes over the nodes only once and does not depend on the order of the mappings:

<?php
// The beginning (everything up to the first foreach loop) remains the same
// Load nodes
$nodes = $xPath->query( '//*');

// This is a critical error and should NEVER happen
if( $nodes === false){
    die( 'Malformed expression: //*');
}

// Process each node
foreach( $nodes as $node){
    // Check correct $node class
    if( !($node instanceof DOMElement)){
        continue;
    }
    $tagName = $node->tagName;  

    // Do we have a mapping?
    if( !array_key_exists( $tagName, $mapping)){
        continue;
    }
    $new = $mapping[$tagName];
    echo 'Has element: ' . $tagName . ' => ' . ($new === null ? '(removed)' : $new) . "\n";

    if( $new === null){
        // Move all children in front of the node, then remove the node itself.
        // childNodes is a live list, so take firstChild until none are left.
        while( $node->firstChild !== null){
            $node->parentNode->insertBefore( $node->firstChild, $node);
        }
        $node->parentNode->removeChild( $node);

    } else {
        // Create a new empty node and move all children into it.
        // Iterating childNodes with foreach while moving would skip nodes.
        $newNode = $xml->createElement( $new);
        while( $node->firstChild !== null){
            $newNode->appendChild( $node->firstChild);
        }
        $node->parentNode->replaceChild( $newNode, $node);
    }
}

echo $xml->saveHTML();

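The last optimization I can think of is to use:

// Separator goes first; the reversed argument order was removed in PHP 8
$xPathQuery = '//' . implode( '|//', array_keys($mapping));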
$nodes = $xPath->query( $xPathQuery);
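
Putting the pieces together for the case in the question, where the HTML is already in a string, might look roughly like the sketch below. The function name demoteHeadings and the wrapper div are assumptions made for illustration. The sketch also assumes the input is a body fragment in an encoding loadHTML handles by default (plain ASCII is safe), and it does not copy attributes onto the new elements.

```php
<?php
// Hypothetical wrapper (illustrative name): demotes every heading in an HTML
// fragment by one level and unwraps h6, using a single union XPath query.
function demoteHeadings(string $html): string
{
    $mapping = array(
        'h6' => null,
        'h5' => 'h6',
        'h4' => 'h5',
        'h3' => 'h4',
        'h2' => 'h3',
        'h1' => 'h2'
    );

    $doc = new DOMDocument();
    // Collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so its top-level nodes can be found again afterwards.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xPath = new DOMXPath($doc);
    // The query result is a snapshot, so newly created elements are not revisited.
    $nodes = $xPath->query('//' . implode('|//', array_keys($mapping)));

    foreach ($nodes as $node) {
        $new = $mapping[$node->tagName];
        if ($new === null) {
            // h6: move the children in front of the node, then drop the node.
            while ($node->firstChild !== null) {
                $node->parentNode->insertBefore($node->firstChild, $node);
            }
            $node->parentNode->removeChild($node);
        } else {
            $newNode = $doc->createElement($new);
            while ($node->firstChild !== null) {
                $newNode->appendChild($node->firstChild);
            }
            $node->parentNode->replaceChild($newNode, $node);
        }
    }

    // Serialize only the contents of the wrapper div (the first div in the document).
    $root = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo demoteHeadings('<h1>Title</h1><p>Text</p><h6>Small</h6>'), "\n";
```

Because the node list is fetched once before any replacement happens, the order of the entries in $mapping does not matter here.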
answered 2012-10-14T15:01:01.553