php - Manipulating the text content of an HTML string without altering the HTML

Question

If I have an HTML string, something like this...

<h2>Header</h2><p>all the <span class="bright">content</span> here</p>

and I want to manipulate the string so that every word is reversed, for example...

<h2>redaeH</h2><p>lla eht <span class="bright">tnetnoc</span> ereh</p>

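I know how to extract the text from the HTML and manipulate it by passing it to a function and getting back a modified result, but how would I do that while preserving the HTML?

I would prefer a language-agnostic solution, but if it has to be language-specific, PHP or JavaScript would be the most useful.

Edit

I would also like to be able to manipulate text that spans several DOM elements...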

Quick<em>Draw</em>McGraw

warGcM<em>warD</em>kciuQ

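Another edit

Currently I am thinking of somehow replacing all the HTML nodes with unique markers while storing the originals in an array, then doing the manipulation while ignoring the markers, and then replacing the markers with the values from the array.

This approach seems overly complicated, and I don't know how to replace all the HTML without using regex, which I have learned can get you sent to Stack Overflow prison island.

Another edit

I want to clarify one point here. I want the text manipulation to happen across x number of DOM elements. For example, if my formula randomly moves the letters in the middle of a word, leaving the start and end untouched, I want to be able to do this...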

<em>going</em><i>home</i>

becomes

<em>goonh</em><i>gmie</i>

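So the HTML elements stay untouched, but the string content inside them is manipulated in whatever way the manipulation formula chooses, and as a whole: in this example, goinghome is what gets passed to the formula.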

score 1 · Accepted Answer

If you want to achieve a similar visual effect without changing the text, you can cheat with CSS, using

h2, p {
  direction: rtl;
  unicode-bidi: bidi-override;
}

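This reverses the text as displayed (the whole line runs right to left, so the word order is flipped as well), while the underlying text stays unchanged.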

Fiddle example: http://jsfiddle.net/pn6Ga/

score 1 · Accepted Answer

Hi, I ran into this situation a long time ago and used the following code. It is rough code.

<?php
function keepcase($word, $replace) {
   $replace[0] = (ctype_upper($word[0]) ? strtoupper($replace[0]) : $replace[0]);
   return $replace;
}

// regex - match the contents grouping into HTMLTAG and non-HTMLTAG chunks
$re = '%(</?\w++[^<>]*+>)                 # grab HTML open or close TAG into group 1
|                                         # or...
([^<]*+(?:(?!</?\w++[^<>]*+>)<[^<]*+)*+)  # grab non-HTMLTAG text into group 2
%x';

$contents = '<h2>Header</h2><p>the <span class="bright">content</span> here</p>';

// walk through the content, chunk by chunk, replacing words in non-HTMLTAG chunks only
$contents = preg_replace_callback($re, 'callback_func', $contents);

function callback_func($matches) { // here's the callback function
    if ($matches[1]) {             // Case 1: this is a HTMLTAG
        return $matches[1];        // return HTMLTAG unmodified
    }
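    elseif (isset($matches[2])) {  // Case 2: a non-HTMLTAG chunk.
        // reverse each word; keepcase() re-capitalizes the first letter
        // when the original word was capitalized
        // (a callback is used because the /e modifier was removed in PHP 7)
        return preg_replace_callback('/\b\w+\b/', function ($word) {
            return keepcase($word[0], strrev($word[0]));
        }, $matches[2]);
    }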
    exit("Error!");                // never get here
}
echo ($contents);
?>

score 0 · Accepted Answer

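Parse the HTML with something that gives you a DOM API.

Write a function that loops over the child nodes of an element.

If a node is a text node, get its data as a string, split it into words, reverse each word, and assign it back.

If a node is an element, recurse into your function.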
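
The steps above could be sketched in JavaScript roughly as follows. This is a minimal sketch rather than a definitive implementation: the function names `reverseWords` and `reverseTextNodes` are made up for illustration, a "word" is assumed to be any run of non-whitespace characters, and each node is assumed to expose the standard DOM properties `nodeType`, `childNodes` and `data`.

```javascript
// Node type constants as defined by the DOM standard.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Reverse each run of non-whitespace characters, leaving whitespace in place.
// (Assumption: a "word" is any run of non-whitespace characters.)
function reverseWords(text) {
  return text.replace(/\S+/g, (word) => word.split('').reverse().join(''));
}

// Loop over the children of `node`: rewrite text nodes, recurse into elements.
function reverseTextNodes(node) {
  for (const child of Array.from(node.childNodes)) {
    if (child.nodeType === TEXT_NODE) {
      child.data = reverseWords(child.data);
    } else if (child.nodeType === ELEMENT_NODE) {
      reverseTextNodes(child);
    }
  }
  return node;
}
```

In a browser, the tree could come from `new DOMParser().parseFromString(html, 'text/html').body`, and the result could be read back from its `innerHTML`. Because every text node is handled on its own, this sketch does not cover words that span several elements, such as Quick<em>Draw</em>McGraw.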

score 0 · Accepted Answer

Can you use jQuery?

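// Visit text nodes only, so that nested elements are left intact
// (calling .text(value) on a parent would replace its child elements).
$('div').find('*').addBack().contents().filter(function () {
    return this.nodeType === 3; // 3 === Node.TEXT_NODE
}).each(function () {
    // reverse each word in place, keeping whitespace where it is
    this.nodeValue = this.nodeValue.replace(/\S+/g, function (word) {
        return word.split('').reverse().join('');
    });
});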

See here - http://jsfiddle.net/GCAvb/

score 0 · Accepted Answer

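I have implemented a version that seems to work quite well, although I still use a (rather generic and shoddy) regular expression to strip the HTML tags out of the text. Here it is in commented JavaScript:

Method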

/**
* Manipulate text inside HTML according to passed function
* @param html the html string to manipulate
* @param manipulator the function to manipulate with (will be passed single word)
* @returns manipulated string including unmodified HTML
*
* Currently limited in that manipulator operates on words determined by regex
* word boundaries, and must return same length manipulated word
*
*/

var manipulate = function(html, manipulator) {

  var block, tag, words, i,
    final = '', // used to prepare return value
    tags = [], // used to store tags as they are stripped from the html string
    x = 0; // used to track the number of characters the html string is reduced by during stripping

  // remove tags from html string, and use callback to store them with their index
  // then split by word boundaries to get plain words from original html
  words = html.replace(/<.+?>/g, function(match, index) {
    tags.unshift({
      match: match,
      index: index - x
    });
    x += match.length;
    return '';
  }).split(/\b/);

  // loop through each chunk and build the final string,
  // passing word chunks through the manipulator and appending
  // boundary chunks (spaces, punctuation) unchanged
  for (i = 0; i < words.length; i++) {
    final += /^\w/.test(words[i]) ? manipulator(words[i]) : words[i];
  }

  // loop through each stored tag, and insert into final string
  for (i = 0; i < tags.length; i++) {
    final = final.slice(0, tags[i].index) + tags[i].match + final.slice(tags[i].index);
  }

  // ready to go!
  return final;

};

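The function defined above accepts an HTML string and a manipulator function that operates on the words in the string, regardless of whether those words are split up by HTML elements.

It works by first removing all the HTML tags and storing each tag together with the index it was taken from, then manipulating the text, and finally adding the tags back at their original positions in reverse order.

Test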

/**
 * Test our function with various input
 */

var reverse, rutherford, shuffle, text, titleCase;

// set our test html string
text = "<h2>Header</h2><p>all the <span class=\"bright\">content</span> here</p>\nQuick<em>Draw</em>McGraw\n<em>going</em><i>home</i>";

// function used to reverse words
reverse = function(s) {
  return s.split('').reverse().join('');
};

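// function used by rutherford to return a shuffled array (Fisher-Yates);
// sorting with a random comparator gives a biased, engine-dependent order
shuffle = function(a) {
  var i, j, tmp;
  for (i = a.length - 1; i > 0; i--) {
    j = Math.floor(Math.random() * (i + 1));
    tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
  return a;
};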

// function used to shuffle the middle of words, leaving each end undisturbed
rutherford = function(inc) {
  var m = inc.match(/^(.?)(.*?)(.)$/);
  return m[1] + shuffle(m[2].split('')).join('') + m[3];
};

// function to make word Title Cased
titleCase = function(s) {
  return s.replace(/./, function(w) {
    return w.toUpperCase();
  });
};

console.log(manipulate(text, reverse));
console.log(manipulate(text, rutherford));
console.log(manipulate(text, titleCase));

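There are still some quirks, for example the heading text and the paragraph text are not recognized as separate words (because they sit in separate block-level tags rather than inline tags), but this is basically a proof of concept for the approach I was attempting.

I would also like it to handle manipulation formulas that actually add and remove text, rather than only replacing or moving it (so that the string length varies after manipulation), but that opens a whole new can of worms that I am not ready for yet.

I have now added some comments to the code and put it up as a JavaScript gist. I hope someone will improve it, especially if someone can remove the regex part and replace it with something better!

Gist: https://gist.github.com/3309906

Demo: http://jsfiddle.net/gh/gist/underscore/1/3309906/

(outputs to the console)

Now finally using an HTML parser

(http://ejohn.org/files/htmlparser.js)

Demo: http://jsfiddle.net/EDJyU/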
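
One possible way to allow a manipulator that changes the length of a word is sketched below. It is only an illustration under stated assumptions: the name `manipulateVariable` is hypothetical, tags are still stripped with a simple regex, and a tag that sat inside a word is put back at the same offset from the start of the new word, clamped to the new word's length. Other placement policies would be equally defensible.

```javascript
// Hypothetical variant of manipulate() that tolerates length changes.
function manipulateVariable(html, manipulator) {
  const tags = []; // tags in document order, with their offset in the stripped text
  let removed = 0; // number of characters stripped so far

  const text = html.replace(/<[^>]+>/g, (match, index) => {
    tags.push({ match, index: index - removed });
    removed += match.length;
    return '';
  });

  let out = '';
  let next = 0; // index of the next tag to re-insert
  let pos = 0;  // offset of the current piece in the stripped text

  for (const piece of text.split(/\b/)) {
    const replaced = /^\w/.test(piece) ? manipulator(piece) : piece;
    let cursor = 0; // how much of `replaced` has been emitted so far

    // Re-insert every tag that sat before or inside this piece.
    while (next < tags.length && tags[next].index < pos + piece.length) {
      // Assumption: keep the tag's offset within the word, clamped to the new length.
      const local = Math.min(tags[next].index - pos, replaced.length);
      out += replaced.slice(cursor, local) + tags[next].match;
      cursor = local;
      next++;
    }
    out += replaced.slice(cursor);
    pos += piece.length;
  }

  // Tags that followed the last character of text.
  while (next < tags.length) out += tags[next++].match;
  return out;
}
```

For example, `manipulateVariable('<em>go</em><i>home</i>', function (w) { return w + w; })` returns `<em>go</em><i>homegohome</i>`: the tags keep their offsets from the start of the word, and the extra text lands after the last tag that was inside it.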

score 0 · Accepted Answer

You can use setInterval to make the change every ** interval, for example:

 
const TITTLE = document.getElementById("Tittle") //Let's get the div
   
 setInterval(()=> { 
      let TITTLE2 = document.getElementById("rotate") //we get the element at the moment of execution
      let spanTittle = document.createElement("span"); // we create the new element "span"

      spanTittle.setAttribute("id","rotate");  // attribute to new element
      (TITTLE2.textContent == "TEXT1")       // We compare which string is in the span
      ? spanTittle.appendChild(document.createTextNode(`TEXT2`)) 
      : spanTittle.appendChild(document.createTextNode(`TEXT1`))

      TITTLE.replaceChild(spanTittle,TITTLE2)   //finally, replace the old span for a new
    },2000)

<html>
<head></head>
<body>  
   <div id="Tittle">TEST YOUR <span id="rotate">TEXT1</span></div>
</body>
</html>
