
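Possible duplicate:
RegEx match open tags except XHTML self-contained tags

How can I match an alphanumeric word only where it appears outside HTML tags, rather than matching every occurrence of the word?

Example: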

<div id="mariano mariano mariano" nota="mariano/mariano">mariano was looking forward Mariano. I want to match this "Mariano" too. Mariano</div>

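In this example, I want to match every "Mariano" that is outside the tag itself (that is, not inside the id or other attributes).

I think the key is to look ahead from the word: if the next angle bracket after it is "<" (or the input ends), the word is in the text and should match; if the regex finds ">" before any "<", the word is inside a tag. However, I have not been able to write a regex that does this.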
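The lookahead idea described above can be sketched roughly as follows. This is only an illustration, not a full HTML parser: the helper names `escapeRegExp` and `findOutsideTags` are hypothetical, and the sketch assumes that attribute values never contain a literal `<` or `>` and that the text itself contains no stray `>`. It also matches the term inside longer words.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so the search term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return every occurrence of `term` that is NOT followed by a ">" before the
// next "<". Such an occurrence cannot be inside a tag, under the assumptions
// stated above.
function findOutsideTags(html, term) {
  var pattern = new RegExp(escapeRegExp(term) + "(?![^<>]*>)", "gi");
  return html.match(pattern) || [];
}

var sample = '<div id="mariano mariano mariano" nota="mariano/mariano">' +
  'mariano was looking forward Mariano. I want to match this "Mariano" too. Mariano</div>';

var found = findOutsideTags(sample, "mariano");
console.log(found.length); // 4
```

The negative lookahead `(?![^<>]*>)` rejects a match when only non-bracket characters separate it from a closing `>`, which is the "inside a tag" case. No lookbehind is needed.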

I was unable to combine this regex, `(?<=^|>)[^><]+?(?=<|$)`, with another one. The low-quality solution I ended up with is:

<!-- language: lang-js -->
var searchFor = new RegExp("((!?<=^|>)" + termino + ")","ig");
var searchFor2 = new RegExp("(" + termino + "(?=<|$))","ig");
var searchFor3 = new RegExp("(!?<=^|[\\s\\.;,])" + termino + "(?=[\\s\\.;,]|$)","ig");

But these three do not cover all cases.

Edit: I am using JavaScript:

<script>
container.find("p, span, div, .texto").each(function() {
var containerText = $(this).html();
for (var i = 0; i < terms.length; i++) {
    var termino = terms[i];
    // 1st case: ">termino" is replaced with ">Pedro"
    var searchFor = new RegExp("((!?<=^|>)" + termino + ")","ig");
    containerText = containerText.replace(searchFor,">Pedro");
    // 2nd case: "termino<" is replaced with "Pedro"
    var searchFor2 = new RegExp("(" + termino + "(?=<|$))","ig");
    containerText = containerText.replace(searchFor2,"Pedro");
    // 3rd case: "[\.\s,;:]termino[\.\s,;:]" is replaced with " Pedro"
    var searchFor3 = new RegExp("(!?<=^|[\\s\\.;,])" + termino + "(?=[\\s\\.;,]|$)","ig");
    containerText = containerText.replace(searchFor3," Pedro");
};
$(this).html(containerText);
}); 
</script>
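For the replacement step, one possible alternative to three separate patterns is a single pass that matches whole tags first and hands them back unchanged, so only occurrences in the text are replaced. This is a hedged sketch, not the asker's code: `replaceOutsideTags` and `escapeRegExp` are hypothetical names, and it assumes no attribute value contains a literal `>`. It does not handle comments or `<script>` contents specially.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace `term` with `replacement` everywhere except inside tags.
// The first alternative captures a complete tag; when it matches, the tag is
// returned untouched. Otherwise the match is the term in text and is replaced.
function replaceOutsideTags(html, term, replacement) {
  var pattern = new RegExp("(<[^>]*>)|" + escapeRegExp(term), "gi");
  return html.replace(pattern, function (match, tag) {
    return tag ? match : replacement;
  });
}

var sample = '<div id="mariano mariano mariano" nota="mariano/mariano">' +
  'mariano was looking forward Mariano. I want to match this "Mariano" too. Mariano</div>';

var replaced = replaceOutsideTags(sample, "mariano", "Pedro");
console.log(replaced);
```

In the loop above, this would stand in for the three `replace` calls, for example `containerText = replaceOutsideTags(containerText, termino, "Pedro");`.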

1 Answer


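A few things:

  1. Welcome to Stack Overflow!
  2. Please search for existing questions before asking. There are many results about parsing XML with regular expressions.
  3. Don't use regular expressions to parse XML/HTML! Try XPath: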

    var termino = "Mariano"; // placeholder value; define it however you were before
    
    // Select every text node that is a direct child of a div and contains termino.
    // Note: contains() is case-sensitive, and termino must not contain a double quote.
    var iterator = document.evaluate('//div/text()[contains(., "' + termino + '")]', document, null, XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null);
    
    try {
      // init thisNode to the first item in the iterator
      var thisNode = iterator.iterateNext();
    
      // go through all items and alert their content (which should contain termino)
      while (thisNode) {
        alert(thisNode.textContent);
        thisNode = iterator.iterateNext();
      }
    }
    catch (e) {
      // iterateNext() throws if the document is modified during iteration
      console.error('Error: Document tree modified during iteration ' + e);
    }
    
answered 2012-09-19T22:00:47.390