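I found plenty of related links, but nothing that covers what I'm after. I want a regular expression that matches tags by negation, covering both opening and closing tags. Take this string as an example: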

<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>

I'm using a regex to match <em> and <b> while leaving <p> and <span> alone. I do this with the following regex:

<(?!p|span)[^>]*>

The problem is that the above also matches </p> and </span>. I want to leave those closing tags alone too. I tried:

<(/)?(?!p|span)[^>]*>

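and various combinations of it, but nothing I tried worked. I'm hoping someone can help. How can I set up the regex to match these without resorting to something like <(?!p|span)[^>]*>(.*?)</(?!p|span)[^>]*>, which looks nasty and probably takes more resources?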

1 Answer

Try this:

(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)  

Explanation:

<!--
(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)

Options: case insensitive; ^ and $ match at line breaks

Match the regular expression below «(?:<(em|b)[^<>]*?>)»
   Match the character “<” literally «<»
   Match the regular expression below and capture its match into backreference number 1 «(em|b)»
      Match either the regular expression below (attempting the next alternative only if this one fails) «em»
         Match the characters “em” literally «em»
      Or match regular expression number 2 below (the entire group fails if this one fails to match) «b»
         Match the character “b” literally «b»
   Match a single character NOT present in the list “<>” «[^<>]*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
   Match the character “>” literally «>»
Match the regular expression below and capture its match into backreference number 2 «([^<>]+)»
   Match a single character NOT present in the list “<>” «[^<>]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=</\1>)»
   Match the characters “</” literally «</»
   Match the same text as most recently matched by capturing group number 1 «\1»
   Match the character “>” literally «>»
-->

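This pattern matches a whole element as an open/close pair: the opening tag and its text content, with the matching closing tag required by the lookahead but not included in the match.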
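To make that concrete, here is a minimal sketch in Python (my choice of language; the question does not name one) that runs the pattern against the sample string from the question. The flags are an assumption meant to mirror the options listed in the explanation above.

```python
import re

# Pattern from the answer. IGNORECASE and MULTILINE are assumed here to
# mirror "case insensitive; ^ and $ match at line breaks".
PAIR = re.compile(r"(?:<(em|b)[^<>]*?>)([^<>]+)(?=</\1>)",
                  re.IGNORECASE | re.MULTILINE)

html = "<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>"

# Group 1 is the tag name, group 2 is the text between the tags.
pairs = [(m.group(1), m.group(2)) for m in PAIR.finditer(html)]
print(pairs)  # [('em', 'is'), ('b', 'sentence')]

# The closing tag is only asserted by the lookahead, so it is not part
# of the matched text.
first = PAIR.search(html).group(0)
print(first)  # <em>is
```

Since the content group is [^<>]+, an element that contains another tag (or is empty) will not match with this pattern.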

But if you only want to remove the tags, you can use:

</?(em|b)[^<>]*?>
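A rough sketch of the removal in Python (again my choice of language), followed by a negated variant closer to what the question asked for. The variant is my own assumption, not something from the original post. The attempt in the question fails because the optional (/)? can match nothing, after which the lookahead sees /p, which does not start with p; putting the optional slash inside the lookahead avoids that.

```python
import re

html = "<p>This <em>is</em> <span>a</span> <b>sentence</b>.</p>"

# Pattern from the answer: opening or closing <em> / <b> tags.
STRIP = re.compile(r"</?(em|b)[^<>]*?>", re.IGNORECASE)
stripped = STRIP.sub("", html)
print(stripped)  # <p>This is <span>a</span> sentence.</p>

# Assumed variant (not from the post): match every tag EXCEPT p and span.
# The optional slash sits inside the negative lookahead, and \b stops
# names such as <pre> from being mistaken for <p>.
NOT_P_SPAN = re.compile(r"<(?!/?(?:p|span)\b)[^<>]*>", re.IGNORECASE)
tags = NOT_P_SPAN.findall(html)
print(tags)  # ['<em>', '</em>', '<b>', '</b>']
```

One caveat: as written, </?(em|b)[^<>]*?> also matches any tag whose name merely starts with em or b, such as <br>. Adding \b right after the group is one way to tighten it if that matters for your input.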

answered 2012-05-05T11:18:22.573