Description
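I think the way to solve this is to match all the bad things along with all the good things, and to put only the good things in a capture group. Later, in the programming logic, I would test each match to see whether capture group 1 is populated; if it is, match.index shows where in the string the match occurred.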
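This regex will:
- match plain-text URLs, which keeps them from being captured in group 1
- match whole HTML tags, which keeps their contents from being captured in group 1
- match the desired / delimited text, like /match me/, into capture group 1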
https?:\/\/[^\s]*|<\/?\w+\b(?=\s|>)(?:='[^']*'|="[^"]*"|=[^'"][^\s>]*|[^>])*>|(\/[^(\/|\<|\>)]*[^\/]*\/)
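The "test whether group 1 is populated" step described above can be sketched roughly as follows. This is only an illustration in JavaScript, assuming an environment that supports String.prototype.matchAll; the helper name findSlashed and the short sample string are made up for the example and are not part of the original answer.

```javascript
// The pattern from the answer, with the global flag so every match is visited.
const pattern = /https?:\/\/[^\s]*|<\/?\w+\b(?=\s|>)(?:='[^']*'|="[^"]*"|=[^'"][^\s>]*|[^>])*>|(\/[^(\/|\<|\>)]*[^\/]*\/)/g;

// findSlashed is a hypothetical helper name, not part of any library.
function findSlashed(text) {
  const hits = [];
  for (const match of text.matchAll(pattern)) {
    // URLs and tags also match, but they leave group 1 undefined, so skip them.
    if (match[1] !== undefined) {
      hits.push({ text: match[1], index: match.index });
    }
  }
  return hits;
}

// Assumed sample input: one wanted match, one pair of tags, one URL.
const sample = "Use /this/ but not <b>that</b> or http://a.b/c/ ok";
console.log(findSlashed(sample)); // [ { text: '/this/', index: 4 } ]
```

The URL and the tags are still matched by the regex; they are simply discarded afterwards because they never populate group 1.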
Example
Sample text
I am attempting to replace values in /slashes/ with italic tags. One problem is HTML: If I do <b>html</b> <b>tags</b> tags, it picks up the closures. Also, the mark up allows URLs to be placed into [http://www.google.com/s] square bracket tags, messing things up further. Now the tags are off balanced. What /do/ I do? I'd ideally like to have it skip searching [] and inside <> tags. Doing <b>/italic/</b> should be legal, however.
Matches
[0] => Array
(
[0] => /slashes/
[1] => <b>
[2] => </b>
[3] => <b>
[4] => </b>
[5] => http://www.google.com/s]
[6] => /do/
[7] => <b>
[8] => /italic/
[9] => </b>
)
[1] => Array
(
[0] => /slashes/
[1] =>
[2] =>
[3] =>
[4] =>
[5] =>
[6] => /do/
[7] =>
[8] => /italic/
[9] =>
)
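If the end goal is the one mentioned in the sample text, wrapping the slash-delimited text in italic tags, one possible follow-up is a replace with a callback that leaves URLs and tags untouched. This is a sketch only: the function name italicize and the choice of <i> tags are assumptions, not something the answer specifies.

```javascript
// Same pattern as in the answer; the g flag makes replace visit every match.
const slashOrSkip = /https?:\/\/[^\s]*|<\/?\w+\b(?=\s|>)(?:='[^']*'|="[^"]*"|=[^'"][^\s>]*|[^>])*>|(\/[^(\/|\<|\>)]*[^\/]*\/)/g;

// italicize is a hypothetical helper; <i> is an assumed choice of italic tag.
function italicize(text) {
  return text.replace(slashOrSkip, (whole, slashed) => {
    // Group 1 is undefined for URLs and HTML tags: return them unchanged.
    if (slashed === undefined) {
      return whole;
    }
    // Drop the leading and trailing slash and wrap the rest.
    return "<i>" + slashed.slice(1, -1) + "</i>";
  });
}

console.log(italicize("Doing <b>/italic/</b> at http://x.y/z/ is /ok/"));
// Doing <b><i>italic</i></b> at http://x.y/z/ is <i>ok</i>
```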