regex - Difference in regular expressions between word-boundary end and edge

Question

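The symbols \< and \> match the empty string at the beginning and end of a word, respectively. The symbol \b matches the empty string at the edge of a word.

What is the difference between the end and the edge (of a word)?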

score 3 · Accepted Answer

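The difference between \b and \</\> is that \b can also be used with the PCRE regex flavor (when you specify perl=TRUE) and the ICU regex flavor (the stringr package), whereas \< and \> are only recognized by R's default (TRE) regex engine.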

> s = "no where nowhere"
> sub("\\<no\\>", "", s)
[1] " where nowhere"
> sub("\\<no\\>", "", s, perl=TRUE) ## \> and \< do not work with PCRE
[1] "no where nowhere"
> sub("\\bno\\b", "", s, perl=TRUE) ## \b works with PCRE
[1] " where nowhere"

> library(stringr)
> str_replace(s, "\\bno\\b", "")
[1] " where nowhere"
> str_replace(s, "\\<no\\>", "")
[1] "no where nowhere"

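The advantage of \< (which always stands for the beginning of a word) and \> (which always matches the end of a word) is that they are unambiguous. \b may match at either position.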
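To make the "either position" point concrete, here is a minimal sketch. The test words are hypothetical and chosen only for illustration: "nowhere" has "no" at the start of a word, "piano" has it at the end.

```r
# Hypothetical test words chosen for illustration.
words <- c("no", "nowhere", "piano")

# Default (TRE) engine: \< anchors only at a word start, \> only at a word end.
starts_tre <- grepl("\\<no", words)   # TRUE  TRUE FALSE
ends_tre   <- grepl("no\\>", words)   # TRUE FALSE  TRUE

# PCRE engine: the same \b token plays either role, depending on where it sits.
starts_pcre <- grepl("\\bno", words, perl = TRUE)   # TRUE  TRUE FALSE
ends_pcre   <- grepl("no\\b", words, perl = TRUE)   # TRUE FALSE  TRUE

print(rbind(starts_tre, ends_tre, starts_pcre, ends_pcre))
```

With \b, the reader has to infer from the surrounding pattern which edge is meant. With \< and \>, the token itself says so.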

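One more thing to consider (reference):

The POSIX 1003.2 mode of gsub and gregexpr does not work correctly with repeated word boundaries (e.g., pattern = "\b"). Use perl = TRUE for such matches (but that may not work as expected with non-ASCII inputs, as the meaning of 'word' is system-dependent).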
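As a rough illustration of that caveat, the sketch below marks every word boundary in a short ASCII string. The sample string and the "|" marker are assumptions made for the example. Only the perl = TRUE result is annotated, since the default-engine result is exactly what the note above warns about.

```r
s <- "no where"

# PCRE engine: one marker is inserted at each word boundary.
marked_pcre <- gsub("\\b", "|", s, perl = TRUE)
print(marked_pcre)   # [1] "|no| |where|"

# Default engine: runs without error, but the result should not be relied on
# for a bare, repeated word-boundary pattern like this one.
marked_default <- gsub("\\b", "|", s)
print(marked_default)
```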
