r - Using captured groups in str_replace / stri_replace - stringi vs stringr

Question

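Most stringr functions are just wrappers around the corresponding stringi functions, and str_replace_all is one of them. Yet my code does not work with stri_replace_all, the corresponding stringi function.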

I am writing a quick regex to convert (a subset of) camel case into space-separated words.

I'm puzzled as to why this works:

str <- "thisIsCamelCase aintIt"
stringr::str_replace_all(str, 
                         pattern="(?<=[a-z])([A-Z])", 
                         replacement=" \\1")
# "this Is Camel Case aint It"

and this does not:

stri_replace_all(str, 
                 regex="(?<=[a-z])([A-Z])", 
                 replacement=" \\1")
# "this 1s 1amel 1ase aint 1t"

score 9 · Accepted Answer

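If you look at the source of stringr::str_replace_all, you'll see that it calls fix_replacement(replacement) to convert \\# capture group references into $#. The help for stringi::stri_replace_all also makes it clear that you use $1, $2, etc. to refer to capture groups.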

str <- "thisIsCamelCase aintIt"
stri_replace_all(str, regex="(?<=[a-z])([A-Z])", replacement=" $1")
## [1] "this Is Camel Case aint It"
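
To illustrate the kind of conversion described above, here is a minimal sketch in base R. The helper name to_icu_replacement is hypothetical and this is not stringr's actual fix_replacement code; it only rewrites simple single-digit \\1-style references and makes no attempt to handle escaped backslashes or literal dollar signs in the replacement.

```r
library(stringi)

# Hypothetical helper (an assumption, not stringr's implementation):
# rewrite \1-style backreferences as the $1 style that stringi expects.
to_icu_replacement <- function(replacement) {
  # Regex \\(\d): a literal backslash followed by one digit.
  # The replacement emits a literal "$" followed by that digit.
  gsub("\\\\(\\d)", "$\\1", replacement, perl = TRUE)
}

str <- "thisIsCamelCase aintIt"
out <- stri_replace_all_regex(str,
                              "(?<=[a-z])([A-Z])",
                              to_icu_replacement(" \\1"))
print(out)
# "this Is Camel Case aint It"
```

With a wrapper like this, the same replacement string could be passed to either package, although writing $1 directly is simpler when calling stringi.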

score 1 · Accepted Answer

The following option should return the same output in both cases.

pat <- "(?<=[a-z])(?=[A-Z])"
str_replace_all(str, pat, " ")
#[1] "this Is Camel Case aint It"
stri_replace_all(str, regex=pat, " ")
#[1] "this Is Camel Case aint It"

According to the help page ?stri_replace_all, the examples there use $1, $2 in the replacement:

stri_replace_all_regex('123|456|789', '(\\p{N}).(\\p{N})', '$2-$1')
#[1] "3-1|6-4|9-7"

So, if we replace \\1 with $1:

stri_replace_all(str, regex = "(?<=[a-z])([A-Z])", " $1")
#[1] "this Is Camel Case aint It"
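
As a quick sanity check, the capture-group form and the lookaround-only form can be compared on a few inputs. This is only a sketch: the inputs vector and the variable names are made up for illustration, and it assumes the stringi package is installed.

```r
library(stringi)

# Illustrative inputs (assumed, not from the question apart from the first one)
inputs <- c("thisIsCamelCase aintIt", "alreadyspaced words", "xY")

# Capture the uppercase letter and re-insert it after a space
with_group <- stri_replace_all_regex(inputs, "(?<=[a-z])([A-Z])", " $1")

# Match the zero-width position between a lowercase and an uppercase letter
with_lookaround <- stri_replace_all_regex(inputs, "(?<=[a-z])(?=[A-Z])", " ")

# Stops with an error if the two approaches ever disagree
stopifnot(identical(with_group, with_lookaround))
print(with_group)
```

The lookaround version avoids backreferences entirely, which is why it behaves the same in stringr and stringi.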
