I am using the following regular expression in R
regexp <- "(^|[^([:alnum:]|.|_)])abc@abc.de($|[^[:alnum:]])"
to find the email address abc@abc.de in a given text and replace it with an anonymous mail address.
tmp <- c("aaaaabc@abc.debbbb",      ## <- should not be matched
         "aaaa abc@abc.de bbbb",    ## <- should be matched
         "abc@abc.de",              ## <- should be matched
         "aaa.abc@abc.de",          ## <- should not be matched
         "aaaa_abc@abc.de",         ## <- should not be matched
         "(abc@abc.de)",            ## <- should be matched
         "aaaa (abc@abc.de) bbbb")  ## <- should be matched
replacement <- paste("\\1", "anonym@anonym.de", "\\2", sep="")
gsub(regexp, replacement, tmp, ignore.case=TRUE)
As a result I get:
> gsub(regexp, replacement, tmp, ignore.case=TRUE)
[1] "aaaaabc@abc.debbbb" "aaaa anonym@anonym.de bbbb"
[3] "anonym@anonym.de" "aaa.abc@abc.de"
[5] "aaaa_abc@abc.de" "(abc@abc.de)"
[7] "aaaa (abc@abc.de) bbbb"
I don't understand why the last two elements of the vector are not matched.
Thank you and kind regards.
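
One likely explanation, offered as a sketch rather than a definitive answer: inside a bracket expression, the characters (, |, . and ) are not grouping or alternation operators but ordinary literals, so [^([:alnum:]|.|_)] also excludes parentheses and the pipe. A ( directly in front of the address can then never satisfy the first group, which would account for elements 6 and 7. The pattern below lists only the classes that should be excluded, [^[:alnum:]._], and additionally escapes the dot in abc.de so that it matches a literal dot. The name regexp_fixed is introduced here for illustration only, and the default regex engine (no perl = TRUE) is assumed.

```r
# Test strings from the question.
tmp <- c("aaaaabc@abc.debbbb",      # should not be matched
         "aaaa abc@abc.de bbbb",    # should be matched
         "abc@abc.de",              # should be matched
         "aaa.abc@abc.de",          # should not be matched
         "aaaa_abc@abc.de",         # should not be matched
         "(abc@abc.de)",            # should be matched
         "aaaa (abc@abc.de) bbbb")  # should be matched

# Hypothetical corrected pattern: the negated set contains only
# alphanumerics, "." and "_", so "(" is allowed before the address.
regexp_fixed <- "(^|[^[:alnum:]._])abc@abc\\.de($|[^[:alnum:]])"

# \\1 and \\2 put back the delimiter characters captured around the address.
replacement <- paste("\\1", "anonym@anonym.de", "\\2", sep = "")

result <- gsub(regexp_fixed, replacement, tmp, ignore.case = TRUE)
print(result)
```

Because both delimiter characters are captured and written back through \\1 and \\2, the surrounding parentheses and spaces are kept in the result.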