In a string

string="aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"

I would like to remove all parts that occur between START and STOP, yielding

"aaaaaaaaacccccccceeeeeee"

If I try gsub("START(.*)STOP","",string), I get "aaaaaaaaaeeeeeee" instead, because the match runs from the first START to the last STOP.

What would be the correct way to do this, allowing for multiple occurrences of START and STOP?

2 Answers

Add a ? in there as well. It makes the .* quantifier lazy, so each match stops at the nearest STOP:

gsub("START.*?STOP", "", string)
# [1] "aaaaaaaaacccccccceeeeeee"
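
To make the difference concrete, here is a small sketch that runs the greedy and the lazy pattern side by side on the question's string. The remove_between helper is hypothetical (it is not part of the answer) and assumes the markers are literal strings that do not themselves contain \E.

```r
string <- "aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"

# Greedy: .* extends to the last STOP, so the middle "cccccccc" is removed too
greedy <- gsub("START.*STOP", "", string)

# Lazy: .*? stops at the first STOP after each START
lazy <- gsub("START.*?STOP", "", string)

# Hypothetical helper for arbitrary literal markers
remove_between <- function(x, open, close) {
  # \\Q...\\E quotes the markers literally; this needs perl = TRUE
  pattern <- paste0("\\Q", open, "\\E.*?\\Q", close, "\\E")
  gsub(pattern, "", x, perl = TRUE)
}

remove_between(string, "START", "STOP")
```

Quoting the markers matters once they contain regex metacharacters such as [ or (.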
answered 2014-03-02T16:35:02.433

Not as elegant as Ananda's answer, but here is another approach using the stringr and plyr packages.

library(stringr)
library(plyr)

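# Start of every START marker and end of every STOP marker
start <- ldply(str_locate_all(string, 'START'))[, 1]
end <- ldply(str_locate_all(string, 'STOP'))[, 2]
# Pair the markers in order (assumes they alternate) and extract each span
spans <- str_sub(string, start, end)
# Remove each START...STOP span as a literal string
result <- string
for (span in spans) result <- str_replace(result, fixed(span), '')
result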
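
Since stringr is already loaded, the lazy pattern can also be applied directly with str_replace_all, which replaces every match rather than only the first. A minimal sketch, reusing the string from the question (the name cleaned is just for illustration):

```r
library(stringr)

string <- "aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"

# .*? is lazy, so each match ends at the nearest STOP
cleaned <- str_replace_all(string, "START.*?STOP", "")
cleaned
# [1] "aaaaaaaaacccccccceeeeeee"
```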
answered 2014-03-03T02:23:24.137