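A suggestion left in a comment on my answer to this question should give the desired result with strsplit, but it does not, even though the regex appears to match the first and last commas in the character vector correctly. This can be demonstrated with gregexpr and regmatches.
So why does strsplit split on every comma in this example, when regmatches returns only two matches for the same regex?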
# We would like to split on the first comma and
# the last comma (positions 4 and 13 in this string)
x <- "123,34,56,78,90"
# Splits on every comma. Must be wrong.
strsplit( x , '^\\w+\\K,|,(?=\\w+$)' , perl = TRUE )[[1]]
#[1] "123" "34" "56" "78" "90"
# Ok. Let's check the positions of matches for this regex
m <- gregexpr( '^\\w+\\K,|,(?=\\w+$)' , x , perl = TRUE )
# Matching positions are at
unlist(m)
#[1]  4 13
# And extracting them...
regmatches( x , m )
#[[1]]
#[1] "," ","
Huh?! What is going on?
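For reference, here is a minimal sketch of the result I am after, built directly from the match positions instead of strsplit. It assumes gregexpr reports positions 4 and 13 for this string; the names pos, starts, ends, and parts are just illustrative.

```r
x <- "123,34,56,78,90"

# Same regex as before: the first comma and the last comma
m <- gregexpr('^\\w+\\K,|,(?=\\w+$)', x, perl = TRUE)

# Plain integer vector of match start positions (attributes dropped)
pos <- as.integer(m[[1]])

# Each piece starts just after a delimiter (or at 1)
# and ends just before the next delimiter (or at the end of the string)
starts <- c(1, pos + 1)
ends   <- c(pos - 1, nchar(x))

parts <- substring(x, starts, ends)
parts
```

If the positions are indeed 4 and 13, parts holds three pieces: "123", "34,56,78", and "90". That is what I expected strsplit to return for the same regex.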