10

How can I efficiently split the following string on the first comma using base R?

x <- "I want to split here, though I don't want to split elsewhere, even here."
strsplit(x, ???)

Desired result (2 strings):

[[1]]
[1] "I want to split here"   "though I don't want to split elsewhere, even here."

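Thank you in advance.

EDIT: I didn't think to mention this. The solution needs to generalize to a column, i.e. a vector of strings such as: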

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

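The result can be two columns, one long vector (I can take every other element), or a list in which each index ([[n]]) holds two strings.

Apologies for the lack of clarity.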

4

5 Answers

13

Here's what I'd probably do. It may look hacky, but since sub() and strsplit() are both vectorized, it also works smoothly when handed multiple strings.

XX <- "SoMeThInGrIdIcUlOuS"
strsplit(sub(",\\s*", XX, x), XX)
# [[1]]
# [1] "I want to split here"                               
# [2] "though I don't want to split elsewhere, even here."
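
As a rough illustration of the vectorization point, here is a minimal sketch that applies the same placeholder trick to the vector y from the question. The names marker, parts and res are only illustrative, and the sketch assumes the placeholder never occurs in the data and that every element contains at least one comma.

```r
# Sketch: the placeholder trick applied to a character vector (base R only).
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# `marker` is an arbitrary placeholder, assumed not to occur in the data.
marker <- "SoMeThInGrIdIcUlOuS"

# sub() replaces only the first match in each element; strsplit() then
# splits on the placeholder, treated as a literal string (fixed = TRUE).
parts <- strsplit(sub(",\\s*", marker, y), marker, fixed = TRUE)

# Bind the two-element pieces into a two-column character matrix.
# Assumption: every element of y contains at least one comma.
res <- do.call(rbind, parts)
print(res)
```

Each row of res corresponds to one input string, with the text before the first comma in column 1 and the remainder in column 2.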
answered 2012-04-25T04:23:39.543
9

With the stringr package:

library(stringr)
str_split_fixed(x, pattern = ', ', n = 2)
#      [,1]                  
# [1,] "I want to split here"
#      [,2]                                                
# [1,] "though I don't want to split elsewhere, even here."

(That's a matrix with one row and two columns.)

answered 2012-04-25T04:40:30.797
4

Here's another solution, using a regular expression to capture what comes before and after the first comma.

x <- "I want to split here, though I don't want to split elsewhere, even here."
library(stringr)
str_match(x, "^(.*?),\\s*(.*)")[,-1] 
# [1] "I want to split here"                              
# [2] "though I don't want to split elsewhere, even here."
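
Since the question asks for base R, the same capture-group idea can probably be expressed with regexec() and regmatches(). This is only a sketch: the names m and caps are illustrative, perl = TRUE is used so the lazy quantifier is handled by PCRE, and every element is assumed to contain a comma.

```r
# Sketch: capture the text before and after the first comma with base R.
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# regexec() records the full match plus each capture group;
# regmatches() extracts them as one character vector per input string.
m <- regmatches(y, regexec("^(.*?),\\s*(.*)$", y, perl = TRUE))

# Drop the full match (first element) and keep the two groups,
# giving one row per input string.
# Assumption: every element of y matched, i.e. contains a comma.
caps <- t(sapply(m, function(v) v[-1]))
print(caps)
```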
answered 2012-04-25T13:52:02.503
3

library(stringr)

str_sub(x,end = min(str_locate(string=x, ',')-1))

This will get you the first bit that you want. Change the start= and end= arguments of str_sub to get whatever you want.

Such as:

str_sub(x,start = min(str_locate(string=x, ',')+1 ))

And wrap it in str_trim to get rid of the leading space:

str_trim(str_sub(x,start = min(str_locate(string=x, ',')+1 )))
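
The locate-then-substring idea can also be sketched in base R, which additionally handles a whole vector at once. The names pos, before, after and res below are illustrative, and the sketch assumes each element contains at least one comma.

```r
# Sketch: locate the first comma, then cut on either side of it (base R only).
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# regexpr() returns the position of the first match in each element.
pos <- regexpr(",", y, fixed = TRUE)

# Everything before the first comma.
before <- substr(y, 1, pos - 1)

# Everything after the first comma, with surrounding whitespace trimmed.
after <- trimws(substring(y, pos + 1))

# Assumption: every element of y contains a comma.
res <- cbind(before, after)
print(res)
```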

answered 2012-04-25T04:06:36.353
2

This works, but I like Josh O'Brien's better:

y <- strsplit(x, ",")
# Keep the first piece as x and re-join the remaining pieces as z
sapply(y, function(x) data.frame(x = x[1],
    z = paste(x[-1], collapse = ",")), simplify = FALSE)

Inspired by chase's answer.

Many people gave non-base approaches, so I figured I'd add the one I usually use (though in this case I needed a base answer):

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")
library(reshape2)
colsplit(y, ",", c("x","z"))
answered 2012-04-25T04:30:02.870