regex - Split on the first comma in a string

Question

How can I efficiently split the following string on the first comma using base R?

x <- "I want to split here, though I don't want to split elsewhere, even here."
strsplit(x, ???)

Desired result (2 strings):

[[1]]
[1] "I want to split here"   "though I don't want to split elsewhere, even here."

Thank you in advance.

EDIT: I didn't think to mention this. It needs to generalize to a column, i.e. a vector of strings like this:

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

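The result can be two columns, one long vector (I can take every other element), or a list with two strings at each index ([[n]]).

Apologies for the lack of clarity.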

score 13 · Accepted Answer

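Here's what I'd probably do. It may look hacky, but since sub() and strsplit() are both vectorized, it also works smoothly when handed multiple strings. sub() replaces only the first match, so the first comma (plus any whitespace after it) becomes a placeholder that strsplit() then splits on.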

XX <- "SoMeThInGrIdIcUlOuS"
strsplit(sub(",\\s*", XX, x), XX)
# [[1]]
# [1] "I want to split here"                               
# [2] "though I don't want to split elsewhere, even here."
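
To illustrate the vectorization claim, here is a minimal sketch applying the same placeholder trick to the vector y from the question. The name sep and the use of fixed = TRUE and do.call(rbind, ...) are my own choices, not part of the answer, and the sketch assumes every string contains at least one comma and never contains the placeholder itself.

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Placeholder that is assumed never to occur in the data
sep <- "SoMeThInGrIdIcUlOuS"

# sub() swaps only the first ", " for the placeholder;
# fixed = TRUE makes strsplit() treat the placeholder literally
parts <- strsplit(sub(",\\s*", sep, y), sep, fixed = TRUE)

# Bind the pairs into a two-column character matrix
# (assumes each element of parts has exactly two pieces)
res <- do.call(rbind, parts)
```

A string without a comma would yield a single piece, so the rbind step should be checked before relying on it for messy data.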

score 9 · Answer

From the stringr package:

str_split_fixed(x, pattern = ', ', n = 2)
#      [,1]                  
# [1,] "I want to split here"
#      [,2]                                                
# [1,] "though I don't want to split elsewhere, even here."

(This is a matrix with one row and two columns.)

score 4 · Answer

Here is another solution, using a regular expression to capture what comes before and after the first comma.

x <- "I want to split here, though I don't want to split elsewhere, even here."
library(stringr)
str_match(x, "^(.*?),\\s*(.*)")[,-1] 
# [1] "I want to split here"                              
# [2] "though I don't want to split elsewhere, even here."
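
Since the question asked for base R, the same capture-group idea can be sketched without stringr using regexec() and regmatches(). This is only an illustration: the pattern swaps the lazy .*? for [^,]*, the names m and parts are mine, and the vapply() step assumes every string contains a comma (it raises an error otherwise).

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# [^,]* cannot cross a comma, so group 1 stops at the first comma;
# group 2 takes everything after the comma and any following whitespace
m <- regmatches(y, regexec("^([^,]*),\\s*(.*)$", y))

# Each element of m holds: full match, group 1, group 2.
# Drop the full match and arrange the groups as one row per input string.
parts <- t(vapply(m, function(v) v[-1], character(2)))
```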

score 3 · Answer

library(stringr)

str_sub(x,end = min(str_locate(string=x, ',')-1))

This gets you the first bit you want. Change start= and end= in str_sub to get whatever part you want.

For example:

str_sub(x,start = min(str_locate(string=x, ',')+1 ))

And wrap it in str_trim to get rid of the leading space:

str_trim(str_sub(x,start = min(str_locate(string=x, ',')+1 )))
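
Note that min() collapses the located positions to a single number, so the calls above are suited to one string at a time. A vectorized version of the same locate-then-substring idea can be sketched in base R as follows; the variable names are mine, and the sketch assumes each string contains a comma (regexpr() returns -1 when it does not).

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Position of the first comma in each string (one integer per element)
pos <- as.integer(regexpr(",", y, fixed = TRUE))

# Everything before the first comma
before <- substr(y, 1, pos - 1)

# Everything after it, with the leading space trimmed
after <- trimws(substring(y, pos + 1))
```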

score 2 · Answer

This works, but I like Josh O'Brien's better:

y <- strsplit(x, ",")
# Keep the first piece as x and glue the remaining pieces back together as z
sapply(y, function(x) data.frame(x = x[1],
    z = paste(x[-1], collapse = ",")), simplify = FALSE)

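Inspired by chase's response.

A number of people gave non-base approaches, so I figured I'd add the one I usually use (though in this case I needed a base response):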

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")
library(reshape2)
colsplit(y, ",", c("x","z"))
