r - Concatenate strings from different rows in R

Question

I have an R data frame that looks like this:

data.1       data.character
a            **str1**,str2,str2,str3,str4,str5,str6
b            str3,str4,str5
c            **str1**,str6

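I am currently using grepl to check whether the column data.character contains my search string "<str>". If it does, I want to concatenate the data.1 values of all matching rows into a single delimited string.

For example, if I use grepl(str1,data.character), it matches two rows of df$data.1, and I want the output

a,c (the rows whose data.character contains str1)

I am currently using two for loops, but I know that is not an efficient approach. Can anyone suggest a more elegant and less time-consuming way?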

score 2 · Accepted Answer

You're nearly there (and now for my long-winded answer).

# Data
df <- read.table(text="data.1       data.character
       a            **str1**,str2,str2,str3,str4,str5,str6
       b            str3,str4,str5
       c            **str1**,str6", header = TRUE, stringsAsFactors = FALSE)

Match the string

# In your question you used grepl, which produces a logical vector (TRUE if
# the string is present)

grepl("str1" , df$data.character)
#[1]  TRUE FALSE  TRUE

# In my comment I used grep, which produces a positional index into the vector
# where the string is present (this was due to me not reading your grepl
# properly rather than because of any property)

grep("str1" , df$data.character)
# [1] 1 3
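
As a small sketch of how the two relate: for subsetting they select the same elements, but they behave differently under negation when nothing matches. The vector `x` and the pattern "zzz" below are made up for illustration and are not from the question.

```r
# Illustrative vector (an assumption, modelled on the question's data)
x <- c("**str1**,str2", "str3,str4", "**str1**,str6")

hit_lgl <- grepl("str1", x)   # logical vector, one entry per element
hit_idx <- grep("str1", x)    # integer positions of the matches

# which() turns the logical vector into the same positions grep() returns
same <- identical(which(hit_lgl), hit_idx)

# With a pattern that matches nothing, negation behaves differently:
none_lgl <- grepl("zzz", x)   # all FALSE
none_idx <- grep("zzz", x)    # integer(0)

keep_lgl <- x[!none_lgl]      # keeps all three elements
keep_idx <- x[-none_idx]      # keeps nothing, because -integer(0) is integer(0)
```

For plain selection, as in this answer, either one works. If you ever need to drop the matching rows instead, the logical form from grepl is the safer choice.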

Then subset the vector you want at the positions produced by grep (or grepl):

(s <- df$data.1[grepl("str1" , df$data.character)])
# [1] "a" "c"  first and third elements are selected

Paste these into the required format (the collapse argument defines the separator between elements):

paste(s,collapse=",")
# [1] "a,c"
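
A quick sketch of why collapse is the argument that matters here, rather than sep. It only uses base paste; the variable names are mine.

```r
s <- c("a", "c")

# collapse joins the elements of one vector into a single string
joined <- paste(s, collapse = ",")        # "a,c"

# sep only separates the different arguments passed to paste, so with a
# single vector it has no visible effect and the result is still length 2
not_joined <- paste(s, sep = ",")         # "a" "c"

# If nothing matched, collapsing an empty vector gives an empty string
empty <- paste(character(0), collapse = ",")   # ""
```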

So, more concisely:

paste(df$data.1[grep("str1" , df$data.character)],collapse=",")
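
One caveat worth a sketch: grep and grepl treat the pattern as a regular expression and match substrings, so "str1" would also hit a value such as "str10". If you need whole comma-separated tokens, something along these lines may help. The helper name `rows_with_token` and the data frame `df2` are hypothetical and not part of the question; `df2` also leaves out the asterisks from the question's data so that exact comparison works.

```r
# Hypothetical data: "str10" in row b contains "str1" as a substring
df2 <- data.frame(
  data.1 = c("a", "b", "c"),
  data.character = c("str1,str2", "str10,str3", "str1,str6"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split each row on commas and test for an exact token
rows_with_token <- function(df, token, sep = ",") {
  tokens <- strsplit(df$data.character, ",", fixed = TRUE)
  hit <- vapply(tokens, function(tk) token %in% tk, logical(1))
  paste(df$data.1[hit], collapse = sep)
}

substring_result <- paste(df2$data.1[grepl("str1", df2$data.character)],
                          collapse = ",")   # "a,b,c"
token_result <- rows_with_token(df2, "str1")  # "a,c"
```

Whether substring or whole-token matching is right depends on your data. For the example in the question, the one-liner above is enough.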
