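I have a vector of strings, each of which is a comma-separated list of ids. I want to split each string into a vector of ids and store both the number of ids and the ids themselves as two new columns in the data frame. Here is an example: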
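# one comma-separated string of ids per row (some rows are empty)
df = data.frame(ids = c("a,b,c", "d", "e", "", "f,g", "", "h", "i", ""), stringsAsFactors=FALSE)
# split each string on commas; the result is a list of character vectors
ids = sapply(df$ids, function (s) unlist(strsplit(as.character(s), ",")))
# number of ids in each row
df$num.ids = sapply(ids, length)
# the ids themselves, one character vector per row
df$ids.vec = sapply(ids, unlist)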
So far this looks fine:
> df
    ids num.ids ids.vec
1 a,b,c       3 a, b, c
2     d       1       d
3     e       1       e
4             0
5   f,g       2    f, g
6             0
7     h       1       h
8     i       1       i
9             0
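But when I call summary(df), I get mysterious extra columns for ids.vec. More importantly, summary does not compute a summary of that column but lists every row instead, which is a problem when I apply it to my real data set.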
> summary(df)
     ids               num.ids   ids.vec.Length ids.vec.Class ids.vec.Mode
 Length:9           Min.   :0   3              -none-        character
 Class :character   1st Qu.:0   1              -none-        character
 Mode  :character   Median :1   1              -none-        character
                    Mean   :1   0              -none-        character
                    3rd Qu.:1   2              -none-        character
                    Max.   :3   0              -none-        character
                                1              -none-        character
                                1              -none-        character
                                0              -none-        character
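For anyone reproducing this, here is a minimal sketch for narrowing the symptom down. It assumes the extra Length/Class/Mode columns appear because ids.vec is stored as a list column rather than an atomic vector. It builds the same columns with strsplit() and lengths() directly instead of the sapply() calls above (lengths() needs R 3.2.0 or later), and the names col.classes, is.list.col and atomic.summary are only illustrative:

```r
# Rebuild the example data frame from the question.
df <- data.frame(ids = c("a,b,c", "d", "e", "", "f,g", "", "h", "i", ""),
                 stringsAsFactors = FALSE)

# strsplit() already returns a list with one character vector per row.
ids <- strsplit(df$ids, ",")
df$num.ids <- lengths(ids)   # number of ids in each row
df$ids.vec <- ids            # stored as a list column

# Show how each column is stored.
col.classes <- sapply(df, class)
print(col.classes)

# Summarise only the columns that are not lists.
is.list.col <- sapply(df, is.list)
atomic.summary <- summary(df[, !is.list.col, drop = FALSE])
print(atomic.summary)
```

If col.classes reports "list" for ids.vec, that would be consistent with summary printing one Length/Class/Mode row per element of that column. Dropping the list column before calling summary is only a workaround for inspection, not necessarily the right fix for the real data set.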
Any idea what I am doing wrong?
Thanks! Kevin