我将三个文本文档存储为名为“dlist”的列表列表:
dlist <- structure(list(name = c("a", "b", "c"), text = list(c("the", "quick", "brown"), c("fox", "jumps", "over", "the"), c("lazy", "dog"))), .Names = c("name", "text"))
在我的脑海中,我发现像这样描绘 dlist 很有帮助:
name text
1 a c("the", "quick", "brown")
2 b c("fox", "jumps", "over", "the")
3 c c("lazy", "dog")
这怎么能像下面这样被操纵?这个想法是绘制它,所以可以为 ggplot2 融化的东西会很好。
name text
1 a the
2 a quick
3 a brown
4 b fox
5 b jumps
6 b over
7 b the
8 c lazy
9 c dog
这是每个单词一行,给出单词及其父文档。
我努力了:
> expand.grid(dlist)
name text
1 a the, quick, brown
2 b the, quick, brown
3 c the, quick, brown
4 a fox, jumps, over, the
5 b fox, jumps, over, the
6 c fox, jumps, over, the
7 a lazy, dog
8 b lazy, dog
9 c lazy, dog
> sapply(seq(1,3), function(x) (expand.grid(dlist$name[[x]], dlist$text[[x]])))
[,1] [,2] [,3]
Var1 factor,3 factor,4 factor,2
Var2 factor,3 factor,4 factor,2
unlist(dlist)
name1 name2 name3 text1 text2 text3 text4
"a" "b" "c" "the" "quick" "brown" "fox"
text5 text6 text7 text8 text9
"jumps" "over" "the" "lazy" "dog"
> sapply(seq(1,3), function(x) (cbind(dlist$name[[x]], dlist$text[[x]])))
[[1]]
[,1] [,2]
[1,] "a" "the"
[2,] "a" "quick"
[3,] "a" "brown"
[[2]]
[,1] [,2]
[1,] "b" "fox"
[2,] "b" "jumps"
[3,] "b" "over"
[4,] "b" "the"
[[3]]
[,1] [,2]
[1,] "c" "lazy"
[2,] "c" "dog"
公平地说,我被各种 apply 和 plyr 函数弄糊涂了,不知道从哪里开始。我从未见过像上面的“sapply”尝试那样的结果,也不明白。