This may seem like a very basic R question, but I'd appreciate an answer. I have a data frame in the form of:

col1    col2
a   g
a   h
a   g
b   i
b   g
b   h
c   i

I want to transform it into counts, so the outcome would look like the table below. I've tried using the table() function, but I only seem to be able to get the counts for one column.

    a   b   c
g   2   1   0
h   1   1   0
i   0   1   1

How do I do it in R?

2 Answers

I'm not sure what you tried, but table works fine for me!

Here's a minimal reproducible example:

df <- structure(list(V1 = c("a", "a", "a", "b", "b", "b", "c"), 
                     V2 = c("g", "h", "g", "i", "g", "h", "i")), 
                .Names = c("V1", "V2"), class = "data.frame", 
                row.names = c(NA, -7L))
table(df)
#    V2
# V1  g h i
#   a 2 1 0
#   b 1 1 1
#   c 0 0 1

Notes:

  • Try table(df[c(2, 1)]) (or table(df$V2, df$V1)) to swap the rows and columns.
  • Use as.data.frame.matrix(table(df)) to get a data.frame as your output. (as.data.frame would create a long data.frame rather than the wide output format you want.)
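
The two notes above can be combined to reproduce the layout asked for in the question. This is a minimal sketch, assuming the same data as df above (rebuilt here with data.frame() so the block runs on its own); the names tab and wide are only illustrative.

```r
# Rebuild the example data (same values as the dput output above).
df <- data.frame(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                 V2 = c("g", "h", "g", "i", "g", "h", "i"),
                 stringsAsFactors = FALSE)

# Put V2 first so its values become the rows and V1 the columns.
tab <- table(df$V2, df$V1)

# Convert the contingency table to a wide data.frame, keeping the layout.
wide <- as.data.frame.matrix(tab)
print(wide)
#   a b c
# g 2 1 0
# h 1 1 0
# i 0 1 1
```

The result is an ordinary data.frame whose row names are the V2 values, so individual counts can be looked up with, for example, wide["g", "a"].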
answered 2013-09-19T12:52:01.303

Using the data frame from @Ananda's answer (named df there, called f here), you can use dcast:

library(reshape2)

> dcast(f, V1~V2)
Using V2 as value column: use value.var to override.
Aggregation function missing: defaulting to length
  V1  g  h  i
1 a   2  1  0
2 b   1  1  1
3 c   0  0  1

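However, I'm only writing this in case you need something more than table in the future (for this case, table is the simplest correct answer), for example: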

set.seed(1)
f$var <- rnorm(7)

> f
  V1 V2        var
1 a   g -0.6264538
2 a   h  0.1836433
3 a   g -0.8356286
4 b   i  1.5952808
5 b   g  0.3295078
6 b   h -0.8204684
7 c   i  0.4874291

> dcast(f, V1~V2, value.var="var", fun.aggregate=sum)
  V1          g          h         i
1 a  -1.4620824  0.1836433 0.0000000
2 b   0.3295078 -0.8204684 1.5952808
3 c   0.0000000  0.0000000 0.4874291
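
For comparison, the same per-cell sums can probably also be obtained without reshape2. The sketch below uses base R's xtabs(), which is not part of the original answer and is offered only as one possible alternative; it rebuilds f so that it runs on its own, and the name sums is illustrative.

```r
# Rebuild the example data, including the random column used above.
f <- data.frame(V1 = c("a", "a", "a", "b", "b", "b", "c"),
                V2 = c("g", "h", "g", "i", "g", "h", "i"))
set.seed(1)
f$var <- rnorm(7)

# Sum var within each V1/V2 combination; combinations that never
# occur in the data come out as 0.
sums <- xtabs(var ~ V1 + V2, data = f)
print(sums)
```

Unlike dcast, this returns a table object rather than a data.frame; as.data.frame.matrix(sums) would convert it if a data.frame is needed.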
answered 2013-09-19T13:00:02.547