
I have the following data and would like to generate a table of the frequencies of each response, using base R, plyr, or another package.

My data:

df = data.frame(id = c(1,2,3,4,5),
                Did_you_use_tv=c("tv","","","tv","tv"),
                Did_you_use_internet=c("","","","int","int"))
df

I can get the frequencies for any column, or a cross tab of two columns, using table():

table(df[,2])
table(df[,2], df[,3])

However, how can I set up the data so that it looks like the following?

df2 = data.frame(Did_you_use_tv=c(3), 
                Did_you_use_internet=c(2))
df2

It's just a summary of frequencies for each column.

I will also be creating cross tabs, but given the structure of the data, I feel this summary may be a little more useful.

4 Answers


This is conceptually similar to @Tyler's answer: just take the sum of all values that are not equal to "".

colSums(!df[-1] == "")
#       Did_you_use_tv Did_you_use_internet 
#                    3                    2 
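
If the one-row data frame shown in the question (df2) is the desired result rather than a named vector, the counts can be reshaped afterwards. The following is a minimal sketch in base R; the intermediate names `answered` and `counts` are only illustrative and do not come from the original post.

```r
# Example data from the question
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 Did_you_use_tv = c("tv", "", "", "tv", "tv"),
                 Did_you_use_internet = c("", "", "", "int", "int"))

# Logical matrix: TRUE wherever a response was given (non-empty string)
answered <- df[-1] != ""

# Column-wise count of TRUE values, returned as a named numeric vector
counts <- colSums(answered)

# Transpose to a 1 x 2 matrix, then convert to a one-row data frame
df2 <- as.data.frame(t(counts))
df2
```

Writing the comparison as `df[-1] != ""` is equivalent to `!df[-1] == ""`, because `!` binds more loosely than `==` in R.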

Update

Stack Overflow user @juba did some work on a function called multi.table, shown below:

multi.table <- function(df, true.codes=NULL, weights=NULL) {
  true.codes <- c(as.list(true.codes), TRUE, 1)
  as.table(sapply(df, function(v) {
    sel <- as.numeric(v %in% true.codes)
    if (!is.null(weights)) sel <- sel * weights
    sum(sel)
  }))
}

This function is part of the questionr package.

Usage with your example would be:

library(questionr)
multi.table(df[-1], true.codes=list("tv", "int"))
#       Did_you_use_tv Did_you_use_internet 
#                    3                    2 
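
The weights argument in the definition above multiplies each matching response by a per-respondent weight before summing. The same idea can be sketched in base R without the package; note that the weight vector `w` and the name `codes` below are made up for illustration and are not part of the original data.

```r
# Example data from the question
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 Did_you_use_tv = c("tv", "", "", "tv", "tv"),
                 Did_you_use_internet = c("", "", "", "int", "int"))

# Hypothetical survey weights, one per respondent (assumption, not from the post)
w <- c(2, 1, 1, 1, 3)

# Values that count as a "yes" response
codes <- c("tv", "int")

# For each column: flag matching responses, weight them, and sum
weighted <- sapply(df[-1], function(v) sum((v %in% codes) * w))
weighted
#       Did_you_use_tv Did_you_use_internet 
#                    6                    4 
```

With all weights equal to 1, this reduces to the unweighted counts of 3 and 2.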
answered 2013-11-06T15:58:10.707

Here is one of the many approaches that come to mind:

FUN <- function(x) sum(x != "")
do.call(cbind, lapply(df[, -1], FUN))

##     Did_you_use_tv Did_you_use_internet
## [1,]              3                    2
answered 2013-11-06T15:53:08.677

Here is another approach:

do.call(cbind, lapply(df[,-1], table))[-1, ]
#       Did_you_use_tv Did_you_use_internet 
#                    3                    2 
answered 2013-11-06T15:58:17.107

With plyr and reshape2:

t(dcast(subset(melt(df,id.var="id"), value!=""), variable ~ .))
answered 2013-11-06T17:06:53.850