r - Counting the occurrences of each factor level per column in R

Question

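I'm trying to write code that finds the count of each factor level in every column in R, with the same factor levels in every column. I thought this would be trivial, but I've hit two places where R doesn't return what I expect: using apply with factor, and using apply with table.

Consider this sample data: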

mat <- matrix(sample(1:10,90,replace=TRUE),ncol=10,nrow=9)
mat.levels <- as.character(unique(as.vector(mat)))
mat.factor <- as.data.frame(apply(mat,2,as.character))

My first step was to re-level every column so that the factor levels are the same. At first I tried:

apply(mat.factor,2,factor,levels=mat.levels)
#But the data structure is all wrong, I don't appear to have a factor anymore!
str(apply(mat.factor,2,factor,levels=mat.levels))

So I brute-forced it with a loop...

for (i in 1:ncol(mat.factor)) {
  levels(mat.factor[,i]) <- mat.levels
}

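Then I ran into another problem with apply. I figured that now that the factor levels are set, if a column is missing a given level, the table function should return a count of 0 for that level. However, when I use apply, the levels with zero counts seem to be dropped!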

apply(mat.factor,2,table)$V10
str(apply(mat.factor,2,table)$V10)
#But running table just on that one column yields the expected result!
table(mat.factor[,10])
str(table(mat.factor[,10]))

Can someone explain what is happening in these two cases? What am I misunderstanding?

score 3 · Accepted Answer

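Read the first sentence of the Details section of ?apply, then run as.matrix(mat.factor) to see the problem: apply coerces a data frame to a matrix, and a matrix cannot hold factors, so the levels are lost before your function is called. Use lapply on data frames, not apply.

Here is an example: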
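
A minimal sketch of what that check shows, using a small hypothetical data frame `df` rather than the asker's data (the names `df`, `m`, `via_apply` and `via_lapply` are illustrative only):

```r
# Hypothetical toy data: two factor columns sharing the levels "a", "b", "c".
df <- data.frame(
  V1 = factor(c("a", "a", "b"), levels = c("a", "b", "c")),
  V2 = factor(c("c", "c", "c"), levels = c("a", "b", "c"))
)

# apply() works on as.matrix(df), which is a character matrix with no levels.
m <- as.matrix(df)
is.character(m)
levels(m)

# So table() inside apply() only sees the values that actually occur ...
via_apply <- apply(df, 2, table)

# ... while lapply() passes each factor column intact, keeping zero counts.
via_lapply <- lapply(df, table)

via_apply$V2
via_lapply$V2
```

Comparing the last two results should make the difference visible: the apply version has an entry only for the level that occurs in V2, while the lapply version has one entry per level.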

mat.factor <- as.data.frame(lapply(mat.factor,factor,levels = mat.levels))
lapply(mat.factor,table)
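
Once every column shares the same levels, each table has the same length, so the counts can also be collected into a single matrix. The following is a sketch under that assumption, rebuilding the question's sample data from scratch; `set.seed(1)` and the name `counts` are illustrative additions, not part of the original answer.

```r
set.seed(1)  # assumption: fixed seed only so the example is reproducible
mat <- matrix(sample(1:10, 90, replace = TRUE), ncol = 10, nrow = 9)
mat.levels <- as.character(unique(as.vector(mat)))

# Build the data frame column by column so every column is a factor
# with the full, shared set of levels.
mat.factor <- as.data.frame(lapply(as.data.frame(mat), function(col) {
  factor(as.character(col), levels = mat.levels)
}))

# One table per column; zero counts are kept because the levels are shared.
counts <- sapply(mat.factor, table)

dim(counts)      # number of levels x number of columns
colSums(counts)  # each column sums to nrow(mat)
```

Here sapply is used instead of lapply only to simplify the equal-length tables into a matrix; lapply(mat.factor, table) returns the same counts as a list.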
