r - Reformatting categorical data in R

Question

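I have a categorical data set that I am trying to summarize, and the questions in it differ inherently in nature. The data below represent a questionnaire that contains standard closed-ended questions, but also questions where several answers can be chosen from a list. "VILLAGE" and "INCOME" are closed-ended questions. "responsible.1" and so on represent a list in which the respondent said yes or no to each item; an item that was not chosen is recorded as NA.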

VILLAGE  INCOME         responsible.1   responsible.2   responsible.3   responsible.4   responsible.5
   j     both           DLNR             NA              DEQ              NA           Public
   k     regular.income DLNR             NA              NA               NA           NA
   k     regular.income DLNR             CRM             DEQ              Mayor        NA
   l     both           DLNR             NA              NA               Mayor        NA
   j     both           DLNR             CRM             NA               Mayor        NA
   m     regular.income DLNR             NA              NA               NA           Public
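For anyone who wants to reproduce the example, the table above can be typed in roughly as follows. This is only a sketch: the name `dat` is taken from the answer below, and storing the columns as plain character vectors (rather than factors) is an assumption, since the original post does not show how the data were read in.

```r
# Hypothetical reconstruction of the questionnaire data shown above.
# NA means the respondent did not select that option.
dat <- data.frame(
  VILLAGE = c("j", "k", "k", "l", "j", "m"),
  INCOME  = c("both", "regular.income", "regular.income",
              "both", "both", "regular.income"),
  responsible.1 = c("DLNR", "DLNR", "DLNR", "DLNR", "DLNR", "DLNR"),
  responsible.2 = c(NA, NA, "CRM", NA, "CRM", NA),
  responsible.3 = c("DEQ", NA, "DEQ", NA, NA, NA),
  responsible.4 = c(NA, NA, "Mayor", "Mayor", "Mayor", NA),
  responsible.5 = c("Public", NA, NA, NA, NA, "Public"),
  stringsAsFactors = FALSE
)

# Quick look at the structure: 6 respondents, 7 variables.
str(dat)
```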

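What I would like is a 3-way table output with "VILLAGE", "INCOME" and the suite of "responsible" variables wrapped into an ftable. That way I can use the table with a wide range of R packages for plotting and analysis.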

        RESPONSIBLE             
VILLAGE INCOME          responsible.1   responsible.2   responsible.3   responsible.4   responsible.5
j       both            2               1               1               1               1
k       regular income  2               1               1               1               0
l       both            1               0               0               1               0
m       regular income  1               0               0               0               1

as.data.frame(table(village, responsible.1)) gets me the first one, but I can't figure out how to wrap the whole thing into a nice ftable.

score 1 · Accepted Answer

> aggregate(dat[-(1:2)], dat[1:2], function(x) sum(!is.na(x)) )
  VILLAGE         INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1       j           both             2             1             1             1             1
2       l           both             1             0             0             1             0
3       k regular.income             2             1             1             1             0
4       m regular.income             1             0             0             0             1
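To see what the anonymous function in the call above does, it may help to apply it by hand to a single column within a single group. The vector below is typed in from the question's table (the responsible.3 answers of the two respondents in village k); the name `col_k` is just an illustrative placeholder.

```r
# responsible.3 for the two respondents in village k:
# the first did not select the option (NA), the second selected "DEQ".
col_k <- c(NA, "DEQ")

# is.na() flags the unselected entries; negating and summing
# counts how many respondents selected the option.
n_selected <- sum(!is.na(col_k))
n_selected
```

aggregate() runs this same count once for every responsible column inside every VILLAGE/INCOME group.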

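I'm guessing you actually have another grouping vector, perhaps the first "responsible" column?

I don't fully understand the sorting rules, but reversing the order of the grouping columns may get closer to what you posted: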

> aggregate(dat[-(1:2)], dat[2:1], function(x) sum(!is.na(x)) )
          INCOME VILLAGE responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1           both       j             2             1             1             1             1
2 regular.income       k             2             1             1             1             0
3           both       l             1             0             0             1             0
4 regular.income       m             1             0             0             0             1
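If an actual ftable is wanted, one possible route is to stack the aggregated counts into long form and rebuild a 3-way table with xtabs(). This is a sketch and not part of the original answer: the names `counts`, `long`, `tab3`, `ft` and the dimension label RESPONSIBLE are made up for illustration, and the data are re-entered here with read.table() so the block runs on its own.

```r
# Re-enter the question's data so this block is self-contained.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
VILLAGE INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
j both           DLNR NA  DEQ NA    Public
k regular.income DLNR NA  NA  NA    NA
k regular.income DLNR CRM DEQ Mayor NA
l both           DLNR NA  NA  Mayor NA
j both           DLNR CRM NA  Mayor NA
m regular.income DLNR NA  NA  NA    Public
")

# Count selected options per VILLAGE/INCOME group, as in the answer.
counts <- aggregate(dat[-(1:2)], dat[2:1], function(x) sum(!is.na(x)))

# Stack the count columns: one row per VILLAGE x INCOME x RESPONSIBLE.
resp_cols <- grep("^responsible", names(counts), value = TRUE)
long <- data.frame(
  VILLAGE     = rep(counts$VILLAGE, times = length(resp_cols)),
  INCOME      = rep(counts$INCOME,  times = length(resp_cols)),
  RESPONSIBLE = rep(resp_cols, each = nrow(counts)),
  n           = unlist(counts[resp_cols], use.names = FALSE)
)

# Build the 3-way contingency table and flatten it for display.
tab3 <- xtabs(n ~ VILLAGE + INCOME + RESPONSIBLE, data = long)
ft   <- ftable(tab3, row.vars = c("VILLAGE", "INCOME"))
ft
```

Note that xtabs() fills in every VILLAGE/INCOME combination, so pairs that never occur in the data (for example j with regular.income) show up as rows of zeros in the printed ftable.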
