
Let the table be as follows:

v1 v2 v3
A  B  A
B  B  A
   A  C
D  C  D

I would like R to create a table of the number of occurrences of each unique value in each column:

   v1 v2 v3
A  1  1  2
B  1  2  0
C  0  1  1
D  1  0  1

5 Answers


Try table() like this:

> table(unlist(df),names(df)[col(df)])

    V1 v2 v3
  A  1  1  2
  B  1  2  0
  C  0  1  1
  D  1  0  1
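
The one-liner above can be read as three steps. The following is only a sketch of that reading, not part of the original answer: the intermediate names values, columns and counts are made up for illustration, and the data frame is rebuilt from the dput() output in this answer so that the block runs on its own.

```r
# Same data as in this answer (note the NA in V1).
df <- data.frame(
  V1 = c("A", "B", NA, "D"),
  v2 = c("B", "B", "A", "C"),
  v3 = c("A", "A", "C", "D"),
  stringsAsFactors = FALSE
)

# Step 1: flatten all cells into one vector, column by column.
values <- unlist(df, use.names = FALSE)

# Step 2: col(df) holds each cell's column index; map it to the column name.
# Indexing with the matrix walks it column by column, matching unlist().
columns <- names(df)[col(df)]

# Step 3: cross-tabulate value against column name.
# table() drops NA values by default, so the empty cell is not counted.
counts <- table(values, columns)
counts
```

Because both vectors are built in column-major order, the i-th value always lines up with the name of the column it came from.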

Data

> dput(df)
structure(list(V1 = c("A", "B", NA, "D"), v2 = c("B", "B", "A", 
"C"), v3 = c("A", "A", "C", "D")), class = "data.frame", row.names = c(NA,
-4L))
answered 2021-08-04T14:01:34.790

One option could be:

sapply(df, function(x) table(factor(x, levels = unique(unlist(df)))))

  V1 v2 v3
A  1  1  2
B  1  2  0
D  1  0  1
C  0  1  1
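
What makes this work is that every column is tabulated against the same set of levels, so a value that is missing from a column still gets a row with a count of 0. The variant below is a sketch with one change that is not in the original answer: the levels are passed through sort() so the rows come out in alphabetical order (sort() also drops the NA). The names lv and res are made up for illustration.

```r
# Same data as used elsewhere in this thread.
df <- data.frame(
  V1 = c("A", "B", NA, "D"),
  v2 = c("B", "B", "A", "C"),
  v3 = c("A", "A", "C", "D"),
  stringsAsFactors = FALSE
)

# Shared levels across all columns; sort() removes NA and orders the values.
lv <- sort(unique(unlist(df)))

# Tabulate each column against the shared levels; sapply() binds the
# per-column counts into a matrix with one row per level.
res <- sapply(df, function(x) table(factor(x, levels = lv)))
res
```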
answered 2021-08-04T13:57:52.087

We can use mtabulate():

library(qdapTools)
 t(mtabulate(df))
  V1 v2 v3
A  1  1  2
B  1  2  0
C  0  1  1
D  1  0  1

Data

df <- structure(list(V1 = c("A", "B", NA, "D"), v2 = c("B", "B", "A", 
"C"), v3 = c("A", "A", "C", "D")), class = "data.frame", row.names = c(NA,
-4L))
answered 2021-08-04T15:36:19.590

For the sake of completeness, here is an approach that uses a combination of melt() and dcast():

library(data.table)
dcast(melt(setDT(df1), measure.vars = patterns("^v"))[value != ""], value ~ variable)
   value v1 v2 v3
1:     A  1  1  2
2:     B  1  2  0
3:     C  0  1  1
4:     D  1  0  1

The approach is similar to Limey's answer in that it reshapes the data from wide to long and back to wide, but it is less verbose.

Edit

Instead of dcast(), table() can be called after reshaping from wide to long:

melt(setDT(df1), measure.vars = patterns("^v"))[value != ""][
  , table(value, variable)]
     variable
value v1 v2 v3
    A  1  1  2
    B  1  2  0
    C  0  1  1
    D  1  0  1

Note that data.table chaining is used here.

And, to save a few keystrokes:

melt(setDT(df1), measure.vars = names(df1))[value != ""][, table(rev(.SD))]

Data

df1 <- fread("
|v1|v2|v3|
|A |B | A|
|B |B | A|
|  |A | C|
|D |C | D|", 
drop = c(1,5), header = TRUE)
answered 2021-08-04T14:52:38.157

To add to the collection, here is a tidyverse version.

library(tidyverse)

df %>% 
  pivot_longer(
    everything(), 
    values_to="Value", 
    names_to="Variable"
  ) %>% 
  group_by(Variable, Value) %>% 
  summarise(N=n(), .groups="drop") %>% 
  filter(!is.na(Value)) %>% 
  pivot_wider(values_from=N, names_from=Variable, values_fill=0) %>% 
  arrange(Value)
# A tibble: 4 x 4
  Value    v1    v2    v3
  <chr> <int> <int> <int>
1 A         1     1     2
2 B         1     2     0
3 C         0     1     1
4 D         1     0     1
answered 2021-08-04T14:05:44.763