
I have a data frame like this:

df = data.frame (Gender = c ("F", "M", "M", "F"),
  cat_age = c ("] 10-15]", "] 10, 15]", "] 20 -25] ","] 55-60] "), 
  frequency = c (2, 6, 8, 7))

I would like to reshape it into this:

F; M; cat_age
2; 6; ] 10, 15]
0; 8; ] 20, 25]
7; 0; ] 55, 60]

2 Answers


Your data.frame has some inconsistencies. If "] 10-15]" and "] 10, 15]" are meant to be the same category, the labels need to match exactly in the data.frame. For example:

df = data.frame (Gender = c ("F", "M", "M", "F"), 
cat_age = c ("] 10-15]", "] 10-15]", "] 20 -25] ","] 55-60] "), frequency = c (2, 6, 8, 7))

Then you can use pivot_wider() from tidyr:

library(tidyr)

pivot_wider(df,values_from=frequency,names_from=Gender,values_fill=0)
# A tibble: 3 x 3
  cat_age          F     M
  <fct>        <dbl> <dbl>
1 "] 10-15]"       2     6
2 "] 20 -25] "     0     8
3 "] 55-60] "      7     0
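
If you would rather not retype the labels by hand, they can also be normalized with a regular expression before pivoting. This is only a sketch: it assumes every label contains exactly two numbers, and the target format "] lo, hi]" is taken from the desired output in the question. The name `wide` is just an illustrative choice.

```r
library(tidyr)

# Data from the question, with the inconsistent cat_age labels left as-is
df <- data.frame(
  Gender = c("F", "M", "M", "F"),
  cat_age = c("] 10-15]", "] 10, 15]", "] 20 -25] ", "] 55-60] "),
  frequency = c(2, 6, 8, 7),
  stringsAsFactors = FALSE
)

# Capture the two numbers in each label and rebuild it as "] lo, hi]"
# (assumes exactly two runs of digits per label)
df$cat_age <- sub("^\\D*(\\d+)\\D+(\\d+)\\D*$", "] \\1, \\2]",
                  df$cat_age, perl = TRUE)

# One row per cat_age, one column per Gender, missing combinations filled with 0
wide <- pivot_wider(df, names_from = Gender, values_from = frequency,
                    values_fill = 0)
```

With the labels unified first, "] 10-15]" and "] 10, 15]" collapse into a single row, so the F and M counts for that age class end up side by side.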
Answered 2020-09-05T19:34:10.130

Here is a base R option using reshape():

dfout <- reshape(
  transform(df,
    cat_age = sapply(
      regmatches(cat_age, gregexpr("\\d+", cat_age)),
      function(x) paste0("]", paste0(x, collapse = ","), "]")
    )
  ),
  direction = "wide",
  idvar = "cat_age",
  timevar = "Gender"
)

which gives

> dfout
  cat_age frequency.F frequency.M
1 ]10,15]           2           6
3 ]20,25]          NA           8
4 ]55,60]           7          NA

If you want to replace the NAs with 0, you can add one more line:

replace(dfout, is.na(dfout), 0)

so that

> replace(dfout,is.na(dfout),0)
  cat_age frequency.F frequency.M
1 ]10,15]           2           6
3 ]20,25]           0           8
4 ]55,60]           7           0
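
As a possible alternative in base R, xtabs() sums the frequencies per cell and reports 0 for empty combinations, so the NA replacement step is not needed. This is a sketch rather than part of the answer above: it reuses the same label clean-up, and `tab` and `wide_tab` are names chosen here for illustration.

```r
# Data from the question
df <- data.frame(
  Gender = c("F", "M", "M", "F"),
  cat_age = c("] 10-15]", "] 10, 15]", "] 20 -25] ", "] 55-60] "),
  frequency = c(2, 6, 8, 7),
  stringsAsFactors = FALSE
)

# Same label clean-up as above: keep only the numbers, rebuild as "]lo,hi]"
df$cat_age <- sapply(
  regmatches(df$cat_age, gregexpr("\\d+", df$cat_age)),
  function(x) paste0("]", paste0(x, collapse = ","), "]")
)

# Cross-tabulate: rows are age classes, columns are genders, cells are summed frequencies
tab <- xtabs(frequency ~ cat_age + Gender, data = df)

# Turn the contingency table into a plain data frame and move the row names
# into a cat_age column, giving the F; M; cat_age layout asked for
wide_tab <- as.data.frame.matrix(tab)
wide_tab$cat_age <- rownames(wide_tab)
rownames(wide_tab) <- NULL
```

Note that xtabs() adds up duplicate Gender/cat_age pairs instead of keeping only one of them, which may or may not be what you want if the real data has repeated rows.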
Answered 2020-09-05T22:41:53.807