I want to generate both means and frequencies for a subset of my categorical variables.

mtcars2 <- mtcars %>% mutate(across(matches('cyl|gear|carb'), as.factor))

I know I can use this to get separate output for the categorical and the continuous variables.

mtcars_out <- tbl_summary(mtcars2, 
                          statistic = list(all_numeric() ~ "{mean} ({sd})",
                                           all_categorical() ~ "{n} / {N} ({p}%)")) %>% as_tibble()

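Since mtcars$cyl already has "levels" associated with it, I'd like to use mtcars2 as is and produce a mean for that variable. Something like this... but tbl_summary doesn't like it because the variable is categorical.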

mtcars_out <- tbl_summary(mtcars2, 
                          statistic = list(all_numeric() ~ "{mean} ({sd})",
                                           "cyl"~"{mean} ({sd})")) %>% as_tibble()

Error: Problem with `mutate()` input `tbl_stats`.
x There was an error assembling the summary statistics for 'cyl'
  with summary type 'categorical'.

There are 2 common sources for this error.
1. You have requested summary statistics meant for continuous
   variables for a variable being as summarized as categorical.
   To change the summary type to continuous, add the argument
  `type = list(cyl ~ 'continuous')`
2. One of the functions or statistics from the `statistic=` argument is not valid.
i Input `tbl_stats` is `pmap(...)`.

I tried specifying the type in the call, but that didn't work either.

mtcars_out <- tbl_summary(mtcars2, 
                          type = list("cyl"~"continuous"),
                          statistic = list(all_numeric() ~ "{mean} ({sd})",
                                           all_categorical() ~ "{n} / {N} ({p}%)")) %>% as_tibble()



 Error: Problem with `mutate()` input `summary_type`.
x Column 'cyl' is class "factor" and cannot be summarized as a continuous variable.
i Input `summary_type` is `assign_summary_type(...)`.

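My actual dataset has 500 variables, each already assigned a class, so I don't want to change the class types in the original dataset. I'd like to specify this within the tbl_summary call.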

Any help is greatly appreciated!!

1 Answer

You've made cyl a factor, and R won't let you take the mean of a factor variable (mean() returns NA with a warning).
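As a quick illustration of why this fails, here is a minimal base R sketch (cyl_fct is just a name made up for this example). It also shows a common pitfall: as.numeric() on a factor returns the internal level codes rather than the original values, so the conversion has to go through as.character().

```r
# cyl_fct is a hypothetical name used only for this illustration
cyl_fct <- factor(mtcars$cyl)

# mean() on a factor returns NA (with a warning, suppressed here)
fct_mean <- suppressWarnings(mean(cyl_fct))

# as.numeric() alone gives the level codes 1, 2, 3, not the values 4, 6, 8
code_mean <- mean(as.numeric(cyl_fct))

# going through as.character() recovers the original values
value_mean <- mean(as.numeric(as.character(cyl_fct)))

print(c(fct_mean = fct_mean, code_mean = code_mean, value_mean = value_mean))
```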

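I think the simplest approach for you is to keep both a numeric version and a factor version of the variable. You can then summarize both, and afterwards remove the extra header row (the one for the factor version of the variable).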

library(gtsummary)
library(tidyverse)

tbl <- 
  mtcars %>%
  select(cyl) %>%
  mutate(fct_cyl = factor(cyl)) %>%
  tbl_summary(
    type = where(is.numeric) ~ "continuous",
    statistic = where(is.numeric) ~ "{mean} ({sd})",
    label = cyl ~ "No. Cylinders"
  ) 

# remove extra header row for factor variables
tbl$table_body <-
  tbl$table_body %>%
  filter(!(startsWith(variable, "fct_") & row_type == "label"))

# print table
tbl
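If the data already store these columns as factors, as with mtcars2 in the question, one possible way to add the numeric copies without touching the originals is sketched below. It assumes the factor levels are numeric strings, and the num_ prefix is an invented naming convention, not something gtsummary requires.

```r
library(dplyr)

# recreate the factor columns from the question
mtcars2 <- mtcars %>%
  mutate(across(matches("cyl|gear|carb"), as.factor))

# add numeric copies of chosen factor columns; the originals stay untouched
# (the "num_" prefix is a made-up convention for this sketch)
mtcars_num <- mtcars2 %>%
  mutate(across(c(cyl, gear),
                ~ as.numeric(as.character(.x)),
                .names = "num_{.col}"))

# the factor columns keep their class; the copies are numeric
sapply(mtcars_num[c("cyl", "num_cyl", "gear", "num_gear")], class)
```

The numeric copies can then be passed to tbl_summary() alongside the factors, in the same way as in the code above.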


Answered 2020-10-09T22:22:04.767