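I am new to R, coming from Stata. Below is an R Markdown chunk with a reproducible data example. The data resembles what I am working with, except that my real data has many more binary (logical) and factor variables.

Libraries and data: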

# Setup and load package:
library(dplyr)
library(expss)
library(hablar)
library(kableExtra)
library(summarytools)

# Load data:
data("mtcars")
raw_df <- select(mtcars,c(wt,cyl,gear,vs,am))

# Data prep and labelling:
df <- raw_df %>%
  apply_labels(wt = "Facility ID",
               cyl = "Geographical Area",
               cyl = c("Area A" = 4,"Area B" = 6, "Area C" = 8),
               gear = "Tier",
               gear = c("Tier 1" = 3, "Tier 2" = 4, "Tier 3" = 5),
               vs = "E.coli",
               am = "V.choleri") %>%
  convert(chr(wt),
          fct(cyl,gear),
          lgl(vs,am))

Note that my actual data has many more categorical and logical variables. I managed to produce the following table in R Markdown (HTML output):


df %>%
  tab_cells(cyl, gear) %>%
  tab_total_row_position("below") %>%
  tab_total_statistic("u_rpct")%>%
  tab_total_label("Total hosts (Row proportions)") %>% 
  tab_cols(vs, am) %>% 
  tab_stat_rpct() %>% 
  tab_cols(total(label = "Number of hosts")) %>%  
  tab_stat_cases() %>%
  tab_pivot(stat_position = "outside_columns") %>%
  recode(as.criterion(is.numeric) & is.na ~ 0, TRUE ~ copy) %>% 
  split_table_to_df() %>% 
  kable(align = "c", digits = 1) %>% 
  kable_styling(bootstrap_options = c("striped", "condensed", "responsive"),
                full_width = F, position = "center") %>% 
  row_spec(1:2, bold = TRUE)

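Problems:

1. I would like to include only the "TRUE" columns and drop the "FALSE" columns from the table, while keeping the first row of labels intact ("E.coli", "V.choleri"). In fact, I do not need the second header row ("TRUE", "FALSE") at all.
2. I have labelled the total row ("#Total hosts (Row proportions)"), but I cannot remove the leading "#" sign. Also, the right-most cell of that total row shows "100". I tried to make it the sum of the cells in that column, but failed. The "100" is completely misleading.
3. I also tried to get the table I want with the "ctable" function of the "summarytools" package, because it has an excellent structure in which the number of observations can be shown inside the proportion cells: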

print(ctable(df$cyl,df$am), method = 'render')

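The problem is that it seems to allow only one pair of categorical variables, and "FALSE" cannot be omitted. The last column, with the row totals (observations), is perfect though.

Details: R: 4.0.0, RStudio: 1.2.5042. All packages are up to date.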

1 Answer

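Tables from expss are ordinary data.frames. Column labels are just column names, with the header levels inside each name separated by the "|" symbol, so you can manipulate them like any other column names. Row labels are stored in the row_labels column, and the "#" sign can be removed from them with a search-and-replace operation. The total row shows "100" in the last column because you set the total statistic to row percent at the start, and the row percent for a single column is always 100. Taking all of the above into account: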

library(dplyr)
library(expss)
library(hablar)
library(kableExtra)
library(summarytools)

# Load data:
data("mtcars")
raw_df <- select(mtcars,c(wt,cyl,gear,vs,am))

# Data prep and labelling:
df <- raw_df %>%
    apply_labels(wt = "Facility ID",
                 cyl = "Geographical Area",
                 cyl = c("Area A" = 4,"Area B" = 6, "Area C" = 8),
                 gear = "Tier",
                 gear = c("Tier 1" = 3, "Tier 2" = 4, "Tier 3" = 5),
                 vs = "E.coli",
                 am = "V.choleri") %>%
    convert(chr(wt),
            fct(cyl,gear),
            lgl(vs,am))


tbl = df %>%
    tab_cells(cyl, gear) %>%
    tab_total_row_position("below") %>%
    tab_total_statistic("u_rpct")%>%
    tab_total_label("Total hosts (Row proportions)") %>% 
    tab_cols(vs, am) %>% 
    tab_stat_rpct() %>% 
    tab_cols(total(label = "Number of hosts")) %>%  
    # specify total statistic for last column
    tab_stat_cases(total_statistic = "u_cases") %>%
    tab_pivot(stat_position = "outside_columns") %>%
    recode(as.criterion(is.numeric) & is.na ~ 0, TRUE ~ copy) %>% 
    # remove columns with FALSE
    except(contains("FALSE")) %>% 
    compute(
        # remove '#' sign from row labels
        row_labels = gsub("#", "", row_labels)
    )

# remove the '|TRUE' level from column labels
colnames(tbl) = gsub("\\|TRUE", "", colnames(tbl))

tbl %>% 
    split_table_to_df() %>% 
    kable(align = "c", digits = 1) %>% 
    kable_styling(bootstrap_options = c("striped", "condensed", "responsive"),
                  full_width = F, position = "center") %>% 
    row_spec(1:2, bold = TRUE)
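
The label handling used above can also be tried in isolation. The following is a minimal base R sketch, assuming a plain data.frame that only imitates the shape of an expss result: the column names, row labels and numbers are all made up for illustration and do not come from mtcars.

```r
# Hypothetical stand-in for an expss table: an ordinary data.frame whose
# column names hold header levels separated by "|". All values are made up.
tbl <- data.frame(
  c("Tier|Tier 1", "Tier|#Total hosts"),
  c(20, 50),
  c(80, 50),
  c(15, 32),
  stringsAsFactors = FALSE
)
colnames(tbl) <- c("row_labels", "E.coli|TRUE", "E.coli|FALSE", "Number of hosts")

# Drop every column whose name contains the "FALSE" level.
tbl <- tbl[, !grepl("FALSE", colnames(tbl), fixed = TRUE), drop = FALSE]

# Strip the "|TRUE" level from the remaining column names.
colnames(tbl) <- gsub("|TRUE", "", colnames(tbl), fixed = TRUE)

# Remove the "#" marker from the row labels.
tbl$row_labels <- gsub("#", "", tbl$row_labels, fixed = TRUE)

print(tbl)
```

Using fixed = TRUE treats "|" as a literal character, so no regular-expression escaping is needed.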
Answered 2020-05-03T20:39:23.317