r - Function for tidy chisq.test output for visualizing or filtering p-values

Question

For the data...

library(productplots) 
library(ggmosaic)

For the code...

 library(tidyverse)
 library(broom)

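I'm trying to create tidy chisq.test output so that I can easily filter or visualize the p-values.

I'm using the "happy" dataset (included in either of the packages listed above).

For this example, if I wanted to test the "happy" variable against all of the other variables, I would isolate the categorical variables (I'm not going to create factor groupings out of age, year, etc. for this example) and then run a simple function.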

df <- happy %>% select(-year, -age, -wtssall)
lapply(df, function(x) chisq.test(happy$happy, x))

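However, I'd like tidy output from the "broom" package, so that I can build a data frame of p-values to filter or visualize.

I've tried various combinations similar to the code below, hoping to pipe further into broom's "tidy" function, or into "filter" to narrow the results down to significant p-values, or into a ggplot bar chart of the p-values or chi-squared statistics.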

df%>%summarise_if(is.factor,funs(chisq.test(.,df$happy)$p.value))

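...but the output doesn't seem right. If I run chisq.test against each variable individually, the answers are different.

So, is there a way to easily compare categorical variables, in this case "happy" against all of the other columns, and return a tidy data frame for further manipulation and analysis?

A purrr solution using dplyr::mutate, tidyr::nest, and purrr::map would be great, but I have a feeling the nested list-column approach won't work with chisq.test.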

score 3 · Accepted Answer

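You can do all of this within a tidyverse workflow, using map instead of lapply. There is no need to nest unless you are going to subset the data to compare results in some way (e.g. by age group).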

df <- happy%>%
  select(-id, -year,-age,-wtssall) %>% 
  map(~chisq.test(.x, happy$happy)) %>% 
  tibble(names = names(.), data = .) %>% 
  mutate(stats = map(data, tidy))

unnest(df, stats)

# A tibble: 6 × 6
    names        data   statistic       p.value parameter                     method
    <chr>      <list>       <dbl>         <dbl>     <int>                     <fctr>
1   happy <S3: htest> 92606.00000  0.000000e+00         4 Pearson's Chi-squared test
2     sex <S3: htest>    11.46604  3.237288e-03         2 Pearson's Chi-squared test
3 marital <S3: htest>  2695.18474  0.000000e+00         8 Pearson's Chi-squared test
4  degree <S3: htest>   659.33013 4.057952e-137         8 Pearson's Chi-squared test
5 finrela <S3: htest>  2374.24165  0.000000e+00         8 Pearson's Chi-squared test
6  health <S3: htest>  2928.62829  0.000000e+00         6 Pearson's Chi-squared test
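
Once the results are in a tidy data frame, the filtering and plotting asked about in the question are ordinary dplyr and ggplot2 steps. The following is only a sketch under a few assumptions that are not part of the answer above: it drops the trivial happy-vs-happy test, it uses purrr's map_dfr with .id to stack the tidied tests directly rather than building a list column, and the 0.05 cutoff and the names "results" and "significant" are purely illustrative.

```r
library(tidyverse)
library(broom)
library(productplots)  # provides the happy dataset

# One chi-squared test per remaining categorical column against happy$happy.
# Dropping the happy column itself is an assumption made here to avoid the
# trivial happy-vs-happy row.
results <- happy %>%
  select(-id, -year, -age, -wtssall, -happy) %>%
  map(~ chisq.test(.x, happy$happy)) %>%
  map_dfr(tidy, .id = "names")  # one tidy row per test, labelled by column name

# Keep only the tests below an illustrative significance threshold
significant <- results %>%
  filter(p.value < 0.05)

# Bar chart of the chi-squared statistics, ordered by size
p <- ggplot(results, aes(x = reorder(names, statistic), y = statistic)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Chi-squared statistic")

print(p)
```

Plotting the statistic rather than the p-value is a deliberate choice in this sketch: several of the p-values in the output above are reported as 0, so a bar chart of raw p-values would show almost nothing.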
