r - How to apply Fisher's test to this data set (nominal variables)

Question

I'm very new to statistics:

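# Tests each column in idxToTest against column idxATI of the global `data`.
fisher = function(idxToTest, idxATI){

idxDependent = c()   # indices of the columns with p < 0.1
dependent = c()      # TRUE/FALSE flag for each tested column
p = c()              # p-value for each tested column

for(i in seq_along(idxToTest))
{
    tbl = table(data[[idxToTest[i]]], data[[idxATI]])
    rez = fisher.test(tbl, workspace = 20000000000)
    if(rez$p.value<0.1){
        dependent = c(dependent, TRUE)
        idxDependent = c(idxDependent, idxToTest[i])
    }
    else{
        dependent = c(dependent, FALSE)
    }
    p = c(p, rez$p.value)
}

# Return the collected results.
list(idxDependent = idxDependent, dependent = dependent, p = p)
}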

This is the function I'm using. It seems to work.

What I've understood so far is that the first argument has to be data in a form like this:

                Men    Women 
Dieting         10      30 
Non-dieting     5       60

My data comes from a CSV:

data = read.csv('***.csv', header = TRUE, sep=',');

My first problem is that I don't know how to convert from this:

Loan.Purpose   Home.Ownership
lp_value_1     ho_value_2
lp_value_1     ho_value_2
lp_value_2     ho_value_1
lp_value_3     ho_value_2
lp_value_2     ho_value_3
lp_value_4     ho_value_2
lp_value_3     ho_value_3

to this:

              ho_value_1    ho_value_2    ho_value_3
lp_value_1    0             2             0
lp_value_2    1             0             1
lp_value_3    0             1             1
lp_value_4    0             1             0

My second problem is that I don't know what the second argument should be.

Update: this is what I get when I use fisher.test(myTable):

Error in fisher.test(test) : FEXACT error 501.
The hash table key cannot be computed because the largest key
is larger than the largest representable int.
The algorithm cannot proceed.
Reduce the workspace size or use another algorithm.

where myTable is:

                     MORTGAGE NONE OTHER OWN RENT
  car                      18    0     0   5   27
  credit_card             190    0     2  38  214
  debt_consolidation      620    0     2  87  598
  educational               5    0     0   3    7
  ...

score 1 · Accepted Answer

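Basically, Fisher's exact test is only practical for smaller data sets, because it needs a lot of memory. That's fine here, because the chi-squared test makes only minimal additional assumptions and is much cheaper to compute. Just do: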

chisq.test(data$Loan.Purpose, data$Home.Ownership)

to get your p-value.

Make sure you read through and understand the help page for chisq.test, especially the examples at the bottom.

http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
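
As a rough sketch of the whole workflow, the snippet below uses table() to cross-tabulate the seven sample rows from the question and then runs both tests on the result. The data frame name `loans` is made up for illustration. With so few rows the asymptotic chi-squared p-value is unreliable, so the sketch asks for a Monte Carlo p-value via `simulate.p.value = TRUE` (with `B` replicates). fisher.test accepts the same argument for tables larger than 2 by 2, which may be an option when the exact computation fails with a workspace error.

```r
# Toy data: the seven sample rows from the question.
# The data frame name `loans` is hypothetical.
loans <- data.frame(
  Loan.Purpose   = c("lp_value_1", "lp_value_1", "lp_value_2", "lp_value_3",
                     "lp_value_2", "lp_value_4", "lp_value_3"),
  Home.Ownership = c("ho_value_2", "ho_value_2", "ho_value_1", "ho_value_2",
                     "ho_value_3", "ho_value_2", "ho_value_3")
)

# table() cross-tabulates two columns into a contingency table.
tbl <- table(loans$Loan.Purpose, loans$Home.Ownership)
print(tbl)

set.seed(1)  # make the simulated p-values reproducible

# Chi-squared test with a Monte Carlo p-value instead of the asymptotic one.
chi <- chisq.test(tbl, simulate.p.value = TRUE, B = 10000)

# Fisher's test with a simulated p-value instead of the exact computation.
fis <- fisher.test(tbl, simulate.p.value = TRUE, B = 10000)

cat("chi-squared p:", chi$p.value, "\n")
cat("Fisher p:", fis$p.value, "\n")
```

The same calls should work on the real data by replacing `loans` with the data frame read from the CSV.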

Then look at a mosaic plot to see the relative sizes of the counts:

mosaicplot(table(data$Loan.Purpose, data$Home.Ownership))

This reference explains how mosaic plots work.

http://alumni.media.mit.edu/~tpminka/courses/36-350.2001/lectures/day12/
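
For a concrete picture, here is a minimal sketch that draws a mosaic plot from the four rows of myTable that are visible in the question (the rest of the table is elided there, so this is not the full data). The object name `counts` and the plot title are assumptions for illustration. The all-zero NONE column is dropped first so that the plot has no empty category.

```r
# Visible rows of myTable from the question; the full table is elided there.
# The object name `counts` is hypothetical.
counts <- matrix(
  c( 18, 0, 0,  5,  27,
    190, 0, 2, 38, 214,
    620, 0, 2, 87, 598,
      5, 0, 0,  3,   7),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    Loan.Purpose   = c("car", "credit_card", "debt_consolidation", "educational"),
    Home.Ownership = c("MORTGAGE", "NONE", "OTHER", "OWN", "RENT")
  )
)

# Drop columns whose counts are all zero (here NONE).
counts <- counts[, colSums(counts) > 0]

# mosaicplot() takes a contingency table or a matrix of counts;
# the area of each tile is proportional to the cell count.
mosaicplot(counts, main = "Loan purpose by home ownership", las = 2)
```

Tiles that are noticeably wider or taller than their neighbours point to the combinations that carry most of the observations.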
