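I have a data frame for which I want to compute the chi-square statistic and p-value. However, when I print the expected values, they are not what I expected. I assumed the code tests the null hypothesis that Q7 does not depend on "ConcernImprovement", so I thought the expected frequencies for Decrease, Increase and No change would be identical within each Q7 row.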
Here is my observed data frame, called LikelihoodConcern:
ConcernImprovement  Decrease  Increase  No change
Q7
Likely                   2.0      18.0       21.0
Not likely at all        0.0       2.0        1.0
Not very likely          3.0      11.0        5.0
Somewhat likely          4.0      24.0       14.0
Very likely              1.0      16.0        8.0
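In case it helps to reproduce this, here is a minimal sketch that rebuilds the table above. The counts are copied from the printout; the construction itself is an assumption for illustration, not how my real data frame was produced.

```python
import pandas as pd

# Rebuild the observed contingency table by hand (counts copied from the printout).
LikelihoodConcern = pd.DataFrame(
    {
        "Decrease": [2.0, 0.0, 3.0, 4.0, 1.0],
        "Increase": [18.0, 2.0, 11.0, 24.0, 16.0],
        "No change": [21.0, 1.0, 5.0, 14.0, 8.0],
    },
    index=pd.Index(
        [
            "Likely",
            "Not likely at all",
            "Not very likely",
            "Somewhat likely",
            "Very likely",
        ],
        name="Q7",
    ),
)
# Label the column axis the same way the printout does.
LikelihoodConcern.columns.name = "ConcernImprovement"

print(LikelihoodConcern)
```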
I tried this code:
from scipy.stats import chi2_contingency
chi2, p, dof, expected = chi2_contingency(LikelihoodConcern, correction=False)
expected
It returns this for the expected frequencies:
array([[ 3.15384615, 22.39230769, 15.45384615],
[ 0.23076923, 1.63846154, 1.13076923],
[ 1.46153846, 10.37692308, 7.16153846],
[ 3.23076923, 22.93846154, 15.83076923],
[ 1.92307692, 13.65384615, 9.42307692]])
I was hoping it would return:
array([[ 13.66666667, 13.66666667, 13.66666667],
[ 1.00000000, 1.00000000, 1.00000000],
[ 6.33333333, 6.33333333, 6.33333333],
[ 14.00000000, 14.00000000, 14.00000000],
[ 8.33333333, 8.33333333, 8.33333333]])
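To make explicit where these hoped-for numbers come from: I took each row total and split it evenly across the three columns. A minimal sketch of that calculation, with the counts retyped as a plain NumPy array (the names observed, row_totals and hoped_for are just mine for illustration):

```python
import numpy as np

# Observed counts, rows = Q7 answers, columns = Decrease / Increase / No change.
observed = np.array([
    [2.0, 18.0, 21.0],
    [0.0, 2.0, 1.0],
    [3.0, 11.0, 5.0],
    [4.0, 24.0, 14.0],
    [1.0, 16.0, 8.0],
])

n_cols = observed.shape[1]

# Total number of responses in each Q7 row, kept as a column vector.
row_totals = observed.sum(axis=1, keepdims=True)

# Split every row total evenly across the columns.
hoped_for = np.repeat(row_totals / n_cols, n_cols, axis=1)

print(hoped_for)
```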
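I have looked at the source code of the expected_freq function, since the documentation does not go into much detail, but I still don't understand why I'm not seeing what I expected.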