I have been trying to work out a chi-squared test in SQL Server 2008 R2 Developer Edition and need help. The query itself runs fine on the following set of sample data:
sessionnumber  sessioncount  timespent  cnt
1              17            28         45
2              22            8          30
3              1             1          2
4              1             1          2
5              8             111        119
6              8             65         73
7              11            5          16
8              1             1          2
9              62            64         126
10             6             42         48
The query I have been trying is:
SELECT sessionnumber, sessioncount, timespent, expected, dev,
       dev*dev/cast(expected as float) as chi_square
FROM (SELECT d3.sessionnumber, d3.sessioncount, d3.timespent,
             (dim1.cnt * dim2.cnt * dim3.cnt)/cast((dimall.cnt*dimall.cnt) as float) as expected,
             d3.cnt-(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as dev
      FROM d3 JOIN
           (SELECT sessionnumber, SUM(cast(cnt as float)) as cnt FROM d3
            GROUP BY sessionnumber) dim1
           ON d3.sessionnumber = dim1.sessionnumber JOIN
           (SELECT sessioncount, SUM(cast(cnt as float)) as cnt FROM d3
            GROUP BY sessioncount) dim2
           ON d3.sessioncount = dim2.sessioncount JOIN
           (SELECT timespent, SUM(cast(cnt as float)) as cnt FROM d3
            GROUP BY timespent) dim3
           ON d3.timespent = dim3.timespent CROSS JOIN
           (SELECT SUM(cast(cnt as float)) as cnt FROM d3) dimall) a
The results this query produces are wrong. They are:
sessionnumber  sessioncount  timespent  expected              dev               chi_square
1              17            28         2.37921034130308E-09  44.9999999976208  851122729517.387
2              22            8          1.72099699796333E-10  29.9999999998279  5229526844351.02
3              1             1          1.3008335197251E-11   1.99999999998699  307495151323.689
4              1             1          1.3008335197251E-11   1.99999999998699  307495151323.689
5              8             111        1.90995107994937E-07  118.999999809005  74143260019.6156
6              8             65         5.09110109296227E-09  72.9999999949089  1046728379961.52
7              11            5          5.36406353430159E-11  15.9999999999464  4772501264409.71
8              1             1          1.3008335197251E-11   1.99999999998699  307495151323.689
9              62            64         6.56781317803123E-09  125.999999993432  2417242934291.85
10             6             42         1.41737398829092E-09  47.9999999985826  1625541331291.19
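The correct chi-square value for sessionnumber 1 and sessionnumber 2 should be 9.117, but my query gives a wrong result. (This value is only an example computed from the first two sessionnumber rows, but it is correct.) I have spent the past 3 days searching for an answer, and in the end all I have established is that something is wrong with this query.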
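For reference, the 9.117 figure can be reproduced if the data is read as a two-column contingency table: one row per sessionnumber, with sessioncount and timespent as the two observed cells and cnt as their row total. That reading is an assumption, although it matches the sample data, where cnt always equals sessioncount + timespent. Under it, the expected value of a cell is row_total * col_total / grand_total, which for rows 1 and 2 gives 23.4, 21.6, 15.6 and 14.4, and the four (observed - expected)^2 / expected terms sum to about 9.117. A minimal T-SQL sketch of that calculation follows; the CTE names cells and totals and the column alias col are made up for illustration.

```sql
-- Sketch only. Assumes d3 holds one row per sessionnumber and that
-- sessioncount and timespent are the two observed cells of that row.
WITH cells AS (
    -- Unpivot the two observed columns into one row per cell.
    SELECT sessionnumber, 'sessioncount' AS col,
           CAST(sessioncount AS float) AS observed
    FROM d3
    WHERE sessionnumber IN (1, 2)
    UNION ALL
    SELECT sessionnumber, 'timespent',
           CAST(timespent AS float)
    FROM d3
    WHERE sessionnumber IN (1, 2)
),
totals AS (
    -- Attach the row, column and grand totals to every cell.
    SELECT sessionnumber, col, observed,
           SUM(observed) OVER (PARTITION BY sessionnumber) AS row_total,
           SUM(observed) OVER (PARTITION BY col)           AS col_total,
           SUM(observed) OVER ()                           AS grand_total
    FROM cells
)
-- Sum (observed - expected)^2 / expected over all cells,
-- where expected = row_total * col_total / grand_total.
SELECT SUM(SQUARE(observed - row_total * col_total / grand_total)
           / (row_total * col_total / grand_total)) AS chi_square
FROM totals;
```

Removing the two WHERE clauses would compute the statistic over all ten rows instead of just the first two. Note that this treats the columns as cells of a table, whereas the query above joins on sessioncount and timespent as if their values were category labels.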
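Could someone please help? I would really appreciate it! (I will also put a bounty on this question in 2 days.) Thanks in advance. I only have basic knowledge of SQL queries, since I am new to it and have been using it for only about 3 months, so I really need some help!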