我试图从我对示例数据的以下 SQL Server 查询中找到卡方检验:
SELECT sessionnumber, sessioncount, timespent, expected, dev, dev*dev/expected as chi_square
FROM (SELECT clusters.sessionnumber, clusters.sessioncount, clusters.timespent,
(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as expected,
clusters.cnt-(dim1.cnt * dim2.cnt * dim3.cnt)/(dimall.cnt*dimall.cnt) as dev
FROM clusters JOIN
(SELECT sessionnumber, SUM(cnt) as cnt FROM clusters
GROUP BY sessionnumber) dim1 ON clusters.sessionnumber = dim1.sessionnumber JOIN
(SELECT sessioncount, SUM(cnt) as cnt FROM clusters
GROUP BY sessioncount) dim2 ON clusters.sessioncount = dim2.sessioncount JOIN
(SELECT timespent, SUM(cnt) as cnt FROM clusters
GROUP BY timespent) dim3 ON clusters.timespent = dim3.timespent CROSS JOIN
(SELECT SUM(cnt) as cnt FROM clusters) dimall) a
我的表有这种样本数据:
sessionnumber sessioncount timespent cnt
1 17 28 NULL
2 22 8 NULL
3 1 1 NULL
4 1 1 NULL
5 8 111 NULL
6 8 65 NULL
7 11 5 NULL
8 1 1 NULL
9 62 64 NULL
10 6 42 NULL
问题是这个查询工作正常,但它给出了错误的输出,或者你可以说根本没有输出。它给我的输出是这样的:
sessionnumber sessioncount timespent expected dev chi_square
1 17 28 NULL NULL NUL
2 22 8 NULL NULL NULL
3 1 1 NULL NULL NULL
4 1 1 NULL NULL NULL
5 8 111 NULL NULL NULL
6 8 65 NULL NULL NULL
7 11 5 NULL NULL NULL
8 1 1 NULL NULL NULL
9 62 64 NULL NULL NULL
10 6 42 NULL NULL NULL
我怎么能摆脱这个问题,因为我已经尽力了!提前感谢告诉我我做错了什么!