
I am trying to build a 2x2 contingency table as described in the following link:

Ad hoc 2x2 contingency tables SQL Server 2008 (I tried to understand the code there but could not follow it)

A loop constructs the pairs, e.g. C1,C1 C1,C2 C2,C1 C2,C2 (the Cartesian product).
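The pair-building loop could look roughly like the following sketch. The post does not say which language drives the loop, so Python and the variable names `concepts` and `pairs` are assumptions made for illustration.

```python
from itertools import product

# Sample concepts taken from the example table; the real list would come from the database.
concepts = ["C1", "C2"]

# Cartesian product of the concept list with itself: every ordered pair,
# including the same-concept pairs C1,C1 and C2,C2.
pairs = list(product(concepts, repeat=2))

for x, y in pairs:
    print(f"{x},{y}")
```

Each `(x, y)` tuple would then be handed to the SQL code as the two parameters.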

These pairs are passed as parameters to the SQL code. For this example, I gave the SQL code the pair --> C1,C1

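When the table is constructed for pairs of different concepts, such as C1,C2 or C2,C1, the results are correct (after some modifications explained below). When the pair is C1,C1 or C2,C2, it constructs a wrong contingency table.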

For example (the table name is alpha_occurence):

id   concept_uri   document_uri
1    C1            D1
2    C2            D1

From the table above, the 2x2 contingency table for the pair C1,C1 should be:

       C1     not C1
    C1  1     0
not C1  0     -

Instead it gives (after some modifications):

       C1    not C1
    C1  0    1
not C1  1    -

Note that I have used - for the (not C1, not C1) cell, because that value is calculated by a different method.

This SQL code is used to retrieve the values:

SELECT count(*) AS total FROM  
(SELECT document_uri,count(DISTINCT concept_uri) AS count_conc FROM mydb.alpha_occurence 
WHERE concept_uri IN ('C1','C1') 
GROUP BY document_uri 
HAVING count_conc >=2 ) 
AS amount_of_concept_co_occurence #value of both X and Y

UNION ALL 

SELECT count(*) AS total FROM 
(SELECT concept_uri,document_uri FROM mydb.alpha_occurence
WHERE concept_uri IN ('C1'))
AS only_concept_A #value of Only X not Y

UNION ALL 

SELECT count(*) AS total FROM
(SELECT concept_uri,document_uri FROM mydb.alpha_occurence 
WHERE concept_uri IN ('C1'))
AS only_concept_B #value of Not X only Y
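A minimal reproduction of the first subquery above may help show what it returns for a same-concept pair. This is only a sketch: it assumes an in-memory SQLite database as a stand-in for the MySQL database `mydb`, writes the `HAVING` condition with the aggregate itself rather than the `count_conc` alias, and uses a hypothetical helper named `co_occurrence`.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL database used in the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alpha_occurence (id INTEGER, concept_uri TEXT, document_uri TEXT)"
)
conn.executemany(
    "INSERT INTO alpha_occurence VALUES (?, ?, ?)",
    [(1, "C1", "D1"), (2, "C2", "D1")],  # sample rows from the post
)


def co_occurrence(x, y):
    """Count documents that contain at least 2 distinct concepts among (x, y)."""
    row = conn.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT document_uri
            FROM alpha_occurence
            WHERE concept_uri IN (?, ?)
            GROUP BY document_uri
            HAVING COUNT(DISTINCT concept_uri) >= 2
        ) AS t
        """,
        (x, y),
    ).fetchone()
    return row[0]


same_pair = co_occurrence("C1", "C1")   # both parameters are the same concept
mixed_pair = co_occurrence("C1", "C2")  # two different concepts
print(same_pair, mixed_pair)
```

On the sample rows the same-concept pair yields 0 and the mixed pair yields 1. With `IN ('C1','C1')` a document can match at most one distinct concept, so the `>= 2` condition is never met, which appears consistent with the 0 in the top-left cell of the wrong table.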

After the values are retrieved, a small script is run over them to correct them. It does the following:

To get Only X and not Y             = only_concept_A - amount_of_concept_co_occurence
To get Not X and Only Y             = only_concept_B - amount_of_concept_co_occurence
To get the value of neither X nor Y = total # of documents (which is not given here, as the sample data only shows which concepts occur in which document) - (amount_of_concept_co_occurence + Only X and not Y + Not X and Only Y)
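The correction steps above can be sketched as a small function. The script's actual language is not stated in the post, so Python, the function name `contingency_cells`, and its parameter names are assumptions. The `total_documents` value in the usage lines is also an assumption (the sample data contains a single document, D1).

```python
def contingency_cells(co_occurrence, total_a, total_b, total_documents):
    """Turn the three query totals into the four cells of a 2x2 table.

    co_occurrence   -- documents containing both X and Y (first query)
    total_a         -- rows matching concept X (second query)
    total_b         -- rows matching concept Y (third query)
    total_documents -- total number of documents (not produced by the queries)
    """
    only_x = total_a - co_occurrence
    only_y = total_b - co_occurrence
    neither = total_documents - (co_occurrence + only_x + only_y)
    return {
        "both": co_occurrence,
        "only_x": only_x,
        "only_y": only_y,
        "neither": neither,
    }


# Pair C1,C2 on the sample data: one shared document, one row per concept.
# total_documents=1 is assumed from the single document D1 in the sample.
cells = contingency_cells(co_occurrence=1, total_a=1, total_b=1, total_documents=1)
print(cells)
```

Feeding in the values the queries return for C1,C1 (a co-occurrence of 0 with totals of 1 and 1) gives `only_x = 1` and `only_y = 1`, which matches the wrong table shown earlier.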

1 Answer


I used this script:

select concept_uri, document_uri, count(*) as count 
from table
group by concept_uri, document_uri

With that, the values are ready.
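For reference, here is a sketch of what the grouped query in this answer returns on the sample rows from the question. It assumes an in-memory SQLite database in place of MySQL, substitutes the question's table name `alpha_occurence` for the placeholder `table`, renames the `count` alias to `n`, and adds an `ORDER BY` so the output order is deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alpha_occurence (id INTEGER, concept_uri TEXT, document_uri TEXT)"
)
conn.executemany(
    "INSERT INTO alpha_occurence VALUES (?, ?, ?)",
    [(1, "C1", "D1"), (2, "C2", "D1")],  # sample rows from the question
)

# One row per (concept, document) combination with its occurrence count.
rows = conn.execute(
    """
    SELECT concept_uri, document_uri, COUNT(*) AS n
    FROM alpha_occurence
    GROUP BY concept_uri, document_uri
    ORDER BY concept_uri, document_uri
    """
).fetchall()

for concept_uri, document_uri, n in rows:
    print(concept_uri, document_uri, n)
```

The per-pair contingency cells would still have to be derived from these grouped counts in a later step, which the answer does not show.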

answered 2017-02-12T16:42:17.450