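The input to a logistic regression consists of a unique identifying number followed by several binary variables (always 1 or 0), each indicating whether a person meets a given criterion. The query below lists a few of these binary conditions. With only four such criteria, it takes longer to run than I would expect. Is there a more efficient approach than the one below? Note: tblicd is a large lookup table with text descriptions for 15k+ rows. The query has no real meaning; it is just a proof of concept. I have the proper indexes on my composite keys.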

select  patient.patientid 
,case when exists
(
    select c.patientid from tblclaims as c
    inner join patient as p on p.patientid=c.patientid
    and c.admissiondate = p.admissiondate
    and c.dischargedate = p.dischargedate
    where patient.patientid = p.patientid
    group by c.patientid
    having count(*) > 1000
    )
    then '1' else '0'
    end as moreThan1000
,case when exists
(
    select c.patientid from tblclaims as c
    inner join patient as p on p.patientid=c.patientid
    and c.admissiondate = p.admissiondate
    and c.dischargedate = p.dischargedate
    where patient.patientid = p.patientid
    group by c.patientid
    having count(*) > 1500
    )
    then '1' else '0'
    end as moreThan1500
,case when exists
(
    select distinct picd.patientid from patienticd as picd
    inner join patient as p on p.patientid= picd.patientid
    and picd.admissiondate = p.admissiondate
    and picd.dischargedate = p.dischargedate
    inner join tblicd as t on t.icd_id = picd.icd_id
    where t.descrip like '%diabetes%' and patient.patientid = picd.patientid
    )
    then '1' else '0'
    end as diabetes
,case when exists
(
    select r.patientid, count(*) from patient as r
    where r.patientid = patient.patientid
    group by r.patientid
    having count(*) >1
    ) 
    then '1' else '0'
    end 


from patient
order by moreThan1000 desc

2 Answers

I would start by moving the subqueries into the FROM clause:

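select p.patientid,  -- take the id from patient: q.patientid is NULL when no claims match
       coalesce(q.moreThan1000, 0) as moreThan1000,
       coalesce(q.moreThan1500, 0) as moreThan1500,
       (case when d.patientid is not null then 1 else 0 end) as diabetes,
       (case when pc.patientid is not null then 1 else 0 end)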
from patient p left outer join
     (select c.patientid,
             (case when count(*) > 1000 then 1 else 0 end) as moreThan1000,
             (case when count(*) > 1500 then 1 else 0 end) as moreThan1500
      from tblclaims as c inner join
           patient as p
           on p.patientid=c.patientid and
              c.admissiondate = p.admissiondate and
              c.dischargedate = p.dischargedate
      group by c.patientid
     ) q
     on p.patientid = q.patientid left outer join
     (select distinct picd.patientid
      from patienticd as picd inner join
           patient as p
           on p.patientid= picd.patientid and
              picd.admissiondate = p.admissiondate and
              picd.dischargedate = p.dischargedate inner join
          tblicd as t
          on t.icd_id = picd.icd_id
      where t.descrip like '%diabetes%'
     ) d
     on p.patientid = d.patientid left outer join
     (select r.patientid, count(*) as cnt
      from patient as r
      group by r.patientid
      having count(*) >1
     ) pc
     on p.patientid = pc.patientid
order by 2 desc

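You can then probably simplify these subqueries further by combining them (for example, "p" and "pc" in the outer query can be merged into one). Even without that step, removing the correlated subqueries should make the query easier for SQL Server to optimize.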
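
As a rough check that the rewrite is equivalent, here is a minimal sketch that runs both forms against a toy data set. It makes several assumptions that are not in the question: SQLite (via Python's sqlite3) stands in for SQL Server, the rows are invented, and the thresholds are scaled down to 1 and 2 so that three patients are enough. It says nothing about SQL Server's actual execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table patient   (patientid int, admissiondate text, dischargedate text);
    create table tblclaims (patientid int, admissiondate text, dischargedate text);
    insert into patient values (1, '2012-01-01', '2012-01-05'),
                               (2, '2012-02-01', '2012-02-03'),
                               (3, '2012-03-01', '2012-03-02');
    -- invented data: patient 1 has 3 matching claims, patient 2 has 2, patient 3 has none
    insert into tblclaims values (1, '2012-01-01', '2012-01-05'),
                                 (1, '2012-01-01', '2012-01-05'),
                                 (1, '2012-01-01', '2012-01-05'),
                                 (2, '2012-02-01', '2012-02-03'),
                                 (2, '2012-02-01', '2012-02-03');
""")

# Correlated form, as in the question: one EXISTS subquery per flag.
correlated_sql = """
    select patient.patientid,
           case when exists (select 1 from tblclaims c
                             join patient p on p.patientid = c.patientid
                              and c.admissiondate = p.admissiondate
                              and c.dischargedate = p.dischargedate
                             where p.patientid = patient.patientid
                             group by c.patientid having count(*) > 1)
                then 1 else 0 end as moreThan1,
           case when exists (select 1 from tblclaims c
                             join patient p on p.patientid = c.patientid
                              and c.admissiondate = p.admissiondate
                              and c.dischargedate = p.dischargedate
                             where p.patientid = patient.patientid
                             group by c.patientid having count(*) > 2)
                then 1 else 0 end as moreThan2
    from patient
    order by patient.patientid
"""

# Derived-table form: claims are aggregated once and both flags come from that one pass.
joined_sql = """
    select p.patientid,
           coalesce(q.moreThan1, 0) as moreThan1,
           coalesce(q.moreThan2, 0) as moreThan2
    from patient p
    left join (select c.patientid,
                      case when count(*) > 1 then 1 else 0 end as moreThan1,
                      case when count(*) > 2 then 1 else 0 end as moreThan2
               from tblclaims c
               join patient p2 on p2.patientid = c.patientid
                and c.admissiondate = p2.admissiondate
                and c.dischargedate = p2.dischargedate
               group by c.patientid) q
      on q.patientid = p.patientid
    order by p.patientid
"""

rows_correlated = conn.execute(correlated_sql).fetchall()
rows_joined = conn.execute(joined_sql).fetchall()
print(rows_joined)
```

Both queries return the same flags here. The coalesce matters for patient 3, who has no claims and would otherwise get NULL instead of 0 from the left join.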

answered 2012-07-24T13:43:27.437
Here is the left join example that was requested...

SELECT
    patient.patientid,
    ISNULL(CondA.ConditionA,0) as IsConditionA,
    ISNULL(CondB.ConditionB,0) as IsConditionB,
    ....
FROM
    patient
        LEFT JOIN
    (SELECT DISTINCT patientid, 1 as ConditionA from ... where ... ) CondA
        ON patient.patientid = CondA.patientID
        LEFT JOIN
    (SELECT DISTINCT patientid, 1 as ConditionB from ... where ... ) CondB
        ON patient.patientid = CondB.patientID

If each of your Condition queries returns at most one row per patientid, you can drop the DISTINCT and simplify them to

    (SELECT patientid, 1 as ConditionA from ... where ... ) CondA
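
To illustrate why the DISTINCT matters when a condition query can return several rows per patient, here is a small sketch using the diabetes condition from the question. The assumptions are mine: SQLite (via Python's sqlite3) stands in for SQL Server, the rows and ICD descriptions are invented, the admission and discharge date columns are left out for brevity, and coalesce replaces ISNULL because SQLite has no ISNULL function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table patient    (patientid int);
    create table patienticd (patientid int, icd_id int);
    create table tblicd     (icd_id int, descrip text);
    insert into patient values (1), (2);
    insert into tblicd values (10, 'type 2 diabetes mellitus'),
                              (11, 'diabetes insipidus'),
                              (20, 'fractured wrist');
    -- invented data: patient 1 matches two diabetes codes, patient 2 matches none
    insert into patienticd values (1, 10), (1, 11), (2, 20);
""")

# The same flag query, with and without DISTINCT in the condition subquery.
template = """
    select patient.patientid, coalesce(CondA.ConditionA, 0) as IsConditionA
    from patient
    left join (select {distinct} picd.patientid, 1 as ConditionA
               from patienticd picd
               join tblicd t on t.icd_id = picd.icd_id
               where t.descrip like '%diabetes%') CondA
      on patient.patientid = CondA.patientid
    order by patient.patientid
"""

with_distinct = conn.execute(template.format(distinct="distinct")).fetchall()
without_distinct = conn.execute(template.format(distinct="")).fetchall()

print(with_distinct)     # one row per patient
print(without_distinct)  # patient 1 appears once per matching ICD code
```

Without DISTINCT, patient 1 is duplicated in the output because two ICD rows match, so the simplified form is only safe when the condition query is known to produce at most one row per patient.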
answered 2012-07-24T13:52:08.863