
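I have written a monster query. I am sure it can be optimized, and I would greatly appreciate any comments or guidance on the query itself; however, I have one specific question: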

The data I get back is sometimes duplicated across multiple columns:

+-------+------+----------+------+-------+--------+----------+-------+------+
| first | last |  deaID   | cert | count |  npi   | clientid | month | year |
+-------+------+----------+------+-------+--------+----------+-------+------+
| Alex  | Jue  | UNKNOWN  | MD   |    11 | 123123 |   102889 |     7 | 2012 |
| Alex  | Jue  | BJ123123 | MD   |    11 | 123123 |   102889 |     7 | 2012 |
+-------+------+----------+------+-------+--------+----------+-------+------+

As you can see, all fields are equal except deaID.

In this case, I would like to return only:

+-------+------+----------+------+-------+--------+----------+-------+------+
| first | last |  deaID   | cert | count |  npi   | clientid | month | year |
+-------+------+----------+------+-------+--------+----------+-------+------+
| Alex  | Jue  | BJ123123 | MD   |    11 | 123123 |   102889 |     7 | 2012 |
+-------+------+----------+------+-------+--------+----------+-------+------+

However, if there is no duplicate:

+-------+------+---------+------+-------+--------+----------+-------+------+
| first | last |  deaID  | cert | count |  npi   | clientid | month | year |
+-------+------+---------+------+-------+--------+----------+-------+------+
| Alex  | Jue  | UNKNOWN | MD   |    11 | 123123 |   102889 |     7 | 2012 |
+-------+------+---------+------+-------+--------+----------+-------+------+

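then I would like to keep it!

Summary: if there are duplicates, remove every record where deaID = 'UNKNOWN'; but if there is only one matching row, return that row.

Question: how do I return the UNKNOWN record when it is the only match?

Here is the monster query, in case anyone is interested :)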

with ctebiggie  as (

select distinct
p.[IMS_PRESCRIBER_ID],
p.PHYSICIAN_NPI as MLISNPI,
a.CLIENT_ID,
p.MLIS_FIRSTNAME,
p.MLIS_LASTNAME,
p_address.IMS_DEA_NBR,
p.IMS_PROFESSIONAL_ID_NBR,
p.IMS_PROFESSIONAL_ID_NBR_src,
p.IMS_CERTIFICATION_CODE,
datepart(mm,a.RECEIVED_DATE) as [Month],
datepart(yyyy,a.RECEIVED_DATE) as [Year]

from

MILLENNIUM_DW_dev..D_PHYSICIAN p
left outer join
MILLENNIUM_DW_dev..F_ACCESSION_DAILY a
on a.REQUESTOR_NPI=p.PHYSICIAN_NPI
left outer join MILLENNIUM_DW_dev..D_PHYSICIAN_ADDRESS p_address
on p.PHYSICIAN_NPI=p_address.PHYSICIAN_NPI

where 
a.RECEIVED_DATE is not null
--and p.IMS_PRESCRIBER_ID is not null
--and p_address.IMS_DEA_NBR !='UNKNOWN'
and p.REC_ACTIVE_FLG=1
and p_address.REC_ACTIVE_FLG=1
and DATEPART(yyyy,received_date)=2012
  and DATEPART(mm,received_date)=7


group by 
p.[IMS_PRESCRIBER_ID],
p.PHYSICIAN_NPI,
p.IMS_PROFESSIONAL_ID_NBR,
p.MLIS_FIRSTNAME,
p.MLIS_LASTNAME,
p_address.IMS_DEA_NBR,
p.IMS_PROFESSIONAL_ID_NBR,
p.IMS_PROFESSIONAL_ID_NBR_src,
p.IMS_CERTIFICATION_CODE,
datepart(mm,a.RECEIVED_DATE),
datepart(yyyy,a.RECEIVED_DATE),
a.CLIENT_ID

)
,
ctecount as 
(select
 COUNT (Distinct f.ACCESSION_ID) [count],
 f.REQUESTOR_NPI,f.CLIENT_ID,
 datepart(mm,f.RECEIVED_DATE) mm,
datepart(yyyy,f.RECEIVED_DATE)yyyy
from MILLENNIUM_DW_dev..F_ACCESSION_DAILY f

where 
 f.CLIENT_ID not in (select * from SalesDWH..TestPractices)

 and DATEPART(yyyy,f.received_date)=2012
  and DATEPART(mm,f.received_date)=7


group by f.REQUESTOR_NPI,
f.CLIENT_ID,
datepart(mm,f.RECEIVED_DATE),
datepart(yyyy,f.RECEIVED_DATE)
)

select ctebiggie.*,c.* from
ctebiggie
full outer join
ctecount c
on c.REQUESTOR_NPI=ctebiggie.MLISNPI
and c.mm=ctebiggie.[Month]
and c.yyyy=ctebiggie.[Year]
and c.CLIENT_ID=ctebiggie.CLIENT_ID

2 Answers


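Assuming you already have the base query, I would compute a ROW_NUMBER and two counts over its result set using partitioned window functions. Then, in the outer select, keep an UNKNOWN row only when every row in its group is UNKNOWN; otherwise keep only the rows with a known deaID.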

SELECT first,
       last,
       deaID,
       cert,
       count,
       npi,
       clientid,
       month,
       year
  FROM (
         SELECT first,
                last,
                deaID,
                cert,
                count,
                npi,
                clientid,
                month,
                year,
                ROW_NUMBER() OVER (PARTITION BY
                                     first,last,cert,count,npi,clientid,month,year 
                                    -- UNKNOWN rows sort first, so they get the lowest row numbers
                                    ORDER BY CASE WHEN deaID = 'UNKNOWN' THEN 0 ELSE 1 END,
                                       deaID) AS RowNumberInGroup,
                COUNT(*) OVER (PARTITION BY first,last,cert,count,npi,clientid,month,year)
                    AS CountPerGroup,
                 SUM(CASE WHEN deaID = 'UNKNOWN' THEN 1 ELSE 0 END) 
                     OVER (PARTITION BY first,last,cert,count,npi,clientid,month,year)
                     AS UnknownCountPerGroup
           FROM BaseQuery
      ) T
 WHERE (T.CountPerGroup = T.UnknownCountPerGroup AND T.RowNumberInGroup = 1) OR T.RowNumberInGroup > T.UnknownCountPerGroup
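
To make the filter above easier to check, here is a minimal, self-contained sketch of the same idea. It is only an illustration under stated assumptions: it runs on SQLite through Python's `sqlite3` module rather than on SQL Server (window functions need SQLite 3.25 or later), the table is cut down to a few hypothetical columns (`first_name`, `last_name`, `npi`), and the second physician is made-up sample data.

```python
import sqlite3

# Hypothetical, simplified stand-in for the real base query result.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BaseQuery (first_name TEXT, last_name TEXT, deaID TEXT, npi INTEGER)"
)
conn.executemany(
    "INSERT INTO BaseQuery VALUES (?, ?, ?, ?)",
    [
        ("Alex", "Jue", "UNKNOWN", 123123),   # duplicate group: UNKNOWN should be dropped
        ("Alex", "Jue", "BJ123123", 123123),
        ("Sam", "Roe", "UNKNOWN", 456456),    # lone UNKNOWN: should be kept
    ],
)

query = """
SELECT first_name, last_name, deaID, npi
  FROM (
         SELECT first_name, last_name, deaID, npi,
                -- UNKNOWN rows sort first, so they take row numbers 1..UnknownCountPerGroup
                ROW_NUMBER() OVER (PARTITION BY first_name, last_name, npi
                                   ORDER BY CASE WHEN deaID = 'UNKNOWN' THEN 0 ELSE 1 END,
                                            deaID) AS RowNumberInGroup,
                COUNT(*) OVER (PARTITION BY first_name, last_name, npi) AS CountPerGroup,
                SUM(CASE WHEN deaID = 'UNKNOWN' THEN 1 ELSE 0 END)
                    OVER (PARTITION BY first_name, last_name, npi) AS UnknownCountPerGroup
           FROM BaseQuery
       ) T
 WHERE (T.CountPerGroup = T.UnknownCountPerGroup AND T.RowNumberInGroup = 1)
    OR T.RowNumberInGroup > T.UnknownCountPerGroup
 ORDER BY first_name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the UNKNOWN rows are numbered first, `RowNumberInGroup > UnknownCountPerGroup` selects exactly the rows with a known deaID, and the first half of the `WHERE` clause covers groups that contain nothing but UNKNOWN rows.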
answered 2012-09-07T17:30:46.313

See if this helps:

select distinct main.col1,main.col2  ,
       isnull(( select col3 from table1 where table1.col1=main.col1
       and table1.col2=main.col2 and col3 <>'UNKNOWN'),'UNKNOWN')
from   table1 main

Example on SQL Fiddle

Or the version adapted to your columns would be:

SELECT distinct first,
       last,
       cert,
       count,
       npi,
       clientid,
       month,
       year,
      -- pick one known deaID for the group, falling back to 'UNKNOWN'
      isnull((
      select top 1 intable.deaID from table1 intable where 
      intable.first=maintable.first and
      intable.last=maintable.last and
      intable.cert=maintable.cert and
      intable.npi=maintable.npi and
      intable.clientid=maintable.clientid and
      intable.month=maintable.month and
      intable.year=maintable.year and
      intable.deaID<>'UNKNOWN'),'UNKNOWN') as deaID
FROM  table1 maintable
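
The correlated-subquery approach can be tried out with a small sketch like the one below. The assumptions are the same kind as before and are mine, not the asker's: it runs on SQLite through Python's `sqlite3` module, so `COALESCE` and `LIMIT 1` stand in for T-SQL's `ISNULL` and `TOP 1`, and the table is reduced to hypothetical columns (`first_name`, `npi`, `deaID`) with made-up sample rows.

```python
import sqlite3

# Hypothetical, simplified stand-in for table1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (first_name TEXT, npi INTEGER, deaID TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("Alex", 123123, "UNKNOWN"),
        ("Alex", 123123, "BJ123123"),
        ("Sam", 456456, "UNKNOWN"),
    ],
)

query = """
SELECT DISTINCT maintable.first_name,
       maintable.npi,
       -- look up one known deaID for the same group; fall back to 'UNKNOWN'
       COALESCE((SELECT intable.deaID
                   FROM table1 intable
                  WHERE intable.first_name = maintable.first_name
                    AND intable.npi = maintable.npi
                    AND intable.deaID <> 'UNKNOWN'
                  ORDER BY intable.deaID
                  LIMIT 1), 'UNKNOWN') AS deaID
  FROM table1 maintable
 ORDER BY maintable.first_name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One difference worth keeping in mind: this form returns a single deaID per group, so a physician with two different known deaIDs collapses to one row, whereas the window-function approach keeps every known row.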
answered 2012-09-07T17:30:08.683