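This is what is known as a set-within-sets problem. I think the best way to find advertisements that match any of the categories is the following: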
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
sum(case when categoryid = 'C' then 1 else 0 end) > 0
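In other words, the query aggregates by advertisementid and tests each category value with its own comparison. Each sum() expression counts the rows in the group that have that category, and the or operators mean that any one of the conditions being true is enough.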
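To make the counting concrete, here is a minimal sketch that runs the query against an in-memory SQLite database from Python. The sample rows are invented for illustration (the table name t and the column names come from the query above; everything else is an assumption), and other database engines may differ in details such as case sensitivity.

```python
import sqlite3

# Hypothetical sample data: table t holds (advertisementid, categoryid) pairs.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (advertisementid integer, categoryid text)")
conn.executemany(
    "insert into t values (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "D"), (3, "D"),
     (4, "A"), (4, "B"), (4, "C"), (5, "C"), (5, "C")],
)

# The per-group counts that the having clause compares against 0.
counts = conn.execute("""
    select advertisementid,
           sum(case when categoryid = 'A' then 1 else 0 end) as a_rows,
           sum(case when categoryid = 'B' then 1 else 0 end) as b_rows,
           sum(case when categoryid = 'C' then 1 else 0 end) as c_rows
    from t
    group by advertisementid
    order by advertisementid
""").fetchall()

# The query from the text: keep a group when any of the counts is positive.
any_match = [row[0] for row in conn.execute("""
    select advertisementid
    from t
    group by advertisementid
    having sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
           sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
           sum(case when categoryid = 'C' then 1 else 0 end) > 0
    order by advertisementid
""")]

for row in counts:
    print(row)
print(any_match)
```

With this data, advertisement 3 has only category D, so all three of its counts are 0 and it is the only group filtered out.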
For the subset relationship, I add one more clause that counts the non-matching rows and requires that there be none:
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having (sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
sum(case when categoryid = 'C' then 1 else 0 end) > 0
) and
sum(case when categoryid in ('A', 'B', 'C') then 0 else 1 end) = 0
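As a rough check of the subset version, the sketch below runs it in SQLite from Python against invented sample rows (the data is an assumption, not taken from the original question). It also prints the non-match count per group so the effect of the extra clause is visible.

```python
import sqlite3

# Hypothetical sample data: table t holds (advertisementid, categoryid) pairs.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (advertisementid integer, categoryid text)")
conn.executemany(
    "insert into t values (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "D"), (3, "D"),
     (4, "A"), (4, "B"), (4, "C"), (5, "C"), (5, "C")],
)

# Number of rows per group whose category is outside ('A', 'B', 'C').
non_matches = conn.execute("""
    select advertisementid,
           sum(case when categoryid in ('A', 'B', 'C') then 0 else 1 end)
    from t
    group by advertisementid
    order by advertisementid
""").fetchall()

# Subset query: at least one match and no rows outside the target set.
subset_ids = [row[0] for row in conn.execute("""
    select advertisementid
    from t
    group by advertisementid
    having (sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
            sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
            sum(case when categoryid = 'C' then 1 else 0 end) > 0
           ) and
           sum(case when categoryid in ('A', 'B', 'C') then 0 else 1 end) = 0
    order by advertisementid
""")]

print(non_matches)
print(subset_ids)
```

Advertisements 2 and 3 each have one row in category D, so the extra clause removes them even though advertisement 2 also matches A.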
I like this approach because it is expressive. If we change each or to and, we require all three categories:
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having sum(case when categoryid = 'A' then 1 else 0 end) > 0 and
sum(case when categoryid = 'B' then 1 else 0 end) > 0 and
sum(case when categoryid = 'C' then 1 else 0 end) > 0
If we want at least two matches, we can add count(distinct):
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having (sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
sum(case when categoryid = 'C' then 1 else 0 end) > 0
) and
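-- count only categories A, B, C; any other category yields null and is ignored
count(distinct case when categoryid in ('A', 'B', 'C') then categoryid end) >= 2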
And so on.
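The last two variations can be tried the same way. The sketch below uses invented sample rows in an in-memory SQLite database (the data is an assumption). One point it illustrates: count(distinct categoryid) on its own counts every distinct category in the group, including ones outside A, B, and C, whereas wrapping the column in a case expression counts only the target categories.

```python
import sqlite3

# Hypothetical sample data: table t holds (advertisementid, categoryid) pairs.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (advertisementid integer, categoryid text)")
conn.executemany(
    "insert into t values (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "D"), (3, "D"),
     (4, "A"), (4, "B"), (4, "C"), (5, "C"), (5, "C")],
)

def ids_having(condition):
    """Return the advertisement ids whose group satisfies the having condition."""
    sql = f"""
        select advertisementid
        from t
        group by advertisementid
        having {condition}
        order by advertisementid
    """
    return [row[0] for row in conn.execute(sql)]

has_a = "sum(case when categoryid = 'A' then 1 else 0 end) > 0"
has_b = "sum(case when categoryid = 'B' then 1 else 0 end) > 0"
has_c = "sum(case when categoryid = 'C' then 1 else 0 end) > 0"

# All three categories must be present.
all_three = ids_having(f"{has_a} and {has_b} and {has_c}")

# At least one match plus at least two distinct categories of any kind.
two_any = ids_having(
    f"({has_a} or {has_b} or {has_c}) and count(distinct categoryid) >= 2"
)

# At least two distinct categories among A, B, C only.
two_targets = ids_having(
    "count(distinct case when categoryid in ('A', 'B', 'C') "
    "then categoryid end) >= 2"
)

print(all_three, two_any, two_targets)
```

Advertisement 2 (categories A and D) passes the unrestricted count but not the restricted one, which is the difference to keep in mind when "two matches" is meant to refer only to the listed categories.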