This is what is called a set-within-sets problem. I think the best way to find the ads that match any of the categories is the following:
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
sum(case when categoryid = 'C' then 1 else 0 end) > 0
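In other words, this aggregates by advertisementid and tests each category value separately. Each sum() expression counts the rows in which that category is present, and the or says that at least one of those counts must be greater than zero.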
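To try the query, here is a minimal sketch with made-up sample data. The table and column names come from the query above; the column types and the rows are assumptions chosen only for illustration, and an order by is added so the output order is deterministic.

```sql
-- Hypothetical sample table; the column types are an assumption.
create table t (
    ADVERTISEMENTID int,
    categoryid      varchar(10)
);

-- Made-up rows: ad 1 has A and B, ad 2 has A and D, ad 3 has only D,
-- ad 4 has A, B and C, ad 5 has only B.
insert into t (ADVERTISEMENTID, categoryid) values (1, 'A');
insert into t (ADVERTISEMENTID, categoryid) values (1, 'B');
insert into t (ADVERTISEMENTID, categoryid) values (2, 'A');
insert into t (ADVERTISEMENTID, categoryid) values (2, 'D');
insert into t (ADVERTISEMENTID, categoryid) values (3, 'D');
insert into t (ADVERTISEMENTID, categoryid) values (4, 'A');
insert into t (ADVERTISEMENTID, categoryid) values (4, 'B');
insert into t (ADVERTISEMENTID, categoryid) values (4, 'C');
insert into t (ADVERTISEMENTID, categoryid) values (5, 'B');

-- Ads that have at least one of A, B, C.
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
       sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
       sum(case when categoryid = 'C' then 1 else 0 end) > 0
order by ADVERTISEMENTID;
-- Returns ads 1, 2, 4 and 5; ad 3 is excluded because its only category is D.
```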
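For a subset relationship, where the ad may have only categories from the list, I would add one more clause that counts the non-matching rows and requires that there are none: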
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having (sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
sum(case when categoryid = 'C' then 1 else 0 end) > 0
) and
sum(case when categoryid in ('A', 'B', 'C') then 0 else 1 end) = 0
I like this approach because it is expressive. If we change the or to and, then we require all three categories:
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having sum(case when categoryid = 'A' then 1 else 0 end) > 0 and
sum(case when categoryid = 'B' then 1 else 0 end) > 0 and
sum(case when categoryid = 'C' then 1 else 0 end) > 0
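When the category list grows, writing one sum() per category gets long. A possible shorter variant, offered as a sketch rather than as the form used above, counts the distinct listed categories and compares that count with the size of the list. It relies on count(distinct ...) ignoring the NULL that the case expression produces for categories outside the list.

```sql
-- Ads that have all three of A, B, C.
-- The case expression yields NULL for any other category,
-- and count(distinct ...) does not count NULLs.
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having count(distinct case when categoryid in ('A', 'B', 'C')
                           then categoryid end) = 3;
```

The constant 3 has to be kept in step with the number of values in the in list.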
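If we want at least two matches, we can add a count(distinct). It should count only the listed categories, because a plain count(distinct categoryid) would also count categories outside the list, so an ad with only A and D would pass:
select ADVERTISEMENTID
from t
group by ADVERTISEMENTID
having (sum(case when categoryid = 'A' then 1 else 0 end) > 0 or
sum(case when categoryid = 'B' then 1 else 0 end) > 0 or
sum(case when categoryid = 'C' then 1 else 0 end) > 0
) and
count(distinct case when categoryid in ('A', 'B', 'C') then categoryid end) >= 2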
And so on.
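When adapting these conditions, it can help to look at the aggregate values themselves before filtering on them. The following sketch moves the expressions into the select list; the column aliases (matching_rows, other_rows, distinct_matches) are hypothetical names introduced here for illustration.

```sql
-- One row per ad, showing the values the having conditions are built from.
select ADVERTISEMENTID,
       -- rows whose category is in the list
       sum(case when categoryid in ('A', 'B', 'C') then 1 else 0 end) as matching_rows,
       -- rows whose category is outside the list (or NULL)
       sum(case when categoryid in ('A', 'B', 'C') then 0 else 1 end) as other_rows,
       -- how many different listed categories the ad has
       count(distinct case when categoryid in ('A', 'B', 'C')
                           then categoryid end) as distinct_matches
from t
group by ADVERTISEMENTID
order by ADVERTISEMENTID;
```

In terms of these columns, "any category" is distinct_matches >= 1, the subset case adds other_rows = 0, "all three" is distinct_matches = 3, and "at least two" is distinct_matches >= 2.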