sql - Create pairs within groups and count their frequency in SQL

Question

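For each ID, I want to create the DISTINCT pairs of values from the second column (C2) and then rank the pairs by how many IDs they occur in.
Take this table as an example: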

CREATE TABLE mytable
    (`ID` int, `C2` varchar(1), `C3` varchar(2))
;
    
INSERT INTO mytable
    (`ID`, `C2`, `C3`)
VALUES
    (1, 'A', 'a1'),
    (1, 'B', 'b1'),
    (2, 'A', 'a2'),
    (3, 'A', 'a3'),
    (3, 'C', 'c3'),
    (3, 'A', 'a4'),
    (4, 'A', 'a1'),
    (4, 'B', 'b4'),
    (4, 'A', 'a2'),
    (4, 'D', 'd1');

For ID 1, the pair is AB.
For ID 2, there is no pair.
For ID 3, the pair is AC.
For ID 4, the pairs are AB, AD and BD.

So the output would be:

| Pair | Cnt |
| A-B  | 2   |
| A-C  | 1   |
| A-D  | 1   | 
| B-D  | 1   |

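Is this something we can do in SQL, perhaps with something like GROUP_CONCAT?
I have been thinking about this for several days and still cannot come up with a simple solution.

Thanks!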

score 1 · Accepted Answer

I think this is a self-join with a count distinct. One method is:

select t1.c2, t2.c2, count(distinct t1.id) as cnt
from t t1 join 
     t t2
     on t1.id = t2.id and t1.c2 < t2.c2
group by t1.c2, t2.c2
order by cnt desc;

Depending on your data, it might be more efficient to remove the duplicates first and then join:

with tt as (
      select distinct t.id, t.c2
      from t
     )
select t1.c2, t2.c2, count(t1.id) as cnt
from tt t1 join 
     tt t2
     on t1.id = t2.id and t1.c2 < t2.c2
group by t1.c2, t2.c2
order by cnt desc;
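
To get the single `Pair` column in the `A-B` form shown in the question, the two grouped columns can be concatenated. The following is a minimal sketch, assuming MySQL (suggested by the backtick-quoted identifiers in the question) and the question's table name `mytable` rather than the placeholder `t` used above; the secondary sort on `Pair` is an addition to make the order of ties predictable.

```sql
-- Sketch only: assumes MySQL and the table mytable from the question.
SELECT CONCAT(t1.C2, '-', t2.C2) AS Pair,   -- e.g. 'A-B'
       COUNT(DISTINCT t1.ID)     AS Cnt     -- each ID is counted once per pair
FROM mytable t1
JOIN mytable t2
  ON t1.ID = t2.ID
 AND t1.C2 < t2.C2                          -- keeps A-B, drops B-A and A-A
GROUP BY t1.C2, t2.C2
ORDER BY Cnt DESC, Pair;
```

On the sample data this should give A-B with a count of 2, followed by A-C, A-D and B-D with a count of 1 each.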

score 0 · Accepted Answer

You can self-join and aggregate:

select t1.c2 c21, t2.c2 c22, count(distinct t1.id) cnt 
from mytable t1
inner join mytable t2
    on  t1.id = t2.id
    and t1.c2  < t2.c2
group by t1.c2, t2.c2
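
Since the question mentions GROUP_CONCAT: it is not needed to build the pairs, but it can be handy for checking which IDs contribute to each count. A small sketch, assuming MySQL and the same `mytable`; the `ids` column is purely illustrative and not part of the requested output.

```sql
-- Sketch only: assumes MySQL; the ids column is added for inspection.
SELECT t1.C2 AS c21,
       t2.C2 AS c22,
       COUNT(DISTINCT t1.ID) AS cnt,
       GROUP_CONCAT(DISTINCT t1.ID ORDER BY t1.ID) AS ids  -- e.g. '1,4' for A and B
FROM mytable t1
INNER JOIN mytable t2
    ON  t1.ID = t2.ID
    AND t1.C2 < t2.C2
GROUP BY t1.C2, t2.C2
ORDER BY cnt DESC, c21, c22;
```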
