Here is a simplified view of the table. Sorry, I couldn't attach an image of the table, so I hope this is OK.

c1   c2
1    a
1    b
2    a
2    b
2    c
2    d
3    e
3    a
4    z
5    d

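The expected result is that, because of the relationships through column c2, group 1 would contain c1 values 1, 2, 3 and 5 (they have overlapping c2 values, which effectively means a=b=c=d=e), and group 2 would contain 4.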

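I have millions of rows of this kind of data, and currently a cursor job runs x times to build these groups. I can picture how this should work, but I haven't been able to write a query that extracts this relationship. Any suggestions? Thanks.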
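
For anyone who wants to reproduce the sample data, a minimal setup sketch follows. The table name `Test` is borrowed from the answer below. The column types (`INT`, `VARCHAR(10)`) are assumptions, since the question does not state them.

```sql
-- Hypothetical setup: the table name Test matches the answer's query;
-- the column types are assumed, not taken from the question.
CREATE TABLE Test (
    c1 INT         NOT NULL,
    c2 VARCHAR(10) NOT NULL
);

-- The ten sample rows from the simplified table above.
INSERT INTO Test (c1, c2)
VALUES
    (1, 'a'),
    (1, 'b'),
    (2, 'a'),
    (2, 'b'),
    (2, 'c'),
    (2, 'd'),
    (3, 'e'),
    (3, 'a'),
    (4, 'z'),
    (5, 'd');
```

With this data, the desired output puts c1 values 1, 2, 3 and 5 in one group and 4 in another.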

1 Answer

Tested on SQL Server 2012:

WITH t AS (
    SELECT
        t.c1,
        t.c2,
        tm.c1_min
    FROM
        Test t
    JOIN
        (
            SELECT
                c2,
                MIN(c1) AS c1_min
            FROM
                Test
            GROUP BY
                c2
        ) AS tm
    ON
        t.c2 = tm.c2
),
rt AS (
    SELECT
        c1_min,
        c1,
        1 AS cnt
    FROM
        t
UNION ALL
    SELECT
        rt.c1_min,
        t.c1,
        rt.cnt + 1 AS cnt
    FROM
        rt
    JOIN
        t
    ON
        rt.c1 = t.c1_min
    AND
        rt.c1 < t.c1
)
SELECT
    SUM(t.rst) OVER (ORDER BY t.ord ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS group_number,
    t.c1
FROM
    (
        SELECT
            t.c1,
            t.rst,
            t.ord
        FROM
            (
                SELECT
                    rt.c1,
                    CASE
                        WHEN rt.c1_min = MIN(rt.c1_min) OVER (ORDER BY rt.c1_min, rt.c1 ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) THEN 0
                        ELSE 1
                    END AS rst,
                    ROW_NUMBER() OVER (ORDER BY rt.c1_min, rt.c1) AS ord,
                    ROW_NUMBER() OVER (PARTITION BY rt.c1 ORDER BY rt.c1_min, rt.cnt) AS qfy
                FROM
                    rt
            ) AS t
        WHERE
            t.qfy = 1
    ) AS t;
answered 2013-11-09T16:23:25.360