
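I know the "combine multiple rows into a list" question has been answered a million times; here is a reference to a great article: Concatenating row values in transact sql

I need to combine multiple rows into lists for multiple columns at the same time.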

 ID | Col1 | Col2       ID | Col1 | Col2 
------------------  =>  ------------------
  1    A     X           1    A     X    
  2    B     Y           2    B,C   Y,Z
  2    C     Z

I tried the XML method, but it turned out to be very slow on large tables.

SELECT DISTINCT
    [ID],
    [Col1] = STUFF((SELECT ',' + t2.[Col1]
                    FROM #Table t2
                    WHERE t2.ID = t.ID
                    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),1,1,''),
    [Col2] = STUFF((SELECT ',' + t2.[Col2]
                    FROM #Table t2
                    WHERE t2.ID = t.ID
                    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),1,1,'')
FROM #Table t

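My current solution is a stored procedure that builds each ID row separately. I'm wondering whether there is another approach I could use (other than a loop).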

For each column, rank the rows to combine (partition by the key column)

End up with a table like
ID  | Col1 | Col2 | Col1Rank | Col2Rank
1      A      X        1          1
2      B      Y        1          1
2      C      Z        2          2

Create a new table containing top rank columns for each ID
ID  | Col1Comb | Col2Comb
1       A           X
2       B           Y
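
The two steps above could be sketched roughly as follows. This is only an illustration under assumptions: #Table stands in for the source table with the question's sample rows, the VARCHAR(30) column sizes are made up, and NULL values in Col1 or Col2 are not handled.

```sql
-- Hypothetical sample data matching the question's example.
CREATE TABLE #Table (ID INT, Col1 VARCHAR(30), Col2 VARCHAR(30));
INSERT INTO #Table (ID, Col1, Col2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z');

-- Step 1: rank each column's values within its ID.
SELECT ID, Col1, Col2,
       Col1Rank = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Col1),
       Col2Rank = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Col2)
INTO #oldtable
FROM #Table;

-- Step 2: seed the result table with the rank-1 value of each column.
-- VARCHAR(MAX) leaves room for the values appended later.
SELECT o1.ID,
       Col1Comb = CAST(o1.Col1 AS VARCHAR(MAX)),
       Col2Comb = CAST(o2.Col2 AS VARCHAR(MAX))
INTO #newtable
FROM #oldtable o1
JOIN #oldtable o2 ON o2.ID = o1.ID AND o2.Col2Rank = 1
WHERE o1.Col1Rank = 1;
```

Because each column is ranked independently, the seed joins #oldtable to itself so that the rank-1 value of Col1 and the rank-1 value of Col2 can come from different source rows.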

Loop through each remaining rank in increasing order (in this case 1 iteration)
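-- Rank 1 is already in #newtable, so start appending from rank 2.
DECLARE @irank INT = 2;
DECLARE @maxrank INT = (SELECT MAX(Col1Rank) FROM #oldtable);
WHILE @irank <= @maxrank
BEGIN
    UPDATE n SET
       n.Col1Comb = n.Col1Comb + ISNULL(',' + o1.Col1, ''),  -- append the current rank's items
       n.Col2Comb = n.Col2Comb + ISNULL(',' + o2.Col2, '')   -- if they are not null
    FROM #newtable n
    JOIN #oldtable o1 ON o1.ID = n.ID AND o1.Col1Rank = @irank
    JOIN #oldtable o2 ON o2.ID = n.ID AND o2.Col2Rank = @irank;
    SET @irank += 1;
END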

3 Answers


You can use a CTE trick in which you update the CTE itself.

Method 1: create a new parallel table, copy the data into it, then concatenate:

CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), RowID INT IDENTITY(1,1));
CREATE TABLE #Table1Concat(ID INT, Col3 VARCHAR(MAX), Col4 VARCHAR(MAX), RowID INT);
GO

INSERT #Table1 VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
GO
INSERT #Table1Concat
SELECT * FROM #Table1;
GO
DECLARE @Cat1 VARCHAR(MAX) = '';
DECLARE @Cat2 VARCHAR(MAX) = '';
; WITH CTE AS (
    SELECT TOP 2147483647 t1.*, t2.Col3, t2.Col4, r = ROW_NUMBER()OVER(PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
    FROM #Table1 t1
    JOIN #Table1Concat t2 ON t1.RowID = t2.RowID
    ORDER BY t1.ID, t1.Col1, t1.Col2
)
UPDATE CTE
SET @Cat1 = Col3 = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
, @Cat2 = Col4 = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
GO

SELECT ID, Col3 = MAX(Col3) 
, Col4 = MAX(Col4)
FROM #Table1Concat
GROUP BY ID

Method 2: add the concatenation columns directly to the original table and concatenate into the new columns:

CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), Col1Cat VARCHAR(MAX), Col2Cat VARCHAR(MAX));
GO

INSERT #Table1(ID,Col1,Col2) VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
GO

DECLARE @Cat1 VARCHAR(MAX) = '';
DECLARE @Cat2 VARCHAR(MAX) = '';
; WITH CTE AS (
    SELECT TOP 2147483647 t1.*, r = ROW_NUMBER()OVER(PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
    FROM #Table1 t1
    ORDER BY t1.ID, t1.Col1, t1.Col2
)
UPDATE CTE
SET @Cat1 = Col1Cat = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
, @Cat2 = Col2Cat = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
GO

SELECT ID, Col1Cat = MAX(Col1Cat) 
, Col2Cat = MAX(Col2Cat)
FROM #Table1
GROUP BY ID;
GO
answered 2013-05-16T05:38:47.387

Try this:

Query 1:

DECLARE @temp TABLE
(
      ID INT
    , Col1 VARCHAR(30)
    , Col2 VARCHAR(30)
)

INSERT INTO @temp (ID, Col1, Col2)
VALUES 
    (1, 'A', 'X'),
    (2, 'B', 'Y'),
    (2, 'C', 'Z')

SELECT
      r.ID
    , Col1 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t1/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
    , Col2 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t2/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
FROM (
    SELECT DISTINCT ID
    FROM @temp
) r
OUTER APPLY (
    SELECT x = CAST((
        SELECT 
                [t1/a] = t2.Col1
              , [t2/a] = t2.Col2
        FROM @temp t2
        WHERE r.ID = t2.ID
        FOR XML PATH('')
    ) AS XML)
) d

Query 2:

SELECT
      r.ID
    , Col1 = STUFF(REPLACE(CAST(d.x.query('for $a in /a return xs:string($a)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '') 
    , Col2 = STUFF(REPLACE(CAST(d.x.query('for $b in /b return xs:string($b)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '') 
FROM (
    SELECT DISTINCT ID
    FROM @temp
) r
OUTER APPLY (
    SELECT x = CAST((
        SELECT 
                [a] = ',' + t2.Col1
              , [b] = ',' + t2.Col2
        FROM @temp t2
        WHERE r.ID = t2.ID
        FOR XML PATH('')
    ) AS XML)
) d

Output:

ID          Col1       Col2
----------- ---------- ----------
1           A          X
2           B,C        Y,Z
answered 2013-05-16T06:57:14.927

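One solution that is at least syntactically straightforward is to use a user-defined aggregate to "join" the values together. This does require SQLCLR, and while some people are reluctant to enable it, it provides a set-based approach that does not need to re-query the base table for each column. Joining is the opposite of splitting: it creates a comma-separated list out of what were individual rows.

Below is a simple example using the SQL# (SQLsharp) library, which comes with a user-defined aggregate named Agg_Join() that does exactly what is being asked for here. You can download the free version of SQL# from http://www.SQLsharp.com/, and the example SELECT runs against a standard system view. (In the interest of full disclosure, I am the author of SQL#, but this function is available for free.)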

SELECT sc.[object_id],
       OBJECT_NAME(sc.[object_id]) AS [ObjectName],
       SQL#.Agg_Join(sc.name) AS [ColumnNames],
       SQL#.Agg_Join(DISTINCT sc.system_type_id) AS [DataTypes]
FROM sys.columns sc
GROUP BY sc.[object_id]

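I recommend testing this against your current solution to see which is fastest for the volume of data you expect to have over at least the next year or two.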
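
For comparison, here is a minimal sketch of the same aggregate-per-column shape using the built-in STRING_AGG function. It assumes SQL Server 2017 or later, which was not available when this was written. The @temp table, its column sizes, and the sample rows are illustrative only and mirror the question's example.

```sql
-- Assumption: SQL Server 2017+ (STRING_AGG is built in from that version).
DECLARE @temp TABLE (ID INT, Col1 VARCHAR(30), Col2 VARCHAR(30));

INSERT INTO @temp (ID, Col1, Col2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z');

-- One pass over the table; each column gets its own aggregate.
-- CAST to VARCHAR(MAX) so long lists are not limited to 8000 bytes.
-- Both aggregates use the same ordering so paired values stay aligned.
SELECT ID,
       Col1 = STRING_AGG(CAST(Col1 AS VARCHAR(MAX)), ',')
                  WITHIN GROUP (ORDER BY Col1, Col2),
       Col2 = STRING_AGG(CAST(Col2 AS VARCHAR(MAX)), ',')
                  WITHIN GROUP (ORDER BY Col1, Col2)
FROM @temp
GROUP BY ID
ORDER BY ID;
```

Like Agg_Join(), this avoids re-querying the base table for each column. STRING_AGG skips NULL values, so no separate NULL handling is needed. It would still be worth benchmarking against the other approaches on realistic data volumes.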

于 2013-05-16T05:33:36.420 回答