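You have a few options for doing this kind of PIVOT.
Here is one that uses the U-SQL MAP data type (called SQL.MAP). It returns null rather than 0 for missing values (use a null-coalescing expression to turn them into 0). It works under the following conditions: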
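- The generated MAP stays within the 4MB row size limit. If it does not, see the next solution.
- You know ahead of time which columns you have (if not, just keep the data in the MAP column and extract values as needed).
The MAP solution: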
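// Sample input: one row per (ID, key, value).
@t =
    SELECT *
    FROM (
        VALUES
        ( 1, "A", 30 ),
        ( 1, "B", 70 ),
        ( 1, "ZA", 12 ),
        ( 2, "C", 22 ),
        ( 2, "A", 13 ),
        ( 2, "ABC", 42 )
    ) AS T(ColA, ColB, ColC);

// Collect each ID's key/value pairs into a single SQL.MAP<string, int?>.
@m =
    SELECT ColA AS [ID],
           MAP_AGG(ColB, (int?) ColC) AS m
    FROM @t
    GROUP BY ColA;

// Project the known keys into columns; a missing key yields null.
@r =
    SELECT [ID],
           m["A"] AS A,
           m["B"] AS B,
           m["C"] AS C,
           m["ZA"] AS [ZA],
           m["ABC"] AS [ABC]
    FROM @m;

OUTPUT @r
TO "/output/pivot1.csv"
USING Outputters.Csv();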
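If you would rather get 0 than null for missing keys, the null-coalescing expression mentioned earlier can go into the final projection. The following is a minimal sketch, not a tested script: it repeats the sample data so that it stands alone, and the rowset name @r0 and the output path /output/pivot1_zero.csv are made up for illustration.

```usql
// Sample input, same as in the MAP solution.
@t =
    SELECT *
    FROM (
        VALUES
        ( 1, "A", 30 ),
        ( 1, "B", 70 ),
        ( 1, "ZA", 12 ),
        ( 2, "C", 22 ),
        ( 2, "A", 13 ),
        ( 2, "ABC", 42 )
    ) AS T(ColA, ColB, ColC);

@m =
    SELECT ColA AS [ID],
           MAP_AGG(ColB, (int?) ColC) AS m
    FROM @t
    GROUP BY ColA;

// @r0 is a hypothetical name. The C# ?? operator replaces the null
// returned for a missing key with 0.
@r0 =
    SELECT [ID],
           (m["A"] ?? 0) AS A,
           (m["B"] ?? 0) AS B,
           (m["C"] ?? 0) AS C,
           (m["ZA"] ?? 0) AS [ZA],
           (m["ABC"] ?? 0) AS [ABC]
    FROM @m;

// Hypothetical output path.
OUTPUT @r0
TO "/output/pivot1_zero.csv"
USING Outputters.Csv();
```

Note that this makes a key that is absent indistinguishable from a key whose value is 0, which may or may not matter for your data.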
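Here is a solution that uses the standard SQL pivot workaround pattern (some SQL database implementations used to translate PIVOT expressions into expressions like this internally, and may still do so). Again, you have to know all the columns in advance. If that is not the case, just use the MAP data type.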
// Sample input: one row per (ID, key, value).
@t =
    SELECT *
    FROM (
        VALUES
        ( 1, "A", 30 ),
        ( 1, "B", 70 ),
        ( 1, "ZA", 12 ),
        ( 2, "C", 22 ),
        ( 2, "A", 13 ),
        ( 2, "ABC", 42 )
    ) AS T(ColA, ColB, ColC);

// Step 1: put each value into the column that matches its key; every other column gets 0.
@r =
    SELECT ColA AS [ID],
           (ColB == "A") ? ColC : 0 AS A,
           (ColB == "B") ? ColC : 0 AS B,
           (ColB == "C") ? ColC : 0 AS C,
           (ColB == "ZA") ? ColC : 0 AS [ZA],
           (ColB == "ABC") ? ColC : 0 AS [ABC]
    FROM @t;

// Step 2: take the last value of each column within each ID (ordered by that column),
// then let DISTINCT remove the duplicate rows.
@r =
    SELECT DISTINCT [ID],
           LAST_VALUE(A) OVER(PARTITION BY [ID] ORDER BY A) AS A,
           LAST_VALUE(B) OVER(PARTITION BY [ID] ORDER BY B) AS B,
           LAST_VALUE(C) OVER(PARTITION BY [ID] ORDER BY C) AS C,
           LAST_VALUE([ZA]) OVER(PARTITION BY [ID] ORDER BY [ZA]) AS [ZA],
           LAST_VALUE([ABC]) OVER(PARTITION BY [ID] ORDER BY [ABC]) AS [ABC]
    FROM @r;

OUTPUT @r
TO "/output/pivot2.csv"
USING Outputters.Csv();
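The same workaround pattern is often written as a single conditional aggregation instead of the two-step window version above. The following is a sketch of that variant, not the script from this answer and not tested: it assumes at most one row per (ColA, ColB) pair, as in the sample data, so that SUM simply returns that one value (with several rows per pair it would add them up). The rowset name @r2 and the output path /output/pivot2_sum.csv are made up for illustration.

```usql
// Sample input, same as above.
@t =
    SELECT *
    FROM (
        VALUES
        ( 1, "A", 30 ),
        ( 1, "B", 70 ),
        ( 1, "ZA", 12 ),
        ( 2, "C", 22 ),
        ( 2, "A", 13 ),
        ( 2, "ABC", 42 )
    ) AS T(ColA, ColB, ColC);

// @r2 is a hypothetical name. Rows whose key does not match contribute 0,
// so each SUM yields the value for that key, or 0 if the key is missing.
@r2 =
    SELECT ColA AS [ID],
           SUM((ColB == "A") ? ColC : 0) AS A,
           SUM((ColB == "B") ? ColC : 0) AS B,
           SUM((ColB == "C") ? ColC : 0) AS C,
           SUM((ColB == "ZA") ? ColC : 0) AS [ZA],
           SUM((ColB == "ABC") ? ColC : 0) AS [ABC]
    FROM @t
    GROUP BY ColA;

// Hypothetical output path.
OUTPUT @r2
TO "/output/pivot2_sum.csv"
USING Outputters.Csv();
```

As with the window version, the column list is fixed in the script, so this only helps when the keys are known in advance.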