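First, some comments:
FETCH currently accepts only literals as arguments (https://msdn.microsoft.com/en-us/library/azure/mt621321.aspx).
@var = SELECT ... assigns the name @var to the rowset expression that starts with the SELECT. U-SQL (currently) does not provide stateful scalar variable assignment from query results. Instead, you would use a CROSS JOIN or another JOIN to join the scalar value in.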
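Now to the solution:
To get the percentage, take a look at the ROW_NUMBER() and PERCENT_RANK() functions. The following example shows how to use either one to answer your question. Given that the PERCENT_RANK() code is simpler (it needs neither MAX() nor a CROSS JOIN), I would suggest that solution.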
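DECLARE @percentage double = 0.25; // 25%

// Sample data: positions 1 through 20.
@data = SELECT *
        FROM (VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20)
             ) AS T(pos);

// Add both the percent rank and the row number for each position.
@data =
    SELECT PERCENT_RANK() OVER(ORDER BY pos) AS p_rank,
           ROW_NUMBER() OVER(ORDER BY pos) AS r_no,
           pos
    FROM @data;

// Option 1 (ROW_NUMBER): compute the cut-off row number as a one-row rowset ...
@cut_off =
    SELECT ((double) MAX(r_no)) * (1.0 - @percentage) AS max_r
    FROM @data;

// ... and join it in, because a query result cannot be assigned to a scalar variable.
@r1 =
    SELECT *
    FROM @data CROSS JOIN @cut_off
    WHERE ((double) r_no) > max_r;

// Option 2 (PERCENT_RANK): filter directly on the percent rank.
@r2 =
    SELECT *
    FROM @data
    WHERE p_rank >= 1.0 - @percentage;

OUTPUT @r1
TO "/output/top_perc1.csv"
ORDER BY p_rank DESC
USING Outputters.Csv();

OUTPUT @r2
TO "/output/top_perc2.csv"
ORDER BY p_rank DESC
USING Outputters.Csv();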
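
To sanity-check which rows each filter keeps for the 20-row sample, the arithmetic can be mirrored outside U-SQL. The following is only a rough Python sketch, not U-SQL. It assumes PERCENT_RANK() follows the usual (rank - 1) / (n - 1) definition and that all values are distinct, so that rank and row number coincide. The function names top_by_row_number and top_by_percent_rank are made up for this illustration.

```python
def top_by_row_number(values, percentage):
    """Mirror of the @cut_off / @r1 approach: keep rows with r_no > max_r."""
    ordered = sorted(values)
    # max_r = MAX(r_no) * (1.0 - @percentage); MAX(r_no) equals the row count.
    max_r = len(ordered) * (1.0 - percentage)
    return [v for r_no, v in enumerate(ordered, start=1) if r_no > max_r]


def top_by_percent_rank(values, percentage):
    """Mirror of the @r2 approach: keep rows with p_rank >= 1.0 - percentage.

    Assumption: p_rank = (rank - 1) / (n - 1), with rank == row number
    because the values are assumed to be distinct.
    """
    ordered = sorted(values)
    n = len(ordered)
    kept = []
    for r_no, v in enumerate(ordered, start=1):
        p_rank = (r_no - 1) / (n - 1) if n > 1 else 0.0
        if p_rank >= 1.0 - percentage:
            kept.append(v)
    return kept


data = list(range(1, 21))  # same sample as the VALUES list: 1..20
r1 = top_by_row_number(data, 0.25)
r2 = top_by_percent_rank(data, 0.25)
print(r1)
print(r2)
```

For this sample both filters keep positions 16 through 20. With ties in the ordering column, or with other row counts and percentages, the two approaches are not guaranteed to agree, so it is worth checking against your own data.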