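I'm using SQL 2008 and trying to process the data in a table in batches, but there is a catch. The data is divided into groups, and when I process it I have to make sure a group is always contained within a single batch; in other words, a group must never be split across different batches. Assume the batch size is always much larger than the group size. Here is a setup that illustrates what I mean (the code uses Jeff Moden's data generation logic: http://www.sqlservercentral.com/articles/Data+Generation/87901)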

DECLARE @NumberOfRows INT = 1000,
    @StartValue   INT = 1,
    @EndValue     INT = 500,
    @Range        INT

SET @Range = @EndValue - @StartValue + 1

IF OBJECT_ID('tempdb..#SomeTestTable','U') IS NOT NULL
DROP TABLE #SomeTestTable;

SELECT TOP (@NumberOfRows)
GroupID = ABS(CHECKSUM(NEWID())) % @Range + @StartValue
INTO #SomeTestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2

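This creates a table with about 435 groups of records, each containing between 1 and 7 records. Now suppose I want to process these records in batches of 100. How can I make sure a GroupID is never split across batches? I'm fine if a batch isn't exactly 100 records; it can be a little more or a little less.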
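A quick way to check the shape of the generated data is an aggregate over the temp table. This is only a sketch: the aliases `GroupCount`, `MinRows`, `MaxRows`, and `RowsInGroup` are made up for illustration, and the numbers will differ on every run because the GroupIDs are random.

```sql
-- Count the groups and report the smallest and largest group size.
SELECT COUNT(*)         AS GroupCount,
       MIN(RowsInGroup) AS MinRows,
       MAX(RowsInGroup) AS MaxRows
FROM (
    SELECT GroupID, COUNT(*) AS RowsInGroup
    FROM #SomeTestTable
    GROUP BY GroupID
) g;
```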

I'd appreciate any suggestions!

1 Answer

This produces a batch of slightly fewer than 100 entries; it drops any group that is not entirely within the selection:

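-- Take the first 100 rows by GroupID. Within each group the highest row number
-- comes first, so a group cut off by TOP loses its r = 1 row.
WITH cte AS (SELECT TOP 100 * FROM (
  SELECT GroupID, ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY GroupID) r
  FROM #SomeTestTable) a
  ORDER BY GroupID, r DESC)
-- Keep only the groups whose r = 1 row survived the cut, i.e. the complete groups.
SELECT c1.GroupID FROM cte c1
  JOIN cte c2
    ON c1.GroupID = c2.GroupID
   AND c2.r = 1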

It selects the groups with the lowest GroupIDs, limited to 100 entries, into a common table expression along with each row's number within its group. It then uses the row number to discard any group that isn't entirely in the selection: because rows are ordered by descending row number within each group before being cut with TOP, a group is complete only if its row number 1 made it into the selection.
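The query above returns only the first batch. One possible way to walk through every batch is to remember the highest GroupID already handled and rerun the same query on the remaining rows. The following is a rough sketch rather than a tested solution: `@BatchSize`, `@LastGroupID`, and the table variable `@Batch` are names invented here, and it assumes, as the question does, that no group is larger than the batch size (otherwise a batch would come back empty and the loop would stop early).

```sql
DECLARE @BatchSize   INT = 100,
        @LastGroupID INT = 0;   -- assumes GroupIDs start above 0, as in the setup
DECLARE @Batch TABLE (GroupID INT NOT NULL);

WHILE 1 = 1
BEGIN
    DELETE FROM @Batch;

    -- Same technique as above, restricted to groups not yet processed.
    WITH cte AS (
        SELECT TOP (@BatchSize) GroupID, r
        FROM (
            SELECT GroupID,
                   ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY GroupID) AS r
            FROM #SomeTestTable
            WHERE GroupID > @LastGroupID
        ) a
        ORDER BY GroupID, r DESC
    )
    INSERT INTO @Batch (GroupID)
    SELECT c1.GroupID
    FROM cte c1
    JOIN cte c2
      ON c1.GroupID = c2.GroupID
     AND c2.r = 1;

    -- Nothing left to process (or a group exceeded the batch size).
    IF @@ROWCOUNT = 0 BREAK;

    -- Placeholder: process the rows in @Batch here.

    -- Advance past the last complete group in this batch.
    SELECT @LastGroupID = MAX(GroupID) FROM @Batch;
END
```

Each pass yields at most `@BatchSize` rows and never a partial group, so batches come out slightly under 100 rows, consistent with the behaviour described above.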

answered 2013-02-14T06:54:28.827