
We are using full recovery model in SQL Server. We have a job which merges from a staging table to the final table. The staging table is holding millions of rows. The final table is also huge with millions of rows. We are merging in batches of 10,000 rows.

The pseudo code is given for a single batch below:

BEGIN TRANSACTION

DELETE TOP (10000)
FROM <Staging Table>
OUTPUT deleted.* INTO @TableVariable;

MERGE INTO <Final Table> AS t
USING @TableVariable AS s
ON t.<key> = s.<key>
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...;

COMMIT TRANSACTION

The problem is that each batch runs slower than the one before. When we restart the server, the batches speed up again. The commits also seem to take a very long time to be written to disk. We suspect a problem with the transaction log. When we reduce the batch size, more transactions occur and the batches slow down even more.

Is there a way to improve the performance of this kind of batched delete-and-merge operation? Do you recommend forcing a CHECKPOINT in the full recovery model?


2 Answers


A MERGE can often be improved by avoiding redundant updates. If there is nothing to update because the target row and the source row are equal, do not update that row. This works very well when most rows are unchanged, because SQL Server then writes far less information to the transaction log.

To avoid redundant updates in the merge, write the MERGE statement like this:

MERGE INTO target AS t
USING source AS s
ON t.id = s.id
WHEN MATCHED 
  AND ((t.col1 <> s.col1 
       OR t.col1 IS NULL AND s.col1 IS NOT NULL
       OR t.col1 IS NOT NULL AND s.col1 IS NULL)
  OR (t.col2 <> s.col2 
       OR t.col2 IS NULL AND s.col2 IS NOT NULL
       OR t.col2 IS NOT NULL AND s.col2 IS NULL)
  OR (t.col3 <> s.col3 
       OR t.col3 IS NULL AND s.col3 IS NOT NULL
       OR t.col3 IS NOT NULL AND s.col3 IS NULL))
THEN UPDATE SET
  col1 = s.col1, col2 = s.col2, col3 = s.col3
WHEN NOT MATCHED BY TARGET THEN 
    INSERT (id, col1, col2, col3)
    VALUES (s.id, s.col1, s.col2, s.col3);
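A more compact way to express the same null-safe "has anything changed" test is the INTERSECT idiom: the intersection of the two single-row SELECTs is empty only when at least one column differs, and INTERSECT treats NULL as equal to NULL, so no explicit IS NULL checks are needed. A sketch using the same illustrative target/source/id/col1..col3 names as above:

```sql
MERGE INTO target AS t
USING source AS s
ON t.id = s.id
WHEN MATCHED
  -- update only when at least one column value actually differs;
  -- INTERSECT compares NULLs as equal, replacing the verbose OR/IS NULL checks
  AND NOT EXISTS (SELECT t.col1, t.col2, t.col3
                  INTERSECT
                  SELECT s.col1, s.col2, s.col3)
THEN UPDATE SET
  col1 = s.col1, col2 = s.col2, col3 = s.col3
WHEN NOT MATCHED BY TARGET THEN 
    INSERT (id, col1, col2, col3)
    VALUES (s.id, s.col1, s.col2, s.col3);
```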
answered 2019-11-26T18:23:54.357

Instead of forcing a CHECKPOINT, what we did was introduce an artificial delay inside the WHILE loop, so that the transactions would not be throttled.

This let us overcome the out-of-memory problems caused by transaction throttling in our SQL Server environment. We have millions of rows in the staging table. The 10,000-row batches together with the introduced delay ensure that we do not overload the server while users are accessing it.


DECLARE @RowCount INT;

SET @RowCount = (SELECT COUNT(*) FROM StagingTable);

WHILE (@RowCount > 0)
BEGIN

    BEGIN TRANSACTION

    DELETE TOP (10000)
    FROM <Staging Table>
    OUTPUT deleted.* INTO @TableVariable;

    MERGE INTO <Final Table> AS t
    USING @TableVariable AS s
    ON t.<key> = s.<key>
    WHEN MATCHED THEN UPDATE SET ...
    WHEN NOT MATCHED THEN INSERT ...;

    COMMIT TRANSACTION

    WAITFOR DELAY '00:00:10'; -- artificial 10-second delay between batches

    SET @RowCount = (SELECT COUNT(*) FROM StagingTable);

END 
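One small refinement worth considering: the `SELECT COUNT(*)` scans the entire staging table on every iteration, which is itself expensive at millions of rows. Capturing `@@ROWCOUNT` from the DELETE instead lets the loop stop as soon as a batch comes back empty. A sketch under the same placeholder table names (not a tested implementation):

```sql
DECLARE @RowCount INT = 1;  -- assume at least one batch to start

WHILE (@RowCount > 0)
BEGIN
    BEGIN TRANSACTION;

    DELETE TOP (10000)
    FROM <Staging Table>
    OUTPUT deleted.* INTO @TableVariable;

    -- rows removed by the DELETE above; 0 means the staging table is empty
    SET @RowCount = @@ROWCOUNT;

    MERGE INTO <Final Table> AS t
    USING @TableVariable AS s
    ON t.<key> = s.<key>
    WHEN MATCHED THEN UPDATE SET ...
    WHEN NOT MATCHED THEN INSERT ...;

    COMMIT TRANSACTION;

    WAITFOR DELAY '00:00:10';
END
```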

answered 2019-11-26T06:19:29.737