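I am getting an arithmetic overflow error when inserting values into a lookup table whose row ID is a TINYINT. This is not a case where the number of unique records exceeds 255. It is somewhat unusual, and it did not happen during the first test of this setup.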
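The production version of the code below actually has only 66 unique values, although new values may be added over time (slowly, and in very small numbers). The 255 available slots should be more than enough for the lifetime of this analysis process.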
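My initial thought was that a cached plan might be recognizing that the hierarchical source table has more than 255 values (it actually has 1028) and deciding that this could exceed the capacity of the target table. However, I have tested this and it is not the case.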
-- This table represents a small (tinyint) subset of unique primary values.
CREATE TABLE #tmp_ID10T_Test (
ID10T_Test_ID tinyint identity (1,1) not null,
ID10T_String varchar(255) not null,
PRIMARY KEY CLUSTERED
(ID10T_String ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
) ON [PRIMARY]
-- This table represents a larger (smallint) set of non-unique source values, defined by a secondary key value (Value_Set).
CREATE TABLE #tmp_ID10T_Values (
ID10T_Value_ID smallint identity (1,1) not null,
ID10T_Value_Set tinyint not null,
ID10T_String varchar(255) not null
) ON [PRIMARY]
-- Create the initial dataset - 100 unique records. The insertion tests below illustrate that the INDEX is working
-- correctly on the primary key field for repeated values; however, something is happening with the IDENTITY field...
DECLARE @ID10T tinyint
, @i tinyint -- A randomized value to determine which subset of printable ASCII characters will be used for the string.
, @String varchar(255)
SET @ID10T = 0
WHILE @ID10T < 100
BEGIN
SET @String = ''
WHILE LEN(@String) < (1+ROUND((254 * RAND(CHECKSUM(NEWID()))),0))
BEGIN
SELECT @i = (1 + ROUND((2 * RAND()),0)) -- Randomize which printable character subset is drawn from.
SELECT @String = @String + ISNULL(CASE WHEN @i = 1 THEN char(48 + ROUND(((57-48)* RAND(CHECKSUM(NEWID()))),0))
WHEN @i = 2 THEN char(65 + ROUND(((90-65) * RAND(CHECKSUM(NEWID()))),0))
WHEN @i = 3 THEN char(97 + ROUND(((122-97) * RAND(CHECKSUM(NEWID()))),0))
END,'-')
END
INSERT INTO #tmp_ID10T_Values (ID10T_Value_Set, ID10T_String)
SELECT 1, @String
SET @ID10T = @ID10T + 1
END
-- Inspect the generated source values, then demonstrate that IGNORE_DUP_KEY = ON works for the primary key index on the string field
SELECT * FROM #tmp_ID10T_Values
-- Method 1 - Simple INSERT INTO: Expect Approx. (100 row(s) affected)
INSERT INTO #tmp_ID10T_Test (ID10T_String)
SELECT DISTINCT ID10T_String
FROM #tmp_ID10T_Values
GO
-- Method 2 - LEFT OUTER JOIN WHERE NULL to prevent dupes.
-- this is the test case to determine whether the procedure cache is mixing plans
INSERT INTO #tmp_ID10T_Test (ID10T_String)
SELECT DISTINCT T1.ID10T_String
FROM #tmp_ID10T_Values AS T1
LEFT OUTER JOIN #tmp_ID10T_Test AS T2
ON T1.ID10T_String = T2.ID10T_String
WHERE T2.ID10T_Test_ID IS NULL
GO
-- Repeat Method 1: Duplicate key was ignored (0 row(s) affected).
INSERT INTO #tmp_ID10T_Test (ID10T_String)
SELECT DISTINCT ID10T_String
FROM #tmp_ID10T_Values
GO
This does not appear to be a query plan cache issue: if it were, I should have seen the arithmetic error on the Method 1 retest.
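One way to look at this more closely, as a sketch rather than part of my original test: report the table's current identity value between batches and compare it with the rows actually stored. `DBCC CHECKIDENT` with `NORESEED` only reports and changes nothing. I am assuming it is run in the same session that created the temp table; the column aliases `Row_Count` and `Max_Stored_ID` are just illustrative names.

```sql
-- Report the current identity value and the current maximum value in the
-- identity column. NORESEED means nothing is changed.
DBCC CHECKIDENT ('#tmp_ID10T_Test', NORESEED);

-- Compare with what is actually stored in the lookup table.
SELECT COUNT(*)           AS Row_Count,
       MAX(ID10T_Test_ID) AS Max_Stored_ID
FROM #tmp_ID10T_Test;
GO
```

If the reported identity value is well above `Max_Stored_ID`, identity values were handed out to rows that never made it into the table.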
-- Repeat Method 1: Expected: Arithmetic overflow error converting IDENTITY to data type tinyint.
INSERT INTO #tmp_ID10T_Test (ID10T_String)
SELECT DISTINCT ID10T_String
FROM #tmp_ID10T_Values
GO
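I am particularly curious why the exception is thrown. I can understand that all 100 unique values were tested in Method 1, so conceivably the query engine saw the potential for 200 records after the second insert attempt. What I do not understand is why it would see the potential for 300 records after the third repetition: the second attempt resulted in 0 rows, so there could be at most 200 unique values.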
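To test that reasoning in isolation, here is a minimal sketch using a hypothetical three-row table (`#tmp_Identity_Probe`, `Probe_ID` and `Probe_String` are made-up names, not part of the script above). It checks whether rows rejected by IGNORE_DUP_KEY = ON still use up identity values, in which case the number of attempted rows, not stored rows, would be what counts against the TINYINT range.

```sql
-- Hypothetical probe table: same shape as #tmp_ID10T_Test, but tiny.
CREATE TABLE #tmp_Identity_Probe (
    Probe_ID     tinyint IDENTITY(1,1) NOT NULL,
    Probe_String varchar(10) NOT NULL,
    PRIMARY KEY CLUSTERED (Probe_String ASC)
        WITH (IGNORE_DUP_KEY = ON)
);

-- First attempt: three new rows.
INSERT INTO #tmp_Identity_Probe (Probe_String) VALUES ('a'), ('b'), ('c');

-- Second attempt: the same three values, all ignored as duplicates.
INSERT INTO #tmp_Identity_Probe (Probe_String) VALUES ('a'), ('b'), ('c');

-- Third attempt: one genuinely new value.
INSERT INTO #tmp_Identity_Probe (Probe_String) VALUES ('d');

-- Which Probe_ID did 'd' receive?
SELECT Probe_ID, Probe_String
FROM #tmp_Identity_Probe
ORDER BY Probe_ID;

DROP TABLE #tmp_Identity_Probe;
```

If 'd' comes back with `Probe_ID` 4, the ignored rows did not consume identity values. A higher value such as 7 would suggest they did, and in the full script three attempts of 100 rows each would then run past 255 even though only 100 rows are stored.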
Can anyone explain this?