In our production system (SQL Server 2008 / R2) there is a table in which generated documents are stored.
The documents have a reference (varchar) and a sequence_nr (int). A document may be generated multiple times, and each iteration gets saved in this table, incrementing the sequence number. Additionally, each record has a data column (varbinary) and a timestamp as well as a user tag.
The table is only queried during inserts and, later on, for auditing purposes.
The primary key for the table is clustered over the reference and sequence_nr columns.
As you can probably guess, the data in the table does not grow in key order, since a document can be generated again at a later time.
I realized this after inserts in the table started timing out.
The inserts are performed with a stored procedure. The stored procedure determines the current max sequence_nr for the given reference and inserts the new row with the next sequence_nr.
I am fairly sure that a poor choice of clustered index is causing the timeout problems: records are inserted for already existing references, only with a different sequence_nr, so they may end up anywhere in the index, most likely not at the end.
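To back up that suspicion, one way to look at how fragmented the clustered index is would be a query along these lines. This is only a sketch: it assumes it is run in the database that holds the table, and the SAMPLED mode is my own choice (it reads part of the index, so it may take a while on a large table).

```sql
-- Sketch: fragmentation and page density for all indexes on dbo.tbl_document.
-- Assumes the current database contains the table; SAMPLED mode is an assumption.
SELECT  i.name                         AS index_name,
        ps.index_type_desc,
        ps.alloc_unit_type_desc,       -- IN_ROW_DATA vs. LOB_DATA (the varbinary(max) column)
        ps.avg_fragmentation_in_percent,
        ps.avg_page_space_used_in_percent,
        ps.page_count
FROM    sys.dm_db_index_physical_stats(
            DB_ID(),                          -- current database
            OBJECT_ID(N'dbo.tbl_document'),   -- table from this question
            NULL,                             -- all indexes
            NULL,                             -- all partitions
            'SAMPLED') AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id;
```

High fragmentation combined with a low page density on the IN_ROW_DATA rows would be consistent with page splits from out-of-order inserts, although it would not by itself prove that this is what causes the timeouts.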
On to my question: would it be better to make the primary key a non-clustered index, or to introduce an identity column, make it the clustered primary key, and keep an index on the combination of reference and sequence_nr?
Note that for the time being (and, as far as we can foresee, not in the future either) there is no need to query this table intensively, except when a new sequence_nr must be determined.
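To make the second option concrete, this is roughly what I have in mind. It is a sketch only: the table name tbl_document_alt, the column document_id and both constraint names are placeholders I made up for illustration, not objects that exist in our system.

```sql
-- Sketch of option two: an ever-increasing surrogate key as the clustered
-- primary key, plus a unique index on (reference, sequence_nr).
-- tbl_document_alt, document_id and the constraint names are hypothetical.
CREATE TABLE [dbo].[tbl_document_alt] (
    [document_id]   INT IDENTITY(1, 1) NOT NULL,
    [reference]     VARCHAR(50)        NOT NULL,
    [sequence_nr]   INT                NOT NULL,
    [creation_date] DATETIME2          NOT NULL,
    [creation_user] NVARCHAR(50)       NOT NULL,
    [document_data] VARBINARY(MAX)     NOT NULL,
    -- New rows are always appended at the end of the clustered index.
    CONSTRAINT [PK_tbl_document_alt]
        PRIMARY KEY CLUSTERED ([document_id] ASC),
    -- Keeps (reference, sequence_nr) unique and supports the MAX(sequence_nr) lookup.
    CONSTRAINT [UQ_tbl_document_alt_reference_sequence_nr]
        UNIQUE NONCLUSTERED ([reference] ASC, [sequence_nr] ASC)
);
```

The first option would instead keep the existing columns and declare the current key as PRIMARY KEY NONCLUSTERED, which, as far as I understand, would leave the table as a heap.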
Edit in answer to questions: Tbh, I'm not sure about the timeout in the production environment. I do know that new documents get added in parallel running processes.
Table:
CREATE TABLE [dbo].[tbl_document] (
    [reference]     VARCHAR(50)    NOT NULL,
    [sequence_nr]   INT            NOT NULL,
    [creation_date] DATETIME2      NOT NULL,
    [creation_user] NVARCHAR(50)   NOT NULL,
    [document_data] VARBINARY(MAX) NOT NULL
);
Primary Key:
ALTER TABLE [dbo].[tbl_document]
    ADD CONSTRAINT [PK_tbl_document] PRIMARY KEY CLUSTERED ([reference] ASC, [sequence_nr] ASC)
    WITH (ALLOW_PAGE_LOCKS = ON, ALLOW_ROW_LOCKS = ON, PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF, STATISTICS_NORECOMPUTE = OFF);
Stored procedure:
CREATE PROCEDURE [dbo].[usp_save_document]
    @reference     NVARCHAR(50),
    @sequence_nr   INT OUTPUT,
    @creation_date DATETIME2,
    @creation_user NVARCHAR(50),
    @document_data VARBINARY(MAX)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @current_sequence_nr INT

    -- Find the highest sequence number already stored for this reference.
    SELECT @current_sequence_nr = MAX(sequence_nr)
    FROM [dbo].[tbl_document]
    WHERE [reference] = @reference

    -- The first document for a reference gets 1; later ones get the next number.
    IF @current_sequence_nr IS NULL
    BEGIN
        SELECT @sequence_nr = 1
    END
    ELSE
    BEGIN
        SELECT @sequence_nr = @current_sequence_nr + 1
    END

    -- Store the new iteration; @sequence_nr is also returned to the caller.
    INSERT INTO [dbo].[tbl_document]
                ([reference],
                 [sequence_nr],
                 [creation_date],
                 [creation_user],
                 [document_data])
    VALUES      (@reference,
                 @sequence_nr,
                 @creation_date,
                 @creation_user,
                 @document_data)
END
Hope that helps.