sql-server - Order of columns in index - insert performance

Question

I've done a bit of research on the order of index columns but am not 100% sure so please bear with me on this! I have the following table:

CREATE TABLE [Valuation]
    (  
      [ValuationID] [int] IDENTITY(1, 1)
                          NOT NULL
                          CONSTRAINT [PK_Valuation] PRIMARY KEY ,  
      VersionID INT NOT NULL ,  
      AlphanumericIdentifier VARCHAR(255) NOT NULL,  
      ...  
      other columns  
      ...  
    )

I do many joins on this table to others on VersionID and AlphanumericIdentifier, so I put an index on it:

CREATE NONCLUSTERED INDEX [IX_Valuation] ON [dbo].[Valuation]   
(
[VersionID] ASC,
[AlphanumericIdentifier] ASC
)

Two questions:

1. These joins are usually done for a specific VersionID, so this is the most selective column and should be the first in the index - correct?
2. Inserts are always done for a single version, which is 1 more than the last version. This should mitigate the performance hit on inserts as the inserted rows are a 'chunk' that can be added to the end of the index. Is this right?

I'm pretty sure that I'm right on 1, but is 2 correct?

Thanks Joe

score 1 · Accepted Answer

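To your questions:

"These joins are usually done for a specific VersionID, so this is the most selective column and should be the first in the index"

Joins have nothing to do with it, except where the join is being used as a filter. Filters (WHERE clause predicates) and sorts (ORDER BY clauses) use indexes. Whether an index is used depends on how many rows satisfy the filter. If the query will return every row in the table (no WHERE clause), then most likely no index will be used, because the query optimizer will (correctly) decide that it may as well read the entire table rather than try to use an index. Indexes are hierarchical tree structures with multiple levels. For each row the query returns, going through the index costs roughly one disk I/O per index level. So if a query will return all 1000 rows in a table, and the index has five levels, that would take 5000 I/Os. Reading the data directly from the table, rather than going through the index, avoids that per-row traversal cost.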
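One way to check this on your own data is to compare the I/O of a filtered and an unfiltered query, and to look at how deep the indexes actually are. The following is only a sketch: it assumes the dbo.Valuation table and IX_Valuation index from the question, and the VersionID value 42 is a made-up example.

```sql
-- Sketch only: assumes dbo.Valuation and IX_Valuation exist as defined in the question.
SET STATISTICS IO ON;

DECLARE @VersionID INT = 42;  -- hypothetical version number

-- Filtered: the predicate is on the leading index column, so a seek on IX_Valuation is possible.
SELECT v.ValuationID, v.AlphanumericIdentifier
FROM dbo.Valuation AS v
WHERE v.VersionID = @VersionID;

-- Unfiltered: every row qualifies, so the optimizer will typically scan rather than seek.
SELECT v.ValuationID, v.AlphanumericIdentifier
FROM dbo.Valuation AS v;

SET STATISTICS IO OFF;

-- Number of levels (index_depth) in each index on the table.
SELECT i.name AS index_name,
       i.type_desc,
       ps.index_depth
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.Valuation'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;
```

Comparing the "logical reads" figures in the Messages output for the two SELECT statements, together with the actual execution plans, should show whether the optimizer chose a seek or a scan in each case.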

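Next, regarding your statement that "This should mitigate the performance hit on inserts as the inserted rows are a 'chunk' that can be added to the end of the index":

This statement is only true if the index is a clustered index. In your schema, the clustered index is the primary key (you can override this, but it is the default behavior), and it is on ValuationID, not on VersionID. So any 'chunk' of inserted records, whether or not they all share the same VersionID, will be added to the end of the clustered index, because they will all have new ValuationIDs.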
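To confirm which index is the clustered one and which columns each index is keyed on, the catalog views can be queried directly. This is a minimal sketch, assuming the table lives in the dbo schema as the CREATE INDEX statement in the question suggests.

```sql
-- Sketch only: lists each index on dbo.Valuation with its type and key columns.
SELECT i.name          AS index_name,
       i.type_desc,                    -- CLUSTERED or NONCLUSTERED
       i.is_primary_key,
       c.name          AS column_name,
       ic.key_ordinal                  -- position of the column within the index key
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id
 AND ic.index_id  = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id
 AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.Valuation')
ORDER BY i.index_id, ic.key_ordinal;
```

With the definitions from the question, one would expect PK_Valuation to be reported as clustered on ValuationID and IX_Valuation as nonclustered on VersionID followed by AlphanumericIdentifier.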

score 0 · Accepted Answer

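Yes, you are right on both counts.

The columns should be ordered according to which ones you query on most, with the leading column being the one you query on often or most often.

Adding rows with an incrementing VersionID means that pages in the middle of the index do not need to be split.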
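If you want to verify this rather than take it on faith, you could measure the fragmentation of the nonclustered index before and after loading a new version. The query below is a sketch under the assumption that the index is named IX_Valuation on dbo.Valuation, as in the question; a figure that stays low after the load would be consistent with rows being appended at the end rather than causing mid-index page splits.

```sql
-- Sketch only: run before and after inserting a new version and compare the results.
DECLARE @object_id INT = OBJECT_ID(N'dbo.Valuation');
DECLARE @index_id  INT = (SELECT index_id
                          FROM sys.indexes
                          WHERE object_id = @object_id
                            AND name = N'IX_Valuation');

SELECT ps.index_id,
       ps.avg_fragmentation_in_percent,  -- logical fragmentation of the leaf level
       ps.fragment_count,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), @object_id, @index_id, NULL, 'LIMITED') AS ps;
```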
