2

I have a database with a single table, Logs, which contains these columns:

  • Id (PK, clustered, int, not null),
  • ServiceName (nvarchar(255), not null), and some other columns such as
  • TaskVariant (nvarchar(1024)),
  • Source (nvarchar(1024)).

I created a non-unique, non-clustered index INDEX_SERVICENAME on the ServiceName column; it includes all columns except Id and ServiceName.

  • The database size is about 4 GB.
  • The table contains about 3 500 000 rows.
  • About 1 400 000 rows have Source = N'IpJob'.
  • About 2 400 000 rows have TaskVariant = N'Ip'.
  • About 600 000 rows have ServiceName = '1' and TaskVariant = N'Ip'.
  • About 350 000 rows have ServiceName = '1' and Source = N'IpJob'.

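Problem:

I want to select rows from the table filtered by ServiceName and by either TaskVariant or Source, with or without paging. My original query selects the last 100 items filtered by Source: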

SELECT TOP (100) 
[Filter1].[Id] AS [Id], 
[Filter1].[Date] AS [Date], 
[Filter1].[Data] AS [Data], 
[Filter1].[ServiceName] AS [ServiceName], 
[Filter1].[LogLevel] AS [LogLevel], 
[Filter1].[StackTrace] AS [StackTrace], 
[Filter1].[TaskVariant] AS [TaskVariant], 
[Filter1].[Source] AS [Source], 
[Filter1].[Message] AS [Message]
FROM ( SELECT [Extent1].[Id] AS [Id], [Extent1].[Date] AS [Date], [Extent1].[Data] AS [Data], [Extent1].[ServiceName] AS [ServiceName], [Extent1].[LogLevel] AS [LogLevel], [Extent1].[StackTrace] AS [StackTrace], [Extent1].[TaskVariant] AS [TaskVariant], [Extent1].[Source] AS [Source], [Extent1].[Message] AS [Message], row_number() OVER (ORDER BY [Extent1].[Id] DESC) AS [row_number]
    FROM [dbo].[Logs] AS [Extent1]
    WHERE (@serviceName = [Extent1].[ServiceName]) AND (@source = [Extent1].[Source])
)  AS [Filter1]
WHERE [Filter1].[row_number] > 0
ORDER BY [Filter1].[Id] DESC

This query runs very fast, in about 00:00:00.

But when I filter by TaskVariant instead, the query (shown next) takes about 00:02:18.

SELECT TOP (100) 
[Filter1].[Id] AS [Id], 
[Filter1].[Date] AS [Date], 
[Filter1].[Data] AS [Data], 
[Filter1].[ServiceName] AS [ServiceName], 
[Filter1].[LogLevel] AS [LogLevel], 
[Filter1].[StackTrace] AS [StackTrace], 
[Filter1].[TaskVariant] AS [TaskVariant], 
[Filter1].[Source] AS [Source], 
[Filter1].[Message] AS [Message]
FROM ( SELECT [Extent1].[Id] AS [Id], [Extent1].[Date] AS [Date], [Extent1].[Data] AS [Data], [Extent1].[ServiceName] AS [ServiceName], [Extent1].[LogLevel] AS [LogLevel], [Extent1].[StackTrace] AS [StackTrace], [Extent1].[TaskVariant] AS [TaskVariant], [Extent1].[Source] AS [Source], [Extent1].[Message] AS [Message], row_number() OVER (ORDER BY [Extent1].[Id] DESC) AS [row_number]
    FROM [dbo].[Logs] AS [Extent1]
    WHERE (@serviceName = [Extent1].[ServiceName]) AND (@taskVariant = [Extent1].[TaskVariant])
)  AS [Filter1]
WHERE [Filter1].[row_number] > 0
ORDER BY [Filter1].[Id] DESC

Question: why does the second query execute so slowly, and how can I fix it?

Thanks a lot for your advice.

Execution plan 1

2 Answers

1

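An index works like a hierarchy or tree whose levels correspond to the columns in it.

So if your index is on ServiceName, TaskVariant, you can quickly filter down to a specific ServiceName, because that is the top level of the tree.

But if you try to filter by TaskVariant alone, you now have to read through the whole index: you can't just jump to a specific TaskVariant, because the same TaskVariant will appear under different ServiceNames.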
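The leading-column behavior described above can be sketched on a small throwaway table. This is only an illustration: #Demo and IX_Demo_ServiceName_TaskVariant are hypothetical names, the table is not the Logs table from the question, and the plan the optimizer actually picks also depends on statistics and row counts.

```sql
-- Hypothetical demo table, not the Logs table from the question.
CREATE TABLE #Demo (
    Id          int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    ServiceName nvarchar(255) NOT NULL,
    TaskVariant nvarchar(100) NULL
);

-- Composite index: ServiceName is the leading key column, TaskVariant the second.
CREATE NONCLUSTERED INDEX IX_Demo_ServiceName_TaskVariant
    ON #Demo (ServiceName, TaskVariant);

-- The leading key column is constrained, so the index can be used to seek
-- to ServiceName = N'1' and, within it, to TaskVariant = N'Ip'.
SELECT Id
FROM #Demo
WHERE ServiceName = N'1' AND TaskVariant = N'Ip';

-- The leading key column is not constrained: matching rows may sit under
-- every ServiceName, so the index can only be scanned, not seeked.
SELECT Id
FROM #Demo
WHERE TaskVariant = N'Ip';

DROP TABLE #Demo;
```

Comparing the actual execution plans of the two SELECT statements should show whether each one uses an Index Seek or an Index Scan.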

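If you want to filter on TaskVariant, you need another index that starts with TaskVariant. Note: don't just create full indexes on every column. Each index takes up extra space and requires more work on UPDATEs and INSERTs.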
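As a rough sketch, an index that starts with TaskVariant on the question's table might look like the statement below. The index name IX_Logs_TaskVariant_ServiceName and the choice of ServiceName as the second key column are assumptions, not something stated in the answer. One caveat worth checking first: TaskVariant is declared as nvarchar(1024), which can hold up to 2048 bytes, more than SQL Server allows for an index key (900 bytes, or 1700 bytes for nonclustered indexes from SQL Server 2016 on). SQL Server still creates such an index but emits a warning, and an INSERT or UPDATE whose key values exceed the limit will fail.

```sql
-- Hypothetical index name; key order puts TaskVariant first so that
-- a filter on TaskVariant can seek instead of scanning.
-- Caveat: the declared key width exceeds the index key size limit, so expect
-- a warning here; rows with very long TaskVariant values would be rejected.
CREATE NONCLUSTERED INDEX IX_Logs_TaskVariant_ServiceName
    ON dbo.Logs (TaskVariant, ServiceName);
```

If real TaskVariant values are short, narrowing the column type before indexing it may be the simpler route, though that is a schema decision the thread does not discuss.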

answered 2013-08-15T16:13:04.660
0

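The difference in execution time you are seeing is mainly because the first query has an index and the second does not. As for why the difference is so large, it is most likely because having an index means the values are sorted.

Because the values are sorted, very efficient string search algorithms can be used, which makes the number of operations needed for filtering orders of magnitude smaller.

There are also many other factors that can affect this. It may be that the whole index is in memory while the table data is not, so the filtering in the first query can be done entirely in memory without ever touching the disk, while the other query might not be so lucky.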
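One way to check the memory-versus-disk guess is to turn on SQL Server's I/O and timing statistics and run each query. The sketch below uses a simplified version of the slow query (fewer columns, no row_number wrapper), with parameter values taken from the question's row counts; treat it as an assumption-laden diagnostic, not the original query.

```sql
-- Report page reads and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Sample parameter values based on the figures quoted in the question.
DECLARE @serviceName nvarchar(255)  = N'1';
DECLARE @taskVariant nvarchar(1024) = N'Ip';

-- Simplified form of the slow query.
SELECT TOP (100) Id, ServiceName, TaskVariant
FROM dbo.Logs
WHERE ServiceName = @serviceName
  AND TaskVariant = @taskVariant
ORDER BY Id DESC;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

In the messages output, "logical reads" counts pages read from the buffer cache and "physical reads" counts pages fetched from disk, so a large physical-read figure for one query and not the other would support the explanation above.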

answered 2013-08-15T16:36:37.253