sql-server-2005 - Fastest SQL query for non-indexed data

Question

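I'm building some custom reports against a SQL Server 2005 database. The database belongs to a third-party management application that we run. The data I'm pulling is not part of the application's primary use, so it is largely unindexed apart from the timestamp column. For now only one table is involved, and it has roughly 700 million rows. So when I run a query against it that should return only 50 rows, it has to scan all 700 million.

I'd like to speed this up, but I don't want to index every column I add to the WHERE clause. I doubt that adding that many indexes would end up improving speed much (or am I wrong?). So I'm curious what the best practice is if I can't add any new indexes to the table!

Stored procedures don't seem like the right fit. Would an indexed view be the best idea? Thoughts?

Here is the table schema: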

DeviceGuid (PK, uniqueidentifier, not null)
DeviceID (int, not null)
WindowsEventID (PK, int, not null) (indexed)
EventLog (varchar(64), not null)
EventSource (varchar(64), not null)
EventID (int, not null)
Severity (int, not null)
Description (nvarchar(max), not null)
TimeOfEvent (PK, datetime, not null) (indexed)
OccurrenceNbr (int, not null)

Here is a sample query:

SELECT COUNT(*) AS NumOcc, EventID, EventLog, EventSource, Severity, Description
FROM WindowsEvent
WHERE DeviceID='34818'
    AND Severity=1
    AND TimeOfEvent >= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/10/27 12:00:00 AM')
    AND TimeOfEvent <= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/11/3 12:00:00 AM')
    AND EventID<>34113
    AND EventID<>34114
    AND EventID<>34112
    AND EventID<>57755
    AND EventSource<>'AutoImportSvc.exe'
    AND EventLog='Application'
GROUP BY EventID, EventLog, EventSource, Severity, Description
ORDER BY NumOcc DESC

Maybe the query is just bad... it returns 53 rows in 4.5 minutes.

score 0 · Accepted Answer

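If your query isn't using any index, that is going to be very bad. You don't need an index on every column, but you do need one on the right column. Given that TimeOfEvent is already indexed, it may not be the right one for your needs.

The right column will depend on the distribution of your data. The best index is probably the one that provides the highest selectivity (that is, when you know the key value of the index, it returns the fewest rows). If you know which column provides the best selectivity, you can try indexing it.

To help determine the best index, you can use Display Estimated Execution Plan in SSMS. This will show you which index would be used. After adding an index, you can run the query and evaluate the result with the execution plan. Watching the elapsed time helps too, of course.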
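One rough way to compare candidate columns is to count their distinct values over a slice of the table. The following is only a sketch: it limits the scan to the one-week window from the question so that the existing TimeOfEvent index can be used, and it assumes that one week is representative of the whole table. The variable names @from and @to are made up for illustration.

```sql
-- Sketch: rough selectivity check over one week of data.
-- Assumption: the sampled week is representative of the full table.
DECLARE @from datetime, @to datetime;
SET @from = DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/10/27 12:00:00 AM');
SET @to   = DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/11/3 12:00:00 AM');

SELECT  COUNT(*)                 AS TotalRows,
        COUNT(DISTINCT DeviceID) AS DistinctDeviceID,
        COUNT(DISTINCT Severity) AS DistinctSeverity,
        COUNT(DISTINCT EventLog) AS DistinctEventLog,
        COUNT(DISTINCT EventID)  AS DistinctEventID
FROM    WindowsEvent
WHERE   TimeOfEvent >= @from
        AND TimeOfEvent <= @to;
```

A column whose distinct count is close to TotalRows narrows the result the most for an equality test, while a column with only a handful of distinct values (Severity is a likely candidate) is rarely worth indexing on its own.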
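If adding a single index turns out to be acceptable, the measure-then-compare loop might look like the sketch below. The index name IX_WindowsEvent_DeviceID_TimeOfEvent and the choice of key and included columns are assumptions made for illustration, not something the question confirms. Building an index on a table of about 700 million rows is expensive and may not be supported by the third-party vendor.

```sql
-- Report I/O and timing for each statement in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Hypothetical index: equality column first, range column second.
-- The INCLUDE list covers the small filter columns so they can be
-- checked without visiting the base table for every candidate row.
CREATE NONCLUSTERED INDEX IX_WindowsEvent_DeviceID_TimeOfEvent
ON WindowsEvent (DeviceID, TimeOfEvent)
INCLUDE (Severity, EventID, EventLog, EventSource);

-- Re-run the report query here, then compare logical reads and
-- elapsed time against the numbers recorded before the index existed.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Description is left out of the INCLUDE list on purpose: as an nvarchar(max) column it would make the index much larger, and it only needs to be fetched for the rows that survive the other filters.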

score 0 · Accepted Answer

Try this approach, which uses the double row_number trick:

SELECT  RN_Desc as NumOcc, *
FROM    (
        SELECT  row_number() Over(partition by EventId order by EventLog, EventSource, Severity, Description) as RN_Asc,
                row_number() Over(partition by EventId order by EventLog desc, EventSource desc, Severity desc, Description desc) as RN_Desc,
                *
        FROM    WindowsEvent 
        WHERE   DeviceID='34818' 
                AND Severity=1 
                AND TimeOfEvent >= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/10/27 12:00:00 AM') 
                AND TimeOfEvent <= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/11/3 12:00:00 AM') 
                AND EventID<>34113 
                AND EventID<>34114 
                AND EventID<>34112 
                AND EventID<>57755 
                AND EventSource<>'AutoImportSvc.exe' 
                AND EventLog='Application' 
        ) t
WHERE   RN_Asc = 1 
ORDER BY NumOcc DESC

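With this, the engine doesn't need to do any aggregation and makes just a single pass over the table. If it doesn't work, try adjusting the ORDER BY and PARTITION BY parts of the row_number calls to get the correct grouping.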
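One possible adjustment, offered as a sketch rather than as the answerer's own query: partition by every column the original GROUP BY used, and order both row numbers by a key that is unique per row, once ascending and once descending. The row with RN_Asc = 1 then carries RN_Desc equal to the number of rows in its group. This assumes that DeviceGuid, WindowsEventID and TimeOfEvent together form the primary key, as the PK markers in the schema suggest.

```sql
SELECT  RN_Desc AS NumOcc, EventID, EventLog, EventSource, Severity, Description
FROM    (
        SELECT  -- 1 for the first row of each group in key order
                ROW_NUMBER() OVER (PARTITION BY EventID, EventLog, EventSource, Severity, Description
                                   ORDER BY TimeOfEvent, WindowsEventID, DeviceGuid) AS RN_Asc,
                -- on that same row, the reverse numbering equals the group size
                ROW_NUMBER() OVER (PARTITION BY EventID, EventLog, EventSource, Severity, Description
                                   ORDER BY TimeOfEvent DESC, WindowsEventID DESC, DeviceGuid DESC) AS RN_Desc,
                EventID, EventLog, EventSource, Severity, Description
        FROM    WindowsEvent
        WHERE   DeviceID = 34818
                AND Severity = 1
                AND TimeOfEvent >= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/10/27 12:00:00 AM')
                AND TimeOfEvent <= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/11/3 12:00:00 AM')
                AND EventID NOT IN (34113, 34114, 34112, 57755)
                AND EventSource <> 'AutoImportSvc.exe'
                AND EventLog = 'Application'
        ) t
WHERE   RN_Asc = 1
ORDER BY NumOcc DESC;
```

The unique ordering key matters: if the ORDER BY columns can tie within a partition, the two numberings are no longer exact mirrors and RN_Desc on the RN_Asc = 1 row may not equal the group count. It is worth comparing the plan with the GROUP BY version, since the window functions still require sorting.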

score 0 · Accepted Answer

The final solution here was to run a query against the indexed fields, then filter them within the application running the query. Two fields ended up containing similar enough information that I could query against one index and get a very close approximation of the data I wanted. I looped back through and removed any non-matching entities from the result list. Took MUCH less time!
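The answer above does the second filtering step in application code, which is not shown. As a server-side analogue of the same two-step idea, the sketch below first pulls rows using only the indexed TimeOfEvent column into a temporary table, then applies the remaining filters to that smaller set. The temporary table name #WindowEvents and the variables @from and @to are hypothetical, and whether this beats the single query depends on how many rows fall inside the time window.

```sql
DECLARE @from datetime, @to datetime;
SET @from = DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/10/27 12:00:00 AM');
SET @to   = DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/11/3 12:00:00 AM');

-- Step 1: restrict on the indexed column only.
SELECT  DeviceID, EventID, EventLog, EventSource, Severity, Description
INTO    #WindowEvents
FROM    WindowsEvent
WHERE   TimeOfEvent >= @from
        AND TimeOfEvent <= @to;

-- Step 2: apply the non-indexed filters to the much smaller set.
SELECT  COUNT(*) AS NumOcc, EventID, EventLog, EventSource, Severity, Description
FROM    #WindowEvents
WHERE   DeviceID = 34818
        AND Severity = 1
        AND EventID NOT IN (34113, 34114, 34112, 57755)
        AND EventSource <> 'AutoImportSvc.exe'
        AND EventLog = 'Application'
GROUP BY EventID, EventLog, EventSource, Severity, Description
ORDER BY NumOcc DESC;

DROP TABLE #WindowEvents;
```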

score 0 · Accepted Answer

This is pretty simpleminded, but I'd try the indexed value as the first test in the

Answered 2016-07-14T22:34:53.650
