I have a table that stores dynamic user data as key-value pairs. Something like this:
UserId | Key | Value
---------------------------------
1 | gender | male
1 | country | Australia
2 | gender | male
2 | country | US
3 | gender | female
3 | country | Spain
Now I need to select the users that match certain parameters, for example: gender is 'male' AND country is 'US'. Or, more generally:
key1=value1 AND key2=value2 AND key3=value3 AND ...
The fastest way I have found to do this is the following:
WHERE key=(key1)
AND value=value1
AND EXISTS(SELECT 1
FROM (...)
WHERE key=key2
AND value=value2)
AND EXISTS(SELECT 1
FROM (...)
WHERE key=key3
AND value=value3)
AND EXISTS(...)
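To make the pattern above concrete, here is a minimal sketch of the two-condition case (country = 'US' AND gender = 'male') run against the sample rows. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it can be run anywhere. The table name PERFTEST is taken from the statistics output further down; the column types and the correlation on UserId inside the EXISTS are my assumptions about what the elided parts look like.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
# Column types are assumed; [Key] and [Value] are bracket-quoted as in T-SQL.
conn.execute("CREATE TABLE PERFTEST (UserId INTEGER, [Key] TEXT, [Value] TEXT)")
conn.executemany(
    "INSERT INTO PERFTEST VALUES (?, ?, ?)",
    [
        (1, "gender", "male"), (1, "country", "Australia"),
        (2, "gender", "male"), (2, "country", "US"),
        (3, "gender", "female"), (3, "country", "Spain"),
    ],
)

# The first predicate drives the outer query; each further key/value
# pair becomes a correlated EXISTS on the same UserId.
exists_sql = """
SELECT t1.UserId
FROM PERFTEST AS t1
WHERE t1.[Key] = 'country'
  AND t1.[Value] = 'US'
  AND EXISTS (SELECT 1
              FROM PERFTEST AS t2
              WHERE t2.UserId = t1.UserId
                AND t2.[Key] = 'gender'
                AND t2.[Value] = 'male')
"""
exists_users = [row[0] for row in conn.execute(exists_sql)]
print(exists_users)  # [2]
```

Only user 2 has both gender = 'male' and country = 'US' in the sample data.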
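With this approach, I get the best results when the first WHERE filter is on the key whose values are more evenly distributed and more selective.
For example, "gender" might be 99% male and 1% female, while country might split the whole population into 100 parts of similar size. In that case I need to filter by country first and use EXISTS for the gender condition.
Question: Is there any way in SQL Server 2008 R2 to get index statistics to find out which clause is best placed first (that is, outside the EXISTS)?
Alternative question: I think this is the best approach, but rewriting the query into a form that is always optimal would also be a solution.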
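To illustrate what "most selective clause first" means, here is a rough sketch that orders the predicates by how many rows each one matches. It does not read SQL Server's index statistics; it simply runs a COUNT(*) per key/value pair, which is a manual stand-in for that information. The helper name order_by_selectivity is hypothetical, and sqlite3 is again only a substitute for SQL Server.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERFTEST (UserId INTEGER, [Key] TEXT, [Value] TEXT)")
conn.executemany(
    "INSERT INTO PERFTEST VALUES (?, ?, ?)",
    [
        (1, "gender", "male"), (1, "country", "Australia"),
        (2, "gender", "male"), (2, "country", "US"),
        (3, "gender", "female"), (3, "country", "Spain"),
    ],
)


def order_by_selectivity(conn, predicates):
    """Hypothetical helper: sort (key, value) pairs so that the pair
    matching the fewest rows comes first."""
    counts = {}
    for key, value in predicates:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM PERFTEST WHERE [Key] = ? AND [Value] = ?",
            (key, value),
        ).fetchone()
        counts[(key, value)] = n
    return sorted(predicates, key=lambda p: counts[p])


ordered = order_by_selectivity(conn, [("gender", "male"), ("country", "US")])
print(ordered)  # [('country', 'US'), ('gender', 'male')]
```

In the sample data gender = 'male' matches two rows and country = 'US' matches one, so the country predicate would go first and gender would go into the EXISTS.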
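Solution information:
The correct solution is the one explained by @usr below (using INTERSECT). In fact, it seems I was doing something wrong, and the engine also handles the EXISTS version correctly. To provide more information, I will share the IO and TIME statistics and the execution plans for the options I tested:
Using INTERSECT: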
Table 'PERFTEST'. Scan count 2, logical reads 113, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
Using EXISTS:
Table 'PERFTEST'. Scan count 2, logical reads 113, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 3 ms.
(Note the extra Stream Aggregate step)
Using INNER JOIN:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'PERFTEST'. Scan count 2, logical reads 113, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 25 ms.
Conclusion:
INTERSECT is slightly faster than EXISTS in this case. The INNER JOIN option is much slower.
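For reference, here is a sketch of what the INTERSECT and INNER JOIN variants might look like for the same two conditions. The exact queries that were timed are not shown in this post, so these are assumed forms, and sqlite3 is used only as a runnable stand-in for SQL Server. The sketch checks that both return the same users; it says nothing about timings or execution plans.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERFTEST (UserId INTEGER, [Key] TEXT, [Value] TEXT)")
conn.executemany(
    "INSERT INTO PERFTEST VALUES (?, ?, ?)",
    [
        (1, "gender", "male"), (1, "country", "Australia"),
        (2, "gender", "male"), (2, "country", "US"),
        (3, "gender", "female"), (3, "country", "Spain"),
    ],
)

# One SELECT per key/value pair; INTERSECT keeps only the UserIds
# present in every branch, so no branch has to be chosen as "first".
intersect_sql = """
SELECT UserId FROM PERFTEST WHERE [Key] = 'gender' AND [Value] = 'male'
INTERSECT
SELECT UserId FROM PERFTEST WHERE [Key] = 'country' AND [Value] = 'US'
"""

# Self-join variant: one aliased copy of the table per key/value pair.
join_sql = """
SELECT g.UserId
FROM PERFTEST AS g
INNER JOIN PERFTEST AS c ON c.UserId = g.UserId
WHERE g.[Key] = 'gender' AND g.[Value] = 'male'
  AND c.[Key] = 'country' AND c.[Value] = 'US'
"""

intersect_users = sorted(row[0] for row in conn.execute(intersect_sql))
join_users = sorted(row[0] for row in conn.execute(join_sql))
print(intersect_users, join_users)  # [2] [2]
```

One difference to keep in mind: INTERSECT removes duplicate UserIds, while the join can return a user more than once if the same (UserId, Key, Value) row is stored twice.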