This query takes about 01:30 to run:

select DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
       , count(t2.UserId)
       , count(*) - count(t2.UserId)
from Events t1
left join (select c.UserId, min(c.OccurredOn) FirstOccurred
           from Events c
           where [OccurredOn] between @start and @end
           group by c.UserId) t2 on t1.OccurredOn = t2.FirstOccurred and t1.UserId = t2.UserId
where t1.EventType = @eventType
    and t1.[OccurredOn] between @start and @end
group by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
order by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))

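If I remove the WHERE clause from the subquery, it runs immediately.

Running the subquery on its own, with the WHERE clause, takes < 1s.

If I first SELECT the subquery into a table variable and then join on that variable, the whole query runs in 19 seconds.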
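A sketch of that table-variable variant, for reference. The variable name @firstEvents is hypothetical; everything else comes from the query above.

```sql
-- Sketch only: @firstEvents is a hypothetical name. @start, @end and @eventType
-- are the same parameters used by the original query.
declare @firstEvents table (
    UserId uniqueidentifier not null,
    FirstOccurred datetime not null
);

-- Materialize each user's first event in the date range once.
insert into @firstEvents (UserId, FirstOccurred)
select c.UserId, min(c.OccurredOn)
from Events c
where c.OccurredOn between @start and @end
group by c.UserId;

-- Same outer query as before, joined to the table variable instead of the subquery.
select DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
       , count(t2.UserId)
       , count(*) - count(t2.UserId)
from Events t1
left join @firstEvents t2 on t1.OccurredOn = t2.FirstOccurred and t1.UserId = t2.UserId
where t1.EventType = @eventType
    and t1.[OccurredOn] between @start and @end
group by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
order by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]));
```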

The Events table looks like this:

[Events](
    [EventType] [uniqueidentifier] NOT NULL,
    [UserId] [uniqueidentifier] NOT NULL,
    [OccurredOn] [datetime] NOT NULL,
)

I have the following primary, nonclustered, non-unique indexes:

  • EventType
  • UserId
  • OccurredOn

Here is the execution plan:

[image: execution plan]

I am using SQL Server 2008.

Two questions:

  1. What is making this slow?
  2. How can I speed it up?

2 Answers

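You can try replacing LEFT JOIN with LEFT MERGE JOIN, so that the derived table t2 is computed only once rather than having the MIN recomputed for each user, possibly many times.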
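As a minimal sketch, the hint goes between LEFT and JOIN; the rest of the query is unchanged from the question. Note that a local join hint also makes SQL Server enforce the written join order for the query, so compare the resulting plan and timings rather than assuming it helps.

```sql
-- Same query as in the question; only the join keyword changes.
select DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
       , count(t2.UserId)
       , count(*) - count(t2.UserId)
from Events t1
left merge join (select c.UserId, min(c.OccurredOn) FirstOccurred
                 from Events c
                 where [OccurredOn] between @start and @end
                 group by c.UserId) t2 on t1.OccurredOn = t2.FirstOccurred and t1.UserId = t2.UserId
where t1.EventType = @eventType
    and t1.[OccurredOn] between @start and @end
group by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))
order by DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]));
```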

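You could also rewrite it using a ranking function, as below. It may be cheaper. You will need to test these ideas against your own data and indexes.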

;WITH T AS
(
SELECT *,
       RANK() OVER (PARTITION BY UserId ORDER BY OccurredOn) AS Rnk
FROM Events
WHERE [OccurredOn] BETWEEN @start AND @end
)
SELECT Dateadd(dd, 0, Datediff(dd, 0, OccurredOn)),
       COUNT(CASE WHEN Rnk =1 THEN 1 END),
       COUNT(CASE WHEN Rnk >1 THEN 1 END)
FROM T
WHERE EventType = @eventType      
GROUP BY Dateadd(dd, 0, Datediff(dd, 0, OccurredOn)) 
ORDER BY Dateadd(dd, 0, Datediff(dd, 0, OccurredOn)) 
answered 2012-07-21T11:13:24.460

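Your query is slow because your sorting depends on a value computed on the fly (DATEADD(dd, 0, DATEDIFF(dd, 0, t1.[OccurredOn]))), and SQL Server cannot use an index on an expression computed on the fly.

PostgreSQL has indexes on expressions. With PostgreSQL, the result of your expression is in effect saved to an actual column (a behind-the-scenes column), so when you need to sort on that expression, PostgreSQL can use the index on that expression.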
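For illustration only, an expression index in PostgreSQL could be declared roughly like this. The table and index names (person, ix_person_fullname) are hypothetical, and this is PostgreSQL syntax, not T-SQL.

```sql
-- PostgreSQL, not SQL Server. Table and index names are hypothetical.
create table person
(
    lastname varchar(50) not null,
    firstname varchar(50) not null
);

-- The extra parentheses around the indexed expression are required.
create index ix_person_fullname on person ((lastname || ', ' || firstname));

-- A query that sorts on the same expression is a candidate for using the index.
select lastname, firstname
from person
order by lastname || ', ' || firstname
limit 1000;
```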

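The closest analogous feature in SQL Server is a persisted formula (a persisted computed column).

You can easily verify this feature with this sample query: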

create table PersonX
(
Lastname varchar(50) not null,
Firstname varchar(50) not null
);

create table PersonY
(
Lastname varchar(50) not null,
Firstname varchar(50) not null
);


alter table PersonX add Fullname as Lastname + ', ' + Firstname PERSISTED;    
create index ix_PersonX on PersonX(Fullname);

declare @i int = 0;

while @i < 10000 begin
    insert into PersonX(Lastname,Firstname) values('Lennon','John');
    insert into PersonY(Lastname,Firstname) values('Lennon','John');
    set @i = @i + 1;
end;


select top 1000 Lastname, Firstname
from PersonX
order by Fullname;


select top 1000 Lastname, Firstname
from PersonY
order by Lastname + ', ' + Firstname;

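Ordering by Fullname on PersonX is faster than on PersonY. The PersonX query accounts for only 32% of the cost, versus 68% for PersonY.

To fix your query's performance problem, do the following: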

alter table Events 
    add OccurenceGroup as 
        DATEADD(dd, 0, DATEDIFF(dd, 0, [OccurredOn])) PERSISTED

create index ix_Events on Events(OccurenceGroup);

Then group and sort on OccurenceGroup.
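Assuming the computed column above has been added, the query from the question might then be written like this (a sketch; only the grouping expression changes):

```sql
-- OccurenceGroup is the persisted computed column defined above.
select t1.OccurenceGroup
       , count(t2.UserId)
       , count(*) - count(t2.UserId)
from Events t1
left join (select c.UserId, min(c.OccurredOn) FirstOccurred
           from Events c
           where c.[OccurredOn] between @start and @end
           group by c.UserId) t2 on t1.OccurredOn = t2.FirstOccurred and t1.UserId = t2.UserId
where t1.EventType = @eventType
    and t1.[OccurredOn] between @start and @end
group by t1.OccurenceGroup
order by t1.OccurenceGroup;
```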


By the way, have you added an index on OccurredOn and EventType?
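If not, a composite index along these lines may be worth testing. The index name and the INCLUDE column are my assumptions, not something from the question; check the execution plan to see whether it is actually used.

```sql
-- Hypothetical index: EventType first (equality filter), then OccurredOn (range filter).
-- INCLUDE (UserId) lets the outer part of the query read UserId from the index itself.
create nonclustered index ix_Events_EventType_OccurredOn
    on Events (EventType, OccurredOn)
    include (UserId);
```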

answered 2012-07-21T09:44:28.483