I have a table named Transactions that currently holds more than 6 million rows (it grows by roughly 600,000-700,000 rows per month). It looks like this:
pk id acct_id id1 id2 id3 id4 created interface_id source_lvl1 source_lvl2 trans_type
------------------------------------------------------------ ----------- ----------- ----------- ----------- ----------- ----------- ----------------------- ------------ ----------- ----------- -----------
10000257.4297...400245990.3.1002 10000257 4297 NULL NULL NULL NULL 2012-09-06 11:26:30.000 1 32002 1002 3
10004819.1529.106.105442.400667675.6.1021 10004819 1529 106 105442 62 NULL 2012-09-11 08:34:35.000 4 32002 1021 6
10004819.1529.18664647.62.400667675.3.1021 10004819 1529 18664647 62 NULL NULL 2012-09-11 08:34:35.000 4 32002 1021 3
10006460.1529.106.105442.400667675.6.1021 10006460 1529 106 105442 62 NULL 2012-09-11 08:34:35.000 4 32002 1021 6
10006460.1529.18664647.62.400667675.3.1021 10006460 1529 18664647 62 NULL NULL 2012-09-11 08:34:35.000 4 32002 1021 3
10006648.3280...406204785.3.1002 10006648 3280 NULL NULL NULL NULL 2012-11-14 10:39:45.000 6 32002 1002 3
10006834.1529.106.105442.400667675.6.1021 10006834 1529 106 105442 62 NULL 2012-09-11 08:34:35.000 4 32002 1021 6
10006834.1529.18664647.62.400667675.3.1021 10006834 1529 18664647 62 NULL NULL 2012-09-11 08:34:35.000 4 32002 1021 3
10006962.2428...415795811.3.1018 10006962 2428 NULL NULL NULL NULL 2013-03-05 10:50:11.000 1 32002 1018 3
10006962.2428.107972..415795811.4.1018 10006962 2428 107972 NULL NULL NULL 2013-03-05 10:50:11.000 1 32002 1018 4
I have defined a view that is meant to help with counting specific events.
Here is its SQL definition:
CREATE VIEW [dbo].[Queue_base]
AS
select
dateadd(minute , (DATEPART(minute,t.created)/30)*30 , DATEADD(hour,datediff(hour, 0, t.created), 0)) INTRVL_UTC,
dateadd(minute , (DATEPART(minute,t.created)/30)*30 + 30 , DATEADD(hour,datediff(hour, 0, t.created), 0)) INTRVL_END_UTC,
a.ID [Agent ID], a.Login, a.DisplayName, a.GroupName, q.QueueID, q.QueueName,
TODATETIMEOFFSET(t.created,0) created
,i.ReferenceNumber, t.id inc_id
, case when (t.trans_type=17 and t.source_lvl2 not IN (1001, 2001)) or (t.trans_type=6 and t.id1=8) then t.id else null end [Workload]
, case when (t.trans_type=6 and t.id1=8 and t.source_lvl2 not IN (1001, 2001) or (t.trans_type=17 and not t.source_lvl2 IN (1001,2001)))then t.id else null end [Inbound Emails]
, case when t.trans_type=17 and t.id1=q.QueueID then t.id else null end [EnQueued]
, case when t.trans_type=17 and t.id2=q.QueueID then t.id else null end [DeQueued]
, case when t.trans_type=6 and t.id1 IN (2,106) then t.id else null end [Solved]
, case when t.trans_type=6 and t.id1 =8 then t.id else null end [Updated]
, case when x.StatusTypeID = 2 then t.id else null end [Reopened]
, case when t.trans_type=6 and t.id1=125 then t.id else null end [Spam]
, case when t.trans_type=8 and t.acct_id <> 1 then t.id else null end [Responded]
, case when i.cr_rec_element_1 is not null or i.de_reason1 is not null then t.id else null end [Complaint]
,t.trans_type, t.id1
,r.Brand, r.Region, r.[Call Center], r.LOB, r.[LOB Detail], r.Team, r.Subteam, r.Channel
,r.Interface, r.Product, r.[Product Detail], r.Unit
from Transactions t
left join
(
select a.*, b.id1, st.StatusTypeID
from
(select
t1.pk, t1.id, t1.created, max(t2.created) maxdate
from Transactions t1
left join Transactions t2
on t1.id=t2.id and t2.created<t1.created and t2.trans_type=6
left join Status st on t2.id1=st.StatusID
where t1.trans_type=6 and t1.id1=8
group by t1.pk, t1.id, t1.created) a left join Transactions b on a.id=b.id and b.created=a.maxdate and b.trans_type=6
left join Status st on b.id1=st.statusid
)
x on t.pk=x.pk
left join Incident i on t.id=i.id
left join Account a on t.acct_id=a.ID
left join Queue q ON (t.trans_type=17 and (t.id1=q.QueueID or t.id2=q.QueueID) or t.trans_type IN (6,8) and t.id3=q.QueueID)
left join queuedim r ON (q.QueueName=r.QueueName or q.QueueName is null and r.QueueName is null)
and (q.QueueID=r.QueueID or q.QueueID is null and r.QueueID is null)
where t.trans_type=17 or t.trans_type IN (6,8)
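As a sanity check on the two interval columns, the snippet below runs the same expressions on a single value. It is only a sketch: the variable @created is introduced here for illustration, and its value is copied from the first sample row of Transactions above. The integer division by 30 floors the minute to 0 or 30, so each row lands in a half-hour bucket.

```sql
-- @created is a stand-in for t.created (illustration only)
declare @created datetime
set @created = '2012-09-06 11:26:30.000'

select
    -- truncate to the hour, then add 0 or 30 minutes (26/30 = 0 with integer division)
    dateadd(minute, (datepart(minute, @created)/30)*30,
            dateadd(hour, datediff(hour, 0, @created), 0)) INTRVL_UTC,
    -- same bucket start plus 30 minutes
    dateadd(minute, (datepart(minute, @created)/30)*30 + 30,
            dateadd(hour, datediff(hour, 0, @created), 0)) INTRVL_END_UTC
-- INTRVL_UTC     = 2012-09-06 11:00:00.000
-- INTRVL_END_UTC = 2012-09-06 11:30:00.000
```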
This is the key part of the view's output:
inc_id Workload Inbound Emails EnQueued DeQueued Solved Updated Reopened Spam Responded Complaint
----------- ----------- -------------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
10209648 NULL NULL NULL NULL 10209648 NULL NULL NULL NULL NULL
10209648 NULL NULL NULL NULL NULL NULL NULL NULL 10209648 NULL
10209648 10209648 NULL NULL NULL NULL 10209648 NULL NULL NULL NULL
10227966 NULL NULL NULL NULL NULL NULL NULL 10227966 NULL NULL
10288343 NULL NULL NULL NULL 10288343 NULL NULL NULL NULL NULL
10303898 NULL NULL NULL NULL 10303898 NULL NULL NULL NULL NULL
10394204 NULL NULL NULL NULL NULL NULL NULL 10394204 NULL NULL
10409624 NULL NULL NULL NULL 10409624 NULL NULL NULL NULL NULL
10482071 NULL NULL NULL NULL NULL NULL NULL 10482071 NULL NULL
10485993 NULL NULL NULL NULL NULL NULL NULL 10485993 NULL NULL
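Each event column holds the incident id when the row matches the event and NULL otherwise, so COUNT(column) gives the number of events while COUNT(DISTINCT column) gives the number of incidents involved. A minimal illustration of that convention, using a table variable with made-up rows (the data and the name @sample are hypothetical, not taken from the real table):

```sql
-- hypothetical rows: incident 1 solved twice, incident 2 not solved, incident 3 solved once
declare @sample table (inc_id int, Solved int)
insert into @sample (inc_id, Solved) values (1, 1)
insert into @sample (inc_id, Solved) values (1, 1)
insert into @sample (inc_id, Solved) values (2, NULL)
insert into @sample (inc_id, Solved) values (3, 3)

select count(Solved)          [# Solved]                   -- NULLs are ignored: 3
     , count(distinct Solved) [Distinct Incidents Solved]  -- distinct ids 1 and 3: 2
from @sample
```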
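My plan is to create another table and keep it updated with the aggregated results I am interested in, grouped by combinations of date period and other dimensions. The problem is that I need both distinct counts and plain counts of the events described above. While the view returns the raw rows quickly, a query that computes the counts on top of it takes a very long time: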
-- month account
declare @d1 date
declare @d2 date
set @d1 = '2013-05-01'
set @d2 = '2013-06-01'
--insert into IncPerfQueue
select x.Brand, x.Region, x.[Call Center], x.LOB, x.[LOB Detail], x.Team, x.Subteam,
x.QueueName, case when x.[Agent ID] is null then 0 else [Agent ID] end, c.[month], NULL weekstart, NULL [date]
, count(distinct EnQueued) [Distinct Incidents EnQueued]
, count(distinct DeQueued) [Distinct Incidents DeQueued]
, count(distinct Solved) [Distinct Incidents Solved in the queue]
, COUNT(distinct Responded) [Distinct Incidents Responded in the queue]
, COUNT(distinct Updated) [Distinct Incidents Updated in the queue]
, count(distinct Reopened) [Distinct Incidents ReOpened in the queue]
, count(distinct Spam) [Distinct Spam closed in the queue]
, COUNT([Inbound Emails]) [Inbound Emails]
, COUNT(Workload) [Workload]
, count(EnQueued) [# EnQueued]
, count(DeQueued) [# DeQueued]
, count(Solved) [# Solved in the queue]
, COUNT(Responded) [# Responded in the queue]
, COUNT(Updated) [#Updated in the queue]
, count(Reopened) [# ReOpened in the queue]
, count(Spam) [# Spam closed in the queue]
from Queue_base x
join [calendar] c ON convert(date,x.created)=c.date
where x.created >= @d1 and x.created < @d2
and Brand is not null
group by x.Brand, x.Region, x.[Call Center], x.LOB, x.[LOB Detail], x.Team, x.Subteam,
x.QueueName, [Agent ID], c.month
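This is only one of the queries I need, because a separate aggregation is required for each set of dimensions (the distinct counts differ per grouping), and it took more than an hour! http://i.stack.imgur.com/oWibJ.png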
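I would appreciate advice on the best approach for this kind of query. The base table will certainly grow much larger before long... should I partition it? I should also note that all the tables referenced here are indexed, and that I am using Microsoft SQL Server 2008 R2 (SP2) (X64), installed on a machine with 2 X5550 processors and 48 GB of RAM, running Windows Server 2008 R2 Enterprise.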
Thanks, 马朱