I have the following views defined:
dsplit_base - a union of 4 queries, each of which is a simple join between fact and mapping tables (it contains call statistics); it has 201 columns
calls_check - a view derived from dsplit_base, meant to be used in a data consistency check. Here is the definition:
SELECT a.Brand,
       a.[Call Center],
       c.date,
       c.weekday,
       COUNT(*) vol,
       CAST((COUNT(*) - g.vol) AS real) / g.vol * 100 vol_diff,
       SUM(abncalls + acdcalls) calls,
       CASE
           WHEN g.calls <> 0 THEN CAST((SUM(abncalls + acdcalls) - g.calls) AS real) / g.calls * 100
           ELSE CASE
                    WHEN SUM(abncalls + acdcalls) <> 0 THEN 100
                    ELSE 0
                END
       END calls_diff
FROM dsplit_base a
JOIN calendar c ON a.ROW_DATE = c.date
JOIN
    (SELECT t.Brand,
            t.[Call Center],
            c.weekday,
            AVG(CAST(vol AS bigint)) vol,
            AVG(CAST(calls AS bigint)) calls
     FROM
         (SELECT Brand,
                 [Call Center],
                 row_date,
                 COUNT(*) vol,
                 SUM(abncalls + acdcalls) calls
          FROM dsplit_base
          GROUP BY ROW_DATE,
                   [Call Center],
                   Brand) t
     JOIN calendar c ON t.row_date = c.date
     GROUP BY c.weekday,
              t.[Call Center],
              t.Brand) g ON c.weekday = g.weekday
    AND a.Brand = g.Brand
    AND a.[Call Center] = g.[Call Center]
GROUP BY c.date,
         c.weekday,
         g.vol,
         g.calls,
         a.[Call Center],
         a.Brand
The following query yields around 16000 rows in 1-3 seconds:
select * from calls_check
Brand  Call Center  date        weekday   vol  vol_diff  calls  calls_diff
LMN    Munich       2008-01-24  Thursday  3    -25       470    8.796296
LMN    Munich       2008-04-26  Saturday  3    0         352    51.72414
...
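To make the two diff columns concrete, here is a small arithmetic check of the first sample row. The weekday averages used (4 for vol, 432 for calls) are assumptions back-calculated from that row, not values read from the real data; the expressions themselves are copied from the view.

```sql
-- Assumed Thursday averages for LMN / Munich (inferred from the sample row,
-- NOT taken from the actual tables).
DECLARE @avg_vol bigint = 4, @avg_calls bigint = 432;

-- Same percentage-difference expressions as in calls_check,
-- applied to vol = 3 and calls = 470 from the first sample row.
SELECT CAST((3 - @avg_vol) AS real) / @avg_vol * 100       AS vol_diff,
       CAST((470 - @avg_calls) AS real) / @avg_calls * 100 AS calls_diff;
```

Under those assumed averages the result should agree with the vol_diff and calls_diff values shown for 2008-01-24, i.e. each diff is the percentage deviation of that day from the average for the same weekday.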
The actual problem appears when I try to pull results for a limited period of time. With the following where clause added, the query does not finish (at least not within ~10 minutes):
select * from calls_check
where date >= DATEADD(d, -8, sysdatetime())
And, what is maybe even weirder, this query executes successfully in a second!
select * from calls_check
where date < DATEADD(d, -8, sysdatetime())
Can anybody tell me why the comparison operator in the where clause makes such a difference? Why does < slice the result set so efficiently while >= makes the query unresponsive?
Some additional info:
The dsplit_base view is a union of 4 tables (with joins). Here are their row counts:
dsplit_DE - 2521
dsplit_WNS - 7243
dsplit_US - 121451
partners - 377841 (166043)
The 'partners' table has 377841 rows, but only 166043 of them end up in the view, because the view takes rows on this condition:
from partners p join splitdim s
ON p.[Skill Name]=s.SPLITNAME and (p.Date>=s.[start_date] or s.[start_date] is null) and (p.DATE<=s.[end_date] or s.[end_date] is null)
where s.[Call center] IN ('Sitel', 'TRX', 'Sellbytel')
OR (s.[Call center]='WNS' and p.Date<(select MIN(row_Date) from dsplit_WNS))
OR (s.[Call Center]='Munich' and (p.Date<'2012-06-29' or p.Date between '2012-08-01' and '2012-08-27'))
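To illustrate how that join condition maps a partners row onto a splitdim period with open-ended bounds, here is a minimal sketch. The table variables and all sample rows are hypothetical; only the ON clause mirrors the view.

```sql
-- Hypothetical stand-ins for partners and splitdim (made-up rows).
DECLARE @partners TABLE ([Skill Name] varchar(20), [Date] date);
DECLARE @splitdim TABLE (SPLITNAME varchar(20), [Call center] varchar(20),
                         start_date date, end_date date);

INSERT INTO @partners VALUES ('Sales', '2012-06-15'), ('Sales', '2012-07-15');
INSERT INTO @splitdim VALUES
    ('Sales', 'Sitel', NULL,         '2012-06-30'),  -- open-ended start
    ('Sales', 'TRX',   '2012-07-01', NULL);          -- open-ended end

-- Same range predicate as in the view: a NULL bound means "no limit".
SELECT p.[Skill Name], p.[Date], s.[Call center]
FROM @partners p
JOIN @splitdim s
  ON p.[Skill Name] = s.SPLITNAME
 AND (p.[Date] >= s.start_date OR s.start_date IS NULL)
 AND (p.[Date] <= s.end_date   OR s.end_date   IS NULL);
```

With these made-up rows each partners row matches exactly one period: the June date falls into the Sitel period and the July date into the TRX period.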
I experimented with modified view definitions and found that:
with only dsplit_DE and/or dsplit_WNS in the view, both queries run fast (1-2 seconds)
with partners only, the '>=' query took ~30 s; with dsplit_US only, it took ~60 s
here is the actual execution plan of the latter EXEC PLAN
The last two tables are much bigger than the others, yet with a few hundred thousand records it should not take so long. What causes the difference in execution time depending on whether '<' or '>=' is used in the where clause?