I have two tables:
Table A: 1MM rows; columns AsOfDate, Id, BId (foreign key to table B)
Table B: 50k rows; columns Id, BId, Flag, ValidFrom, ValidTo
Table A contains multiple records per day between 2011/01/01 and 2011/12/31 across 100 BIds. Table B contains multiple records for the same 100 BIds, and for a given BId their ValidFrom..ValidTo intervals do not overlap.
The task of the join will be to return the flag that was active for the BId on the given AsOfDate.
select
    a.AsOfDate, b.Flag
from
    A a inner join B b on
        a.BId = b.BId and b.ValidFrom <= a.AsOfDate and b.ValidTo >= a.AsOfDate
where
    a.AsOfDate >= 20110101 and a.AsOfDate <= 20111231
This query takes ~70 seconds on a very high-end server (3+ GHz) with 64 GB of memory.
I have tried indexes on every combination of fields while testing this, to no avail.
Indexes on A: a.AsOfDate, a.AsOfDate+a.BId, a.BId
Indexes on B: b.BId, b.BId+b.ValidFrom
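Spelled out as DDL, the indexes listed above would look roughly like the following. This is only a sketch: the index names are made up for illustration, and the "+" entries are assumed to mean two-column composite indexes.

```sql
-- Sketch only: index names are hypothetical; columns follow the list above.
-- "x+y" in the list is assumed to mean a composite index on (x, y).
CREATE INDEX a_asofdate_idx      ON A (AsOfDate);
CREATE INDEX a_asofdate_bid_idx  ON A (AsOfDate, BId);
CREATE INDEX a_bid_idx           ON A (BId);

CREATE INDEX b_bid_idx           ON B (BId);
CREATE INDEX b_bid_validfrom_idx ON B (BId, ValidFrom);
```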
I also tried the range queries suggested below (62 seconds).
The same query on the free version of SQL Server running in a VM takes ~1 second to complete.
Any ideas?
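For readers unfamiliar with the approach, one form a range query can take on 9.2 is sketched below. It assumes the date columns are integers (as the 20110101 literals suggest) and that ValidFrom and ValidTo are both inclusive; it is not necessarily the exact statement behind the 62-second timing, and the index name is hypothetical.

```sql
-- Assumption: ValidFrom, ValidTo and AsOfDate are integer columns,
-- and both interval bounds are inclusive ('[]').
-- Hypothetical expression index so the containment test can use GiST.
CREATE INDEX b_valid_range_idx
    ON B USING gist (int4range(ValidFrom, ValidTo, '[]'));

-- Same join, with the two comparisons replaced by range containment (@>).
SELECT a.AsOfDate, b.Flag
FROM A a
JOIN B b
  ON a.BId = b.BId
 AND int4range(b.ValidFrom, b.ValidTo, '[]') @> a.AsOfDate
WHERE a.AsOfDate BETWEEN 20110101 AND 20111231;
```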
Postgres 9.2
Query Plan
QUERY PLAN
---------------------------------------------------------------------------------------
 Aggregate  (cost=8274298.83..8274298.84 rows=1 width=0)
   ->  Hash Join  (cost=1692.25..8137039.36 rows=54903787 width=0)
         Hash Cond: (a.bid = b.bid)
         Join Filter: ((b.validfrom <= a.asofdate) AND (b.validto >= a.asofdate))
         ->  Seq Scan on "A" a  (cost=0.00..37727.00 rows=986467 width=12)
               Filter: ((asofdate > 20110101) AND (asofdate < 20111231))
         ->  Hash  (cost=821.00..821.00 rows=50100 width=12)
               ->  Seq Scan on "B" b  (cost=0.00..821.00 rows=50100 width=12)
See http://explain.depesz.com/s/1c5 for the EXPLAIN ANALYZE output.
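Note that the plan's top-level Aggregate node and its strict `>` / `<` filter suggest it came from a slightly different statement than the select shown earlier, most likely a count over the same join. A sketch of how such output could be reproduced, assuming a `count(*)` wrapper and the strict bounds shown in the plan:

```sql
-- Assumption: the planned statement was a count(*) over the same join,
-- with the strict date bounds that appear in the plan's Filter line.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM A a
JOIN B b
  ON a.BId = b.BId
 AND b.ValidFrom <= a.AsOfDate
 AND b.ValidTo   >= a.AsOfDate
WHERE a.AsOfDate > 20110101
  AND a.AsOfDate < 20111231;
```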