
There are two tables: empl with 545,405 rows and pam with 1,466,320 rows. The task is to find the count of pID for each aID. To do that, I wrote the following query.

Select pa.aID, count(pa.pID) from 
empl join pam pa
ON empl.pID = pa.pID
Group by pa.aID

The indexes on pam are as follows:

IX_pam_Unique   nonclustered, unique, unique key located on PRIMARY     pID, aID
IX_pam_aID      nonclustered located on PRIMARY                         aID
PK_paID         clustered, unique, primary key located on PRIMARY       paID

The actual execution plan shows an index scan:

(two execution plan screenshots)

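What I can make out is that the estimated data size is 15 MB, and that is what causes the problem.

Is there a way to tune this complex count query for a large volume of data?

Edit:

The query with the filters on empl: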

Select pa.aID, count(pa.pID) from 
empl join pam pa
ON empl.pID = pa.pID
where 
empl.del = 0 AND 
empl.pub = 1 AND 
empl.sID = 2 AND 
empl.md = 0          
Group by pa.aID

There is nothing fancy in the structure; only the basic data types int, bit, varchar, and datetime are used. empl has 65 columns and pam has 4.


3 Answers


Maybe this will help you:

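-- Caution: the derived table exposes only aID and cnt, so the join
-- condition below cannot reference pa.pid as written.
SELECT pa.*
FROM empl
JOIN (
    SELECT 
          pa.aID
        , cnt = COUNT(pa.pid)
    FROM pam pa 
    GROUP BY pa.aID
) pa ON empl.pid = pa.pid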

Or this:

SELECT pa.aID, COUNT(pa.pid)
FROM pam pa
WHERE EXISTS(
    SELECT 1 
    FROM empl 
    WHERE empl.pid = pa.pid
)
GROUP BY pa.aID

Or even this, which matches the original query only if every pID in pam also exists in empl:

SELECT 
      pa.aID
    , cnt = COUNT(pa.pid)
FROM pam pa 
GROUP BY pa.aID
answered 2013-08-20T07:24:55.030

Keep the query as it is and add an index on empl that contains only the del, pub, sID, md, and pID columns. Make sure pID is the last column in the index.
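As a rough sketch of that suggestion, the index could be created along the lines below. The index name IX_empl_filter_pID is made up for illustration, and the relative order of the four filter columns is a guess; only the trailing position of pID comes from the advice above. The dbo schema is assumed.

```sql
-- Hypothetical index name; dbo schema assumed.
-- The four equality-filtered columns lead the key and pID comes last,
-- so the WHERE clause can seek on (del, pub, sID, md) and the join
-- column pID is read from the same narrow index.
CREATE NONCLUSTERED INDEX IX_empl_filter_pID
    ON dbo.empl (del, pub, sID, md, pID);
```

With all five columns in the key, the filtered query should not need to read the 65-column base table for empl. Whether the optimizer actually chooses this index is worth confirming in the actual execution plan.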

Edit: an alternative query to try might be

SELECT DISTINCT pa.aID, COUNT(pa.pID) OVER (PARTITION BY pa.aID) AS cnt
FROM empl JOIN pam pa
ON empl.pID = pa.pID
WHERE
empl.del = 0 AND 
empl.pub = 1 AND 
empl.sID = 2 AND 
empl.md = 0

Note that this one does not need GROUP BY. I am not sure how fast or slow it will be, but the query plan will differ from the GROUP BY version.
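One way to settle the fast-or-slow question is to run both versions in the same session with I/O and timing statistics switched on. The following is only a sketch built from the queries already shown in this thread; the cnt alias on the first query is added here for readability.

```sql
-- Report logical reads and CPU/elapsed time for each statement
-- executed in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Version 1: GROUP BY
SELECT pa.aID, COUNT(pa.pID) AS cnt
FROM empl
JOIN pam pa ON empl.pID = pa.pID
WHERE empl.del = 0 AND empl.pub = 1 AND empl.sID = 2 AND empl.md = 0
GROUP BY pa.aID;

-- Version 2: windowed count with DISTINCT
SELECT DISTINCT pa.aID, COUNT(pa.pID) OVER (PARTITION BY pa.aID) AS cnt
FROM empl
JOIN pam pa ON empl.pID = pa.pID
WHERE empl.del = 0 AND empl.pub = 1 AND empl.sID = 2 AND empl.md = 0;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The per-table logical reads and the CPU and elapsed times are written to the messages output, which makes the two plans directly comparable on the real data.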

Edit: you are right. I have added DISTINCT.

answered 2013-08-20T08:29:32.013

For static conditions, you can try an indexed view:

create view vPamCnt
with schemabinding
as
Select pa.aID, count_big(*) cnt
from dbo.pam pa
    join dbo.empl empl ON empl.pID = pa.pID
where 
    empl.del = 0 AND
    empl.pub = 1 AND
    empl.sID = 2 AND
    empl.md = 0
Group by pa.aID
GO
create unique clustered index CI_vPamCnt on vPamCnt (aID)
GO

And change your query to:

select aID, cast(cnt as int)
from vPamCnt
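One caveat, offered as a sketch rather than a tested recipe: depending on the SQL Server edition, the optimizer may expand the view back into its underlying join instead of reading the view's clustered index. If the plan still shows the join, the NOEXPAND table hint asks for the view's index to be used directly. The cnt alias is added here for illustration.

```sql
-- NOEXPAND tells the optimizer to treat the indexed view like a table
-- and read its clustered index (CI_vPamCnt) rather than expanding the
-- view definition into the pam/empl join.
SELECT aID, CAST(cnt AS int) AS cnt
FROM vPamCnt WITH (NOEXPAND);
```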
answered 2013-08-21T11:48:00.230