5
SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade INNER JOIN
             TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE  (TradeLine.Id IN
                      (SELECT     PairOffId
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)))
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate

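I'm worried about the performance of the IN in the WHERE clause as the tables start to grow. Does anyone have a better strategy for this kind of query? The number of records returned by the subquery grows much more slowly than the number of records in the TradeLine table. The TradeLine table itself grows at a rate of about 10 rows per day.

Thank you.

EDIT: I used the idea of moving the subquery from the WHERE to the FROM. I voted up every answer that contributed to this new query.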

   SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,   
          PairOff.Notional / 1000 AS Expr1
   FROM         Trade INNER JOIN
                  TradeLine ON Trade.TradeId = TradeLine.TradeId INNER JOIN
                      (SELECT     PairOffId, SUM(Notional) AS Notional
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                   HAVING (SUM(Notional) <> 0)) AS PairOff ON TradeLine.Id = PairOff.PairOffId
   ORDER BY Trade.Type, Trade.TradeDate

5 Answers

6

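The subquery in the IN clause does not depend on anything in the outer query. You can safely move it into the FROM clause; a sane query plan builder would do that automatically.

Also, calling EXPLAIN PLAN on any query you are about to use in production is a must. Do it and see what the DBMS thinks of the plan for this query.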
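The exact command depends on the DBMS, which the question does not name. As a rough sketch, assuming SQL Server (the query looks like SQL Server designer output, but that is a guess), the estimated plan can be requested with `SET SHOWPLAN_TEXT`. On Oracle the equivalent is `EXPLAIN PLAN FOR ...`, and on PostgreSQL or MySQL it is `EXPLAIN ...` in front of the query.

```sql
-- Assumption: SQL Server. SET SHOWPLAN_TEXT has to be the only statement in
-- its batch, hence the GO separators (GO is a batch separator understood by
-- SSMS and sqlcmd, not a T-SQL statement).
SET SHOWPLAN_TEXT ON;
GO

-- While SHOWPLAN_TEXT is ON the query is not executed;
-- the server returns the estimated plan as text instead.
SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade
       INNER JOIN TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE  TradeLine.Id IN (SELECT PairOffId
                        FROM   TradeLine AS TradeLine_1
                        WHERE  TradeDate <= '2011-05-11'
                        GROUP BY PairOffId
                        HAVING SUM(Notional) <> 0)
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Running the same steps for the IN version and for the version with the subquery in the FROM clause shows whether the optimizer already treats them the same way.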

answered 2011-05-11T20:54:24.580
2

I'm a fan of temp tables when a subquery starts returning too large a result set.

So your WHERE clause would just be

Where TradeLine.Id In (Select PairOffId From #tempResults)

and #tempResults would be defined as (warning: the syntax is from memory, so there may be errors)

Select PairOffId Into #tempResults
From TradeLine
Where (TradeDate <= @TradeDate)
  -- I prefer params in case the query becomes a stored procedure
Group By PairOffId
Having (Sum(Notional) <> 0)
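Put together, the two fragments above might look like the following. This is only a sketch: it assumes SQL Server (the `#tempResults` name implies a T-SQL temp table), it assumes TradeLine has its own TradeDate column as the question's subquery suggests, and the `datetime` type chosen for `@TradeDate` is a guess.

```sql
-- Parameter as suggested above; the datetime type is an assumption.
DECLARE @TradeDate datetime;
SET @TradeDate = '2011-05-11';

-- 1. Materialize the pair-offs that have not netted to zero.
SELECT PairOffId
INTO   #tempResults
FROM   TradeLine
WHERE  TradeDate <= @TradeDate
GROUP BY PairOffId
HAVING SUM(Notional) <> 0;

-- 2. Run the question's query against the temp table.
SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade
       INNER JOIN TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE  TradeLine.Id IN (SELECT PairOffId FROM #tempResults)
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate;

-- 3. Clean up (the table would otherwise live until the session ends).
DROP TABLE #tempResults;
```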
answered 2011-05-11T20:49:13.113
1

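I have 2 suggestions you can try:

1). Use EXISTS, since you don't need to get any data back from the subquery, like this:

where exists ( select 1 from TradeLine AS TradeLine_1 where TradeLine.Id = TradeLine_1.PairOffId /* continue with your subquery... */ )

2). Join the main query to your subquery, for example

... join (your_subquery) on your_subquery.PairOffId = TradeLine.Id

I believe both approaches can perform better than the IN operation.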
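For the first suggestion, a fuller version of the question's query might look like the sketch below. It assumes TradeLine has its own TradeDate column (the question's subquery references TradeDate unqualified, so the post does not confirm this). The second suggestion is essentially what the asker's edited query does.

```sql
SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade
       INNER JOIN TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE  EXISTS (SELECT 1
               FROM   TradeLine AS TradeLine_1
               -- the correlation below takes the place of "TradeLine.Id IN (...)"
               WHERE  TradeLine_1.PairOffId = TradeLine.Id
                 AND  TradeLine_1.TradeDate <= '2011-05-11'  -- assumed TradeLine column
               GROUP BY TradeLine_1.PairOffId
               HAVING SUM(TradeLine_1.Notional) <> 0)
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate;
```

If no TradeLine_1 rows match, the grouped subquery produces no rows and EXISTS is false, so the filter keeps the same rows as the IN version. Whether it is actually faster depends on the optimizer, so it is worth comparing the plans.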

answered 2011-05-11T21:10:00.213
1

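I ran into the same problem in an XXXXXX DB with hundreds of thousands of records. In my code I wanted to retrieve the hierarchy nodes (nodes that have at least one child) out of all the nodes.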


The query as initially written was very slow.

  SELECT SUPPLIER_ID, PARENT_SUPPLIER_ID
  FROM SUPPLIER
  WHERE 
    SUPPLIER_ID != PARENT_SUPPLIER_ID
    OR 
    SUPPLIER_ID   IN
      (SELECT DISTINCT PARENT_SUPPLIER_ID
       FROM SUPPLIER
       WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
      );

It was then rewritten as

  SELECT a.SUPPLIER_ID, a.PARENT_SUPPLIER_ID
  FROM SUPPLIER a
  LEFT JOIN
  (SELECT DISTINCT PARENT_SUPPLIER_ID
  FROM SUPPLIER
  WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
  ) b
  ON a.SUPPLIER_ID = b.PARENT_SUPPLIER_ID
  WHERE a.SUPPLIER_ID != a.PARENT_SUPPLIER_ID
     OR a.SUPPLIER_ID = b.PARENT_SUPPLIER_ID;
answered 2015-12-09T13:32:21.963
-1

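Using IN will essentially force you into a table scan. As your table grows, your execution time grows too. Also, you're running that query for every record returned. It would be easier to use the scalar select as a table: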

SELECT t.TradeId, t.Type, t.Symbol, t.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade t,
(SELECT     TradeId, PairOffID
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)) tl       
WHERE  t.TradeId = tl.TradeId
  and  t.id <> tl.PairOffID
GROUP BY t.TradeId, t.Type, t.Symbol, t.TradeDate
ORDER BY t.Type, t.TradeDate
answered 2011-05-11T20:54:10.353