sql - Optimizing a SQL subquery that contains multiple inner joins and aggregate functions

Question

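I have a select statement that is actually a subquery inside a larger select statement that is built programmatically. The problem is that whenever I choose to include this subquery, it becomes a bottleneck and the whole query becomes very slow.

Sample data looks like this: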

Payment
.Receipt_no|.Person |.Payment_date|.Type|.Reversed| 
          2|John    |01/02/2001   |PA   |         |
          1|John    |01/02/2001   |GX   |         |
          3|David   |15/04/2003   |PA   |         |
          6|Mike    |26/07/2002   |PA   |R        |
          5|John    |01/01/2001   |PA   |         |
          4|Mike    |13/05/2000   |GX   |         |
          8|Mike    |27/11/2004   |PA   |         |
          7|David   |05/12/2003   |PA   |R        |
          9|David   |15/04/2003   |PA   |         |
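
To make the sample data above reproducible, here is a minimal setup sketch. The column types are assumptions, the blank .Reversed values are assumed to be NULL, and the amount column (referenced by the queries but not shown in the sample) is assumed to be a nullable decimal. Dates are read as dd/mm/yyyy.

```sql
-- Hypothetical schema: the types and the Amount column are assumptions, not taken from the post.
CREATE TABLE Payment (
    Receipt_no   INT           NOT NULL PRIMARY KEY,
    Person       VARCHAR(50)   NOT NULL,
    Payment_date DATETIME      NOT NULL,
    Type         CHAR(2)       NOT NULL,
    Reversed     CHAR(1)       NULL,      -- blank in the sample is assumed to mean NULL
    Amount       DECIMAL(10,2) NULL       -- not shown in the sample data
);

-- One INSERT per row so the script also runs on older SQL Server versions.
-- Date literals use the unambiguous 'YYYYMMDD' form.
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (2, 'John',  '20010201', 'PA', NULL);
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (1, 'John',  '20010201', 'GX', NULL);
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (3, 'David', '20030415', 'PA', NULL);
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (6, 'Mike',  '20020726', 'PA', 'R');
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (5, 'John',  '20010101', 'PA', NULL);
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (4, 'Mike',  '20000513', 'GX', NULL);
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (8, 'Mike',  '20041127', 'PA', NULL);
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (7, 'David', '20031205', 'PA', 'R');
INSERT INTO Payment (Receipt_no, Person, Payment_date, Type, Reversed) VALUES (9, 'David', '20030415', 'PA', NULL);
```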

The subquery is as follows:

select Payment.Person, 
Payment.amount 
from Payment
inner join (Select min([min_Receipt].Person) 'Person',
   min([min_Receipt].Receipt_no) 'Receipt_no' 
   from Payment [min_Receipt] 
   inner join (select min(Person) 'Person', 
      min(Payment_date) 'Payment_date' 
      from Payment
      where Payment.reversed != 'R' and Payment.Type != 'GX' 
      group by Payment.Person) [min_date] 
   on [min_date].Person= [min_Receipt].Person and [min_date].Payment_date = [min_Receipt].Payment_date 
   where [min_Receipt].reversed != 'R' and [min_Receipt].Type != 'GX' 
   group by [min_Receipt].Person) [1stPayment] 
on [1stPayment].Receipt_no = Payment.Receipt_no

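This retrieves the first payment for each person, ordered by .Payment_date (ascending) and then .Receipt_no (ascending), where .Type is not 'GX' and .Reversed is not 'R'. The result looks like this: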

Payment
.Receipt_No|.Person|.Payment_date
          5|John   |01/01/2001
          3|David  |15/04/2003
          8|Mike   |27/11/2004

Following Ahmed's post -

From the following results

(3|David  |15/04/2003) 
and (9|David  |15/04/2003)

I only want the record with the lowest receipt_no. So

(3|David  |15/04/2003)

So I added the aggregate function 'min(Payment.receipt_no)', grouped by person.

Query 1.

select min(a.Person) 'Person',
    min(a.receipt_no) 'receipt_no'
from
   Payment a
where
  a.type<>'GX' and (a.reversed not in ('R') or a.reversed is null)
and a.payment_date = 
  (select min(payment_date) from Payment i 
  where i.Person=a.Person and i.type <> 'GX' 
  and (i.reversed not in ('R') or i.reversed is null))
group by a.Person

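I added this as a subquery in the larger query, but it still ran slowly. So I tried rewriting the query while avoiding aggregate functions, and came up with the following.

Query 2.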

SELECT
    receipt_no,
    person,
    payment_date,
    amount
FROM
    payment a
WHERE 
    receipt_no IN 
    (SELECT 
       top 1 i.receipt_no 
    FROM 
        payment i 
    WHERE 
        (i.reversed NOT IN ('R') OR i.reversed IS NULL) 
        AND i.type<>'GX' 
        AND i.person = a.person 
    ORDER BY i.payment_date DESC, i.receipt_no ASC)

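I would not necessarily expect it to be more efficient. In fact, if I run the two queries side by side on a larger dataset, Query 1 completes in milliseconds while Query 2 takes several seconds.

However, if I then add them as subqueries to the larger query, the larger query takes hours to complete with Query 1, but completes in 40 seconds with Query 2.

I can only attribute this to the use of aggregate functions in one and not the other.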

score 1 · Accepted Answer

How do you distinguish between the payments

    (3|David  |15/04/2003) 
and (9|David  |15/04/2003)

Both were made by the same person. Unless the times differ, this query should work fine:

select 
    receipt_no,
    person,
    payment_date
from
    payment a
where
    type<>'GX' and (reversed not in ('R') or reversed is null)

  and payment_date = 
     (select min(payment_date) from payment i 
      where i.person=a.person and i.type <> 'GX' 
      and (i.reversed not in ('R') or i.reversed is null))
order by person,payment_date desc

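I have set up and tested this query on SQLFiddle, but I am not sure about its performance, since I do not have the volume of data that you have. So check it and let me know.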

===

SQL Fiddle demo for the problem above

score 0 · Accepted Answer

Following the comments on CodeReview -

I also rewrote the query using the Rank() function, as suggested.

Query 3.

left join 
    (select 
        a.Person, 
        a.amount,
        (rank () over (Partition by a.Person order by a.payment_date desc, a.receipt_no desc)) 'Ranked' 
    from 
        Payment a
    Where 
        (a.reversed not in ('R') or a.reversed is null) 
        and a.type != 'GX'
    ) [lastPayment]  
on 
    [lastPayment].Person = [Person].Person 
    and [lastPayment].ranked = 1

This approach also speeds up the larger query, which now takes about 28 seconds.

However, Rank() is only supported from SQL Server 2005 onward.
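
Since Query 3 is only a join fragment of the larger query, here is a standalone sketch of the same ranking idea that can be run on its own. It is not the poster's exact query: it sorts ascending so that rank 1 is the first payment described in the question (Query 3 sorts descending), and the derived-table alias r is made up for illustration.

```sql
-- Rank each person's qualifying payments and keep only the top-ranked row.
SELECT
    r.Receipt_no,
    r.Person,
    r.Payment_date
FROM
    (SELECT
        a.Receipt_no,
        a.Person,
        a.Payment_date,
        -- Receipt_no breaks ties between payments made on the same date.
        RANK() OVER (PARTITION BY a.Person
                     ORDER BY a.Payment_date ASC, a.Receipt_no ASC) AS Ranked
    FROM
        Payment a
    WHERE
        (a.Reversed <> 'R' OR a.Reversed IS NULL)  -- skip reversed payments, keep NULLs
        AND a.Type <> 'GX'
    ) r
WHERE
    r.Ranked = 1;
```

On the sample data from the question, this should return receipts 5 (John), 3 (David) and 8 (Mike), the same rows as the expected result shown there.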
