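Right, so I have this unwieldy query that needs optimizing. I have trimmed it down a lot to make it more readable while still getting the point across.
Basically, I see the same GROUP BY logic in the top-level query and in all three subqueries, and those same columns are also the arguments of the INNER JOIN conditions. The problem is that I'm not sure how to optimize it, although I imagine there must be a simpler way to get the same result.
Tables involved: Invoice, InvoiceLine, ProductType
InvoiceLine is related to both Invoice and ProductType through foreign keys.
The query should compare the sum of InvoiceLine.Clicks for a single invoice with the sum of InvoiceLine.Clicks for all other invoices, grouped by ProductType.Name and InvoiceLine.Origin, and additionally split by Invoice.Final. So the result should look like this:
ProductType | Origin | Clicks for reference invoice | Clicks for all other final invoices | Clicks for all other non-final invoices
Let me stress that the query does work, but it is far too slow. Any tips on optimizing this thing?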
DECLARE @startDate datetime;
DECLARE @endDate datetime;
DECLARE @refInvoiceGuid uniqueidentifier;
SET @startDate='2013-09-01 00:00:00';
SET @endDate='2013-09-30 23:59:59';
SET @refInvoiceGuid='34d03903-a2ad-49ae-bd72-e98b47cdbc52';
SELECT
    ProductType.Name,
    InvoiceLine.Origin,
    invRef.ClicksRef,
    invFinal.ClicksFinal,
    invNotFinal.ClicksNotFinal
FROM InvoiceLine
INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
INNER JOIN (
    SELECT
        ProductType.Name AS ProductName,
        InvoiceLine.Origin AS Origin,
        SUM(InvoiceLine.Clicks) AS ClicksRef
    FROM InvoiceLine
    INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
    INNER JOIN Invoice ON Invoice.Id = InvoiceLine.Invoice_Ref
    WHERE
        InvoiceLine.BillingDate >= @startDate
        AND InvoiceLine.BillingDate <= @endDate
        AND Invoice.Guid = @refInvoiceGuid
    GROUP BY
        ProductType.Name, InvoiceLine.Origin
) invRef ON ProductType.Name = invRef.ProductName AND InvoiceLine.Origin = invRef.Origin
INNER JOIN (
    SELECT
        ProductType.Name AS ProductName,
        InvoiceLine.Origin AS Origin,
        SUM(InvoiceLine.Clicks) AS ClicksFinal
    FROM InvoiceLine
    INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
    INNER JOIN Invoice ON Invoice.Id = InvoiceLine.Invoice_Ref AND Invoice.Final = 1
    WHERE
        InvoiceLine.BillingDate >= @startDate
        AND InvoiceLine.BillingDate <= @endDate
        AND Invoice.Guid != @refInvoiceGuid
    GROUP BY
        ProductType.Name, InvoiceLine.Origin
) invFinal ON ProductType.Name = invFinal.ProductName AND InvoiceLine.Origin = invFinal.Origin
INNER JOIN (
    SELECT
        ProductType.Name AS ProductName,
        InvoiceLine.Origin AS Origin,
        SUM(InvoiceLine.Clicks) AS ClicksNotFinal
    FROM InvoiceLine
    INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
    INNER JOIN Invoice ON Invoice.Id = InvoiceLine.Invoice_Ref AND Invoice.Final = 0
    WHERE
        InvoiceLine.BillingDate >= @startDate
        AND InvoiceLine.BillingDate <= @endDate
        AND Invoice.Guid != @refInvoiceGuid
    GROUP BY
        ProductType.Name, InvoiceLine.Origin
) invNotFinal ON ProductType.Name = invNotFinal.ProductName AND InvoiceLine.Origin = invNotFinal.Origin
WHERE
    InvoiceLine.BillingDate >= @startDate
    AND InvoiceLine.BillingDate <= @endDate
GROUP BY
    ProductType.Name,
    InvoiceLine.Origin,
    invRef.ClicksRef,
    invFinal.ClicksFinal,
    invNotFinal.ClicksNotFinal
Update 1
I added an index:
CREATE NONCLUSTERED INDEX [IX_ProductOrigin] ON InvoiceLine (Invoice_Ref,BillingDate) INCLUDE (Origin,Clicks,ProductType_Ref);
I have also rewritten the query to be more compact (apparently with the same performance):
SELECT
    ProductType.Name,
    InvoiceLine.Origin,
    SUM(CASE WHEN Invoice.Guid = @refInvoiceGuid THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksRef,
    SUM(CASE WHEN Invoice.Guid <> @refInvoiceGuid AND Invoice.Final = 1 THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksFinal,
    SUM(CASE WHEN Invoice.Guid <> @refInvoiceGuid AND Invoice.Final = 0 THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksNotFinal
FROM InvoiceLine
INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
INNER JOIN Invoice ON Invoice.Id = InvoiceLine.Invoice_Ref
WHERE
    InvoiceLine.BillingDate >= @startDate
    AND InvoiceLine.BillingDate <= @endDate
GROUP BY
    ProductType.Name, InvoiceLine.Origin
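It seemed faster at first, and readability has certainly improved, but in practice it still takes 3 minutes to execute. Here are the statistics and the execution plan:
So, if I am reading the statistics correctly, is it just the index seek that takes so long? Any ideas on how to improve this? Or have I simply reached the point of having too much data?
Here are some statistics for the index:
And for the table:
Results of running 'set statistics io on':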
(170 row(s) affected)
Table 'ProductType'. Scan count 0, logical reads 340, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'InvoiceLine'. Scan count 2741, logical reads 37444, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Invoice'. Scan count 1, logical reads 115, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
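For anyone who wants to reproduce figures like these, here is a minimal sketch of how the I/O output can be collected together with CPU and elapsed time. It simply wraps the compact query from this update; `SET STATISTICS TIME ON` is an addition that was not part of the run shown above, and the parameter values are the ones from the top of the post.

```sql
-- Sketch only: wraps the compact query so that SQL Server reports
-- per-table reads (STATISTICS IO) and CPU/elapsed time (STATISTICS TIME).
-- STATISTICS TIME is an assumption added here; the output above used IO only.
DECLARE @startDate datetime;
DECLARE @endDate datetime;
DECLARE @refInvoiceGuid uniqueidentifier;
SET @startDate = '2013-09-01 00:00:00';
SET @endDate = '2013-09-30 23:59:59';
SET @refInvoiceGuid = '34d03903-a2ad-49ae-bd72-e98b47cdbc52';

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT
    ProductType.Name,
    InvoiceLine.Origin,
    SUM(CASE WHEN Invoice.Guid = @refInvoiceGuid THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksRef,
    SUM(CASE WHEN Invoice.Guid <> @refInvoiceGuid AND Invoice.Final = 1 THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksFinal,
    SUM(CASE WHEN Invoice.Guid <> @refInvoiceGuid AND Invoice.Final = 0 THEN InvoiceLine.Clicks ELSE 0 END) AS ClicksNotFinal
FROM InvoiceLine
INNER JOIN ProductType ON InvoiceLine.ProductType_Ref = ProductType.Id
INNER JOIN Invoice ON Invoice.Id = InvoiceLine.Invoice_Ref
WHERE
    InvoiceLine.BillingDate >= @startDate
    AND InvoiceLine.BillingDate <= @endDate
GROUP BY
    ProductType.Name, InvoiceLine.Origin;

-- Switch the session settings back off afterwards.
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The per-table lines appear in the Messages tab of the client, not in the result grid. The "Scan count" and "logical reads" values for InvoiceLine are the ones worth comparing between index variants.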
Update 2
After reversing the order of the index columns:
CREATE NONCLUSTERED INDEX [IX_ProductOrigin] ON InvoiceLine (BillingDate,Invoice_Ref) INCLUDE (Origin,Clicks,ProductType_Ref);
(170 row(s) affected)
Table 'ProductType'. Scan count 0, logical reads 340, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'InvoiceLine'. Scan count 1, logical reads 28371, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Invoice'. Scan count 1, logical reads 115, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
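Execution plan with index seek statistics:
This improved performance significantly (we went from 170 seconds to 15 seconds). Thanks for the help so far. Any further suggestions?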