SELECT 
business_period,
SUM(transaction.transaction_value) AS total_transaction_value,
SUM(transaction.loss_value) AS total_loss_value,
(total_transaction_value - total_loss_value) AS net_value
FROM transaction
GROUP BY business_period

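The above doesn't work, because total_transaction_value and total_loss_value are not columns of the transaction table. Is there a way to make this query work?

Note: this query touches 500 million rows, so it needs to be efficient.

Question:
Some answers suggest that SUM(transaction.transaction_value) - SUM(transaction.loss_value) is cached and doesn't need to be computed again, while others suggest I should use a derived table/subquery to avoid the repeated computation. Can someone point to something that would settle this difference of opinion?

I'm using postgres 9.3.

Answer:

I'd like to quote erwin's comment here: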

I ran a quick test with 40k rows and the winner was the plain version without subquery. CTE was slowest. So I think my first assumption was wrong and the query planner understands not to calculate the sums repeatedly (makes sense, too). I have seen different results with more complex expressions in the past. The planner does get smarter with every new version
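
A 40k-row test may not carry over to 500 million rows, so one way to settle it for a given table is to compare the plans and timings of both variants directly. The following is only a sketch: it assumes the table and column names from the question, and since EXPLAIN ANALYZE actually executes the statement, it is probably best tried on a sample of the data first.

```sql
-- Variant 1: repeat the aggregates in the SELECT list.
EXPLAIN (ANALYZE, BUFFERS)
SELECT business_period
     , SUM(transaction_value) AS total_transaction_value
     , SUM(loss_value)        AS total_loss_value
     , SUM(transaction_value) - SUM(loss_value) AS net_value
FROM   transaction
GROUP  BY business_period;

-- Variant 2: aggregate once in a derived table, subtract in the outer query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *, total_transaction_value - total_loss_value AS net_value
FROM  (
   SELECT business_period
        , SUM(transaction_value) AS total_transaction_value
        , SUM(loss_value)        AS total_loss_value
   FROM   transaction
   GROUP  BY business_period
   ) sub;
```

Running each statement a few times and comparing the reported execution times and plan shapes should show whether the two forms differ in practice on this data.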


4 Answers


Use:

SELECT 
business_period,
SUM(transaction.transaction_value) AS total_transaction_value,
SUM(transaction.loss_value) AS total_loss_value,
(SUM(transaction.transaction_value) - SUM(transaction.loss_value)) AS net_value
FROM transaction
GROUP BY business_period
answered 2014-03-07T07:20:10.537

Use SUM again:

SELECT 
business_period,
SUM(transaction.transaction_value) AS total_transaction_value,
SUM(transaction.loss_value) AS total_loss_value,
(SUM(transaction.transaction_value) - SUM(transaction.loss_value)) AS net_value
FROM transaction
GROUP BY business_period
answered 2014-03-07T07:20:53.590

Just restate the SUMs explicitly (I believe they are only computed once):

SELECT 
  business_period,
  SUM(transaction.transaction_value) AS total_transaction_value,
  SUM(transaction.loss_value) AS total_loss_value,
  SUM(transaction.transaction_value) - SUM(transaction.loss_value) AS net_value
FROM transaction
GROUP BY business_period

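Alternatively, you can use a derived-table subquery, which should force the sums to be computed only once if the query above doesn't already do so implicitly, although there may be some extra overhead depending on what the optimizer makes of it: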

SELECT business_period,
  total_transaction_value,
  total_loss_value,
  (total_transaction_value - total_loss_value) AS net_value
FROM
(
    SELECT 
       business_period,
       SUM(transaction.transaction_value) AS total_transaction_value,
       SUM(transaction.loss_value) AS total_loss_value
    FROM transaction
    GROUP BY business_period
) x
answered 2014-03-07T07:20:53.610

Use a subquery to avoid the repeated computation:

SELECT *, total_transaction_value - total_loss_value AS net_value
FROM  (
   SELECT business_period
        , SUM(transaction_value) AS total_transaction_value
        , SUM(loss_value)        AS total_loss_value
   FROM   transaction
   GROUP  BY 1
   ) sub;

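Or use a CTE (common table expression) to actually enforce this, since in Postgres 9.3 a CTE acts as an optimization barrier. For simple cases like this one the subquery is generally faster: Postgres knows better when collapsing the subquery is faster.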
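
For comparison, the CTE form might look like the sketch below. The CTE name `sums` is made up for illustration; the table and column names are taken from the question.

```sql
-- Aggregate once inside the CTE, then subtract in the outer query.
WITH sums AS (
   SELECT business_period
        , SUM(transaction_value) AS total_transaction_value
        , SUM(loss_value)        AS total_loss_value
   FROM   transaction
   GROUP  BY business_period
   )
SELECT *, total_transaction_value - total_loss_value AS net_value
FROM   sums;
```

Since only the aggregated rows (one per business_period) pass through the CTE, any extra cost relative to the subquery is expected to be small here, but that is worth measuring rather than assuming.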

answered 2014-03-07T07:23:02.517