
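I'm struggling to find a good way to compute a running total that includes a GROUP BY or equivalent. The cursor-based running total below works for the whole table, but I'd like to extend it to add a "client" dimension. So I would get running totals like the ones below, but for each company (i.e. Company A, Company B, Company C, etc.) in a single table.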

CREATE TABLE test (tag int,  Checks float, AVG_COST float, Check_total float,  Check_amount float, Amount_total float, RunningTotal_Check float,  
 RunningTotal_Amount float)

DECLARE @tag int,
        @Checks float,
        @AVG_COST float,
        @check_total float,
        @Check_amount float,
        @amount_total float,
        @RunningTotal_Check float ,
        @RunningTotal_Check_PCT float,
        @RunningTotal_Amount float



SET @RunningTotal_Check = 0
SET @RunningTotal_Check_PCT = 0
SET @RunningTotal_Amount = 0
DECLARE aa_cursor CURSOR fast_forward
FOR
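SELECT tag, Checks, AVG_COST, check_total, check_amount, amount_total
FROM test_3
ORDER BY tag -- assumption: tag defines the running-total order; without ORDER BY the row order is not guaranteed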

OPEN aa_cursor
FETCH NEXT FROM aa_cursor INTO @tag,  @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total
WHILE @@FETCH_STATUS = 0
 BEGIN
  -- Variable names use the declared casing so the batch also runs under a case-sensitive collation
  SET @RunningTotal_Check = @RunningTotal_Check + @Checks
  SET @RunningTotal_Amount = @RunningTotal_Amount + @Check_amount
  INSERT test VALUES (@tag, @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total,  @RunningTotal_check, @RunningTotal_Amount )
  FETCH NEXT FROM aa_cursor INTO @tag, @Checks, @AVG_COST, @check_total, @Check_amount, @amount_total
 END

CLOSE aa_cursor
DEALLOCATE aa_cursor

SELECT *, RunningTotal_Check/Check_total as CHECK_RUN_PCT, round((RunningTotal_Check/Check_total *100),0) as CHECK_PCT_BIN,  RunningTotal_Amount/Amount_total as Amount_RUN_PCT,  round((RunningTotal_Amount/Amount_total * 100),0) as Amount_PCT_BIN
into test_4
FROM test ORDER BY tag
create clustered index IX_TESTsdsdds3 on test_4(tag)

DROP TABLE test

----------------------------------

I can compute the running total for any one company, but I'd like to do this for multiple companies to produce something like the results below.

CLIENT  COUNT   Running Total
Company A   1   6.7%
Company A   2   20.0%
Company A   3   40.0%
Company A   4   66.7%
Company A   5   100.0%
Company B   1   3.6%
Company B   2   10.7%
Company B   3   21.4%
Company B   4   35.7%
Company B   5   53.6%
Company B   6   75.0%
Company B   7   100.0%
Company C   1   3.6%
Company C   2   10.7%
Company C   3   21.4%
Company C   4   35.7%
Company C   5   53.6%
Company C   6   75.0%
Company C   7   100.0%

3 Answers


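I originally started to post the SQL Server 2012 equivalent (since you didn't mention which version you are using). Steve has done a great job of showing how simple this calculation is in the newest version of SQL Server, so I'll focus on a few methods that work on earlier versions of SQL Server (back to 2005).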

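I'm going to take some liberties with your schema, since I can't figure out what all these #test, #test_3 and #test_4 temporary tables are supposed to represent. How about: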

USE tempdb;
GO

CREATE TABLE dbo.Checks
(
  Client VARCHAR(32),
  CheckDate DATETIME,
  Amount DECIMAL(12,2)
);

INSERT dbo.Checks(Client, CheckDate, Amount)
          SELECT 'Company A', '20120101', 50
UNION ALL SELECT 'Company A', '20120102', 75
UNION ALL SELECT 'Company A', '20120103', 120
UNION ALL SELECT 'Company A', '20120104', 40
UNION ALL SELECT 'Company B', '20120101', 75
UNION ALL SELECT 'Company B', '20120105', 200
UNION ALL SELECT 'Company B', '20120107', 90;

Expected output in this case:

Client    Count Running Total
--------- ----- -------------
Company A 1     17.54
Company A 2     43.86
Company A 3     85.96
Company A 4     100.00
Company B 1     20.55
Company B 2     75.34
Company B 3     100.00

One way:

;WITH gt(Client, Totals) AS 
(
  SELECT Client, SUM(Amount) 
    FROM dbo.Checks AS c
    GROUP BY Client
), n (Client, Amount, rn) AS
(
  SELECT c.Client, c.Amount, 
    ROW_NUMBER() OVER  (PARTITION BY c.Client ORDER BY c.CheckDate)
    FROM dbo.Checks AS c
)
SELECT n.Client, [Count] = n.rn, 
  [Running Total] = CONVERT(DECIMAL(5,2), 100.0*(
    SELECT SUM(Amount) FROM n AS n2 
    WHERE Client = n.Client AND rn <= n.rn)/gt.Totals
 )
 FROM n INNER JOIN gt ON n.Client = gt.Client
 ORDER BY n.Client, n.rn;

A slightly faster alternative, with more reads but a shorter duration and a simpler plan:

;WITH x(Client, CheckDate, rn, rt, gt) AS 
(
   SELECT Client, CheckDate, rn = ROW_NUMBER() OVER
   (PARTITION BY Client ORDER BY CheckDate),
    (SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client 
      AND CheckDate <= c.CheckDate),
    (SELECT SUM(Amount) FROM dbo.Checks WHERE Client = c.Client)
FROM dbo.Checks AS c
)
SELECT Client, [Count] = rn, 
  [Running Total] = CONVERT(DECIMAL(5,2), rt * 100.0/gt)
  FROM x
  ORDER BY Client, [Count];

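While I've offered set-based alternatives here, in my experience a cursor is often the fastest supported way to compute running totals. There are other methods, such as the quirky update, that perform marginally faster, but their results are not guaranteed. The set-based approaches that perform a self-join become more and more expensive as the source row count grows, so what seems to perform well when tested against a small table degrades as the table gets bigger.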
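For comparison, here is a minimal sketch of what a cursor-based version with the client dimension might look like. It is not code from the question or from the blog post mentioned below. It assumes the dbo.Checks table defined above, non-NULL Client and Amount values, and a non-zero total per client; the table variable @r and all local variable names are made up for illustration.

```sql
-- Sketch: cursor-based running percentage that resets for each client.
-- @r and the variable names are illustrative, not part of the original schema.
DECLARE @r TABLE
(
  Client VARCHAR(32),
  rn     INT,
  rt     DECIMAL(15,2)
);

DECLARE @Client VARCHAR(32), @Amount DECIMAL(12,2),
        @Prev   VARCHAR(32), @rn INT, @rt DECIMAL(15,2);

DECLARE c CURSOR LOCAL FAST_FORWARD FOR
  SELECT Client, Amount
    FROM dbo.Checks
    ORDER BY Client, CheckDate;  -- this order defines the running total

OPEN c;
FETCH NEXT FROM c INTO @Client, @Amount;

WHILE @@FETCH_STATUS = 0
BEGIN
  -- Reset the accumulators whenever a new client starts.
  IF @Prev IS NULL OR @Prev <> @Client
    SELECT @Prev = @Client, @rn = 0, @rt = 0;

  SELECT @rn = @rn + 1, @rt = @rt + @Amount;
  INSERT @r (Client, rn, rt) VALUES (@Client, @rn, @rt);

  FETCH NEXT FROM c INTO @Client, @Amount;
END

CLOSE c;
DEALLOCATE c;

-- Divide each running total by the client's grand total to get a percentage.
SELECT r.Client, [Count] = r.rn,
  [Running Total] = CONVERT(DECIMAL(5,2), r.rt * 100.0 / t.Total)
  FROM @r AS r
  INNER JOIN (SELECT Client, Total = SUM(Amount)
                FROM dbo.Checks GROUP BY Client) AS t
    ON r.Client = t.Client
  ORDER BY r.Client, r.rn;
```

The cursor makes a single ordered pass over the source rows, so each row is read once instead of being re-summed for every later row. Whether that actually beats the set-based queries depends on table size and indexing, so it is worth testing against your own data.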

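I have a blog post almost fully prepared that runs a slightly simpler performance comparison of various running totals approaches. It is simpler because it is not grouped and only shows the totals, not the running total percentage. I hope to publish this post soon and will try to remember to update this space.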

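There is another alternative to consider, one that doesn't require reading previous rows multiple times. It's a concept Hugo Kornelis describes as "set-based iteration." I don't recall where I first learned this technique, but it makes a lot of sense in some scenarios.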

DECLARE @c TABLE
(
 Client VARCHAR(32), 
 CheckDate DATETIME,
 Amount DECIMAL(12,2),
 rn INT,
 rt DECIMAL(15,2)
);

INSERT @c SELECT Client, CheckDate, Amount,
  ROW_NUMBER() OVER (PARTITION BY Client
 ORDER BY CheckDate), 0
 FROM dbo.Checks;

DECLARE @i INT, @m INT;
SELECT @i = 2, @m = MAX(rn) FROM @c;

UPDATE @c SET rt = Amount WHERE rn = 1;

WHILE @i <= @m
BEGIN
    UPDATE c SET c.rt = c2.rt + c.Amount
      FROM @c AS c
      INNER JOIN @c AS c2
      ON c.rn = c2.rn + 1
      AND c.Client = c2.Client
      WHERE c.rn = @i;

    SET @i = @i + 1;
END

SELECT Client, [Count] = rn, [Running Total] = CONVERT(
  DECIMAL(5,2), rt*100.0 / (SELECT TOP 1 rt FROM @c
 WHERE Client = c.Client ORDER BY rn DESC)) FROM @c AS c
 ORDER BY Client, rn; -- explicit ORDER BY so the output order is guaranteed

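While this does perform a loop, and everyone tells you that loops and cursors are bad, one gain with this method is that once the previous row's running total has been calculated, we only have to look at the previous row instead of summing all prior rows. The other gain is that in most cursor-based solutions you have to go through each client and then each check. In this case, you go through all clients' 1st checks once, then all clients' 2nd checks once, and so on. So instead of (client count * avg check count) iterations, we only do (max check count) iterations. With the sample data above, that is 3 passes of the loop (rn = 2 through 4, after the initial update for rn = 1) rather than one step per row. This solution doesn't make much sense for the simple running totals example, but for the grouped running totals example it should be tested against the set-based solutions above. It's not going to beat Steve's approach, though, if you are on SQL Server 2012.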

Update

I've blogged about various running totals approaches here:

http://www.sqlperformance.com/2012/07/t-sql-queries/running-totals

answered 2012-04-29T01:22:51.047

This is finally simple in SQL Server 2012, where SUM and COUNT support OVER clauses that contain ORDER BY. Using Cris's #Checks table definition:

SELECT
  CompanyID,
  count(*) over (
    partition by CompanyID
    order by Cleared, ID
  ) as cnt,
  str(100.0*sum(Amount) over (
    partition by CompanyID
    order by Cleared, ID
  )/
  sum(Amount) over (
    partition by CompanyID
  ),5,1)+'%' as RunningTotalForThisCompany
FROM #Checks;

SQL Fiddle here
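As a rough sketch, the same idea can be applied to the dbo.Checks table from the other answer (Client, CheckDate, Amount). When ORDER BY is given without a frame, the window defaults to RANGE UNBOUNDED PRECEDING, which treats rows with equal sort keys as peers; the version below spells out ROWS UNBOUNDED PRECEDING instead. It assumes SQL Server 2012 or later and that CheckDate is unique within each client, as it is in that sample data.

```sql
-- Sketch: windowed running percentage per client with an explicit ROWS frame.
-- Assumes dbo.Checks(Client, CheckDate, Amount) and unique CheckDate per client.
SELECT Client,
  [Count] = ROW_NUMBER() OVER (PARTITION BY Client ORDER BY CheckDate),
  [Running Total] = CONVERT(DECIMAL(5,2), 100.0 *
    SUM(Amount) OVER (PARTITION BY Client ORDER BY CheckDate
                      ROWS UNBOUNDED PRECEDING)
    / SUM(Amount) OVER (PARTITION BY Client))
  FROM dbo.Checks
  ORDER BY Client, [Count];
```

An explicit ROWS frame is often reported to be cheaper than the default RANGE frame, but that is worth confirming against your own data and plan.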

answered 2012-04-29T00:11:33.973

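I don't entirely understand the schema you're pulling from, but here is a quick query using a temp table that shows how to compute a running total in a set-based operation.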

CREATE TABLE #Checks
(
     ID int IDENTITY(1,1) PRIMARY KEY
    ,CompanyID int NOT NULL
    ,Amount float NOT NULL
    ,Cleared datetime NOT NULL
)

INSERT INTO #Checks
VALUES
     (1,5,'4/1/12')
    ,(1,5,'4/2/12')
    ,(1,7,'4/5/12')
    ,(2,10,'4/3/12')

SELECT Info.ID, Info.CompanyID, Info.Amount, RunningTotal.Total, Info.Cleared
FROM
(
SELECT main.ID, SUM(other.Amount) as Total
FROM
    #Checks main
JOIN
    #Checks other
ON
    main.CompanyID = other.CompanyID
AND
    main.Cleared >= other.Cleared
GROUP BY
    main.ID) RunningTotal
JOIN
    #Checks Info
ON
    RunningTotal.ID = Info.ID

DROP TABLE #Checks
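One caveat with the join condition above: if two checks for the same company share a Cleared value, each of them is summed into the other's total, so both rows show the same running total. A hedged variation that breaks ties on ID is sketched below. The #ChecksTie table and its rows are hypothetical, created only to demonstrate the tie.

```sql
-- Hypothetical sample: two checks for one company cleared on the same day.
CREATE TABLE #ChecksTie
(
     ID int IDENTITY(1,1) PRIMARY KEY
    ,CompanyID int NOT NULL
    ,Amount float NOT NULL
    ,Cleared datetime NOT NULL
);

INSERT INTO #ChecksTie (CompanyID, Amount, Cleared)
VALUES
     (1, 5, '20120401')
    ,(1, 5, '20120401')
    ,(1, 7, '20120405');

-- Triangular join with ID as a tie-breaker, so each row sums only the rows
-- that sort strictly before it, plus itself.
SELECT main.ID, main.CompanyID, main.Amount, main.Cleared,
       SUM(other.Amount) AS Total
FROM #ChecksTie AS main
JOIN #ChecksTie AS other
  ON  main.CompanyID = other.CompanyID
  AND (other.Cleared < main.Cleared
       OR (other.Cleared = main.Cleared AND other.ID <= main.ID))
GROUP BY main.ID, main.CompanyID, main.Amount, main.Cleared
ORDER BY main.CompanyID, main.Cleared, main.ID;

DROP TABLE #ChecksTie;
```

With the tie-breaker, the totals for these three rows are 5, 10 and 17. With only the `main.Cleared >= other.Cleared` condition, the first two rows would both show 10.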
answered 2012-04-28T23:45:31.250