0

我有一张带有hiredate (Date)、First Name (String) 和 Sur Name (string) 的表,如下所示:

hireDate    First Name      Surname
13-oct-14   Cintia Roxana   Padilla Julca
28-oct-14   Conor           McAteer
28-oct-14   Paolo           Mesia Macher
28-oct-14   William Anthony Whelan
15-nov-14   Peter Michael   Coates
13-feb-15   Natalie         Conche
15-mar-15   Beatriz         Vargas Huanca
01-may-15   Walter          Calle Chenccnes
04-may-15   Sarah Louise    Price

我查看了hire_dates(DATE) 的频率和另一列中的累积频率,如下所示:

Row hireDate    Count       Cumulative
1   13/10/2014  1           1
2   28/10/2014  3           4
3   15/11/2014  1           5
4   13/02/2015  1           6
5   15/03/2015  1           7
6   09/04/2015  1           8
7   15/04/2015  1           9
8   01/05/2015  1           10

查询如下:

WITH
Data AS (
 SELECT
 hireDate,
 COUNT(1) AS Count
 FROM
 `human-resources-221122.human_resources.employees_view`
 WHERE
 status <> "cancelled"
 GROUP BY
 1 )

SELECT
hireDate,
Count,
SUM(Count) OVER (ORDER BY hireDate ASC ROWS BETWEEN UNBOUNDED PRECEDING 
AND CURRENT ROW) AS Cumulative
FROM
Data
ORDER BY
hireDate ASC

但是我需要在那些没有计数的地方按月和年查看数字,这些数字是这样的:

Hire_Month  Hire_Year   Count   Cumulative
October     2014        4       4
November    2014        1       5
December    2014        0       5
January     2015        0       5
February    2015        1       6
March       2015        1       7
April       2015        2       9
May         2015        1       10

提前致谢。

4

1 回答 1

0

请注意使用GENERATE_DATE_ARRAYandRIGHT JOIN以获得所需的结果:

WITH data AS (
  SELECT * 
  FROM UNNEST ([
    STRUCT(DATE("2014-12-03") AS d, 4 AS a)
    , STRUCT("2015-01-05", 7)
    , STRUCT("2015-03-05", 1)
  ])
), all_months AS (
   SELECT month
   FROM UNNEST(GENERATE_DATE_ARRAY(
     (SELECT DATE_TRUNC(MIN(d), MONTH) FROM data)
     , (SELECT MAX(d) FROM data)
     , INTERVAL 1 MONTH)
   ) AS month
)


SELECT month, IFNULL(SUM(a),0) a, SUM(SUM(a)) OVER(ORDER BY month) a_cum
FROM data 
RIGHT JOIN all_months
ON DATE_TRUNC(d, MONTH)=month
GROUP BY month
ORDER BY month

在此处输入图像描述

现在,如果我们只是在计数,您可以使用 LEFT/RIGHT JOIN 将在空月中包含空值这一事实。这就是查询可以适应任意表的方式(此处为维基百科):

WITH data AS (
  SELECT *, DATE(datehour) d
  FROM `fh-bigquery.wikipedia_v3.pageviews_2018` 
  WHERE wiki='pt'
    AND (datehour BETWEEN '2018-09-30' AND '2018-09-30'
      OR datehour BETWEEN '2018-12-01' AND '2018-12-02'
    )
    AND title LIKE 'Calif%'
), all_months AS (
   SELECT month
   FROM UNNEST(GENERATE_DATE_ARRAY(
     (SELECT DATE_TRUNC(MIN(d), MONTH) FROM data)
     , (SELECT MAX(d) FROM data)
     , INTERVAL 1 MONTH)
   ) AS month
)


SELECT month, COUNT(d) c, SUM(COUNT(d)) OVER(ORDER BY month) a_cum
FROM data 
RIGHT JOIN all_months
ON DATE_TRUNC(d, MONTH)=month
GROUP BY month
ORDER BY month

在此处输入图像描述

于 2019-02-14T22:50:37.107 回答