Say I have the following table:

month    region   revenue   
------  -------- ---------- 
 jan     north      100
 feb     north      150
 mar     north      250

How can I query the table above to get the following result?

month    region   revenue   
------  -------- ---------- 
 jan     north      100
 feb     north      150
 mar     north      250
 apr     north       0
 may     north       0
 jun     north       0

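The 0s can be NULLs, or vice versa. Essentially, I am trying to add null/empty rows to my query result (in this case the apr, may, and jun rows). Any help would be much appreciated.

Thanks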

2 Answers

The following is for BigQuery Legacy SQL, but note that the BigQuery team strongly recommends migrating to BigQuery Standard SQL.

The example below should give you the idea:

#legacySQL
SELECT 
  months.month_abr AS month_abr, 
  regions.region AS region, 
  COALESCE(revenues.revenue, 0) revenue
FROM months
CROSS JOIN (
  SELECT region FROM revenues GROUP BY region
) regions
LEFT JOIN revenues
ON months.month_abr = revenues.month_abr
AND regions.region = revenues.region
-- ORDER BY regions.region, months.month_number

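Here, revenues is your original table with the revenue data, and months is a table with the list of months (or you can use a subquery, as in the example below).

You can test and play with the query above using the dummy data from your question: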

#legacySQL
SELECT 
  months.month_abr AS month_abr, 
  regions.region AS region, 
  COALESCE(revenues.revenue, 0) revenue
FROM (
  SELECT month_number, month_abr FROM 
  (SELECT 1 month_number, 'jan' month_abr),
  (SELECT 2 month_number, 'feb' month_abr),
  (SELECT 3 month_number, 'mar' month_abr),
  (SELECT 4 month_number, 'apr' month_abr),
  (SELECT 5 month_number, 'may' month_abr),
  (SELECT 6 month_number, 'jun' month_abr)  
) AS months
CROSS JOIN (
  SELECT region FROM (
    SELECT region FROM 
    (SELECT 'jan' month_abr, 'north' region, 100 revenue),
    (SELECT 'feb' month_abr, 'north' region, 150 revenue),
    (SELECT 'mar' month_abr, 'north' region, 250 revenue)
  ) GROUP BY region
) regions
LEFT JOIN (
  SELECT month_abr, region, revenue FROM 
  (SELECT 'jan' month_abr, 'north' region, 100 revenue),
  (SELECT 'feb' month_abr, 'north' region, 150 revenue),
  (SELECT 'mar' month_abr, 'north' region, 250 revenue)
) AS revenues
ON months.month_abr = revenues.month_abr
AND regions.region = revenues.region
ORDER BY regions.region, months.month_number

The result is:

Row month_abr   region  revenue  
1   jan         north   100  
2   feb         north   150  
3   mar         north   250  
4   apr         north   0    
5   may         north   0    
6   jun         north   0    

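Finally, here is how it looks in BigQuery Standard SQL. Note that GENERATE_DATE_ARRAY here produces all twelve months of the year, so the output also includes jul through dec: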

#standardSQL
WITH regions AS (
  SELECT DISTINCT region FROM revenues
), months AS (
SELECT EXTRACT(MONTH FROM month) month_number,
  LOWER(FORMAT_DATE('%b', month)) month_abr
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2010-01-01', DATE '2010-12-01', INTERVAL 1 MONTH)) month
)
SELECT month_abr, region, COALESCE(revenues.revenue, 0) revenue
FROM months
CROSS JOIN regions
LEFT JOIN revenues
USING(month_abr, region)
ORDER BY region, month_number

You can test and play with it using the dummy data from your question:

#standardSQL
WITH revenues AS (
  SELECT 'jan' month_abr, 'north' region, 100 revenue UNION ALL
  SELECT 'feb', 'north', 150 UNION ALL
  SELECT 'mar', 'north', 250 
), regions AS (
  SELECT DISTINCT region FROM revenues
), months AS (
SELECT EXTRACT(MONTH FROM month) month_number,
  LOWER(FORMAT_DATE('%b', month)) month_abr
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2010-01-01', DATE '2010-12-01', INTERVAL 1 MONTH)) month
)
SELECT month_abr, region, COALESCE(revenues.revenue, 0) revenue
FROM months
CROSS JOIN regions
LEFT JOIN revenues
USING(month_abr, region)
ORDER BY region, month_number

You should be able to apply the above to your real use case.
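
If you want to check the pattern without access to BigQuery, here is a minimal sketch that runs the same idea (cross join months with regions, then left join the revenue rows and replace missing values with 0) against an in-memory SQLite database from Python. SQLite stands in for BigQuery here purely as an assumption for local testing, and the table aliases m, r, and v are illustrative names, not part of the queries above.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenues (month_abr TEXT, region TEXT, revenue INTEGER);
    INSERT INTO revenues VALUES
        ('jan', 'north', 100),
        ('feb', 'north', 150),
        ('mar', 'north', 250);

    -- Lookup table listing every month that must appear in the output.
    CREATE TABLE months (month_number INTEGER, month_abr TEXT);
    INSERT INTO months VALUES
        (1, 'jan'), (2, 'feb'), (3, 'mar'),
        (4, 'apr'), (5, 'may'), (6, 'jun');
""")

# Build every (month, region) pair first, then attach revenue where it exists.
DENSIFY_SQL = """
    SELECT m.month_abr, r.region, COALESCE(v.revenue, 0) AS revenue
    FROM months AS m
    CROSS JOIN (SELECT DISTINCT region FROM revenues) AS r
    LEFT JOIN revenues AS v
      ON v.month_abr = m.month_abr AND v.region = r.region
    ORDER BY r.region, m.month_number
"""

rows = conn.execute(DENSIFY_SQL).fetchall()
for row in rows:
    print(row)
```

The DISTINCT on region matters: without it, each month would be repeated once per revenue row of that region.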

answered 2018-03-27T12:46:03.527

One option: run a LEFT/RIGHT JOIN against the list of values you want to go through.

Let's start with a query that is missing the nulls/zeros:

#standardSQL
SELECT year, SUM(number) c
FROM `bigquery-public-data.usa_names.usa_1910_current`
WHERE name='Felipe'
AND year>2014
GROUP BY year 
ORDER BY year


If we want to get 0 for the years before 2015:

SELECT b.year, IFNULL(c, 0) c
FROM (
  SELECT year, SUM(number) c
  FROM `bigquery-public-data.usa_names.usa_1910_current`
  WHERE name='Felipe'
  AND year>2014
  GROUP BY year 
) a
RIGHT JOIN (
  SELECT year FROM UNNEST(GENERATE_ARRAY(2012, 2016)) year
) b
ON a.year=b.year
ORDER BY year


A correlated subquery can also do the job:

SELECT year, (
  SELECT IFNULL(SUM(number), 0) 
  FROM `bigquery-public-data.usa_names.usa_1910_current` a
  WHERE name='Felipe'
  AND year>2014
  AND a.year=b.year
) c
FROM (SELECT year FROM UNNEST(GENERATE_ARRAY(2012, 2016)) year) b
ORDER BY year

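The same "join against a generated list" idea can be tried locally. The sketch below is only an illustration under stated assumptions: it uses SQLite from Python instead of BigQuery, a recursive CTE in place of GENERATE_ARRAY, a hypothetical names table, and made-up counts that are not taken from the usa_names public dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Made-up rows for illustration only; not real usa_names data.
    CREATE TABLE names (year INTEGER, name TEXT, number INTEGER);
    INSERT INTO names VALUES
        (2015, 'Felipe', 10),
        (2015, 'Felipe', 5),
        (2016, 'Felipe', 7);
""")

# The recursive CTE yields 2012..2016, playing the role of GENERATE_ARRAY(2012, 2016).
# Filters on the data table live in the ON clause so that years with no
# matching rows are kept by the LEFT JOIN instead of being filtered out.
FILL_SQL = """
    WITH RECURSIVE years(year) AS (
        SELECT 2012
        UNION ALL
        SELECT year + 1 FROM years WHERE year < 2016
    )
    SELECT y.year, IFNULL(SUM(n.number), 0) AS c
    FROM years AS y
    LEFT JOIN names AS n
      ON n.year = y.year AND n.name = 'Felipe' AND n.year > 2014
    GROUP BY y.year
    ORDER BY y.year
"""

rows = conn.execute(FILL_SQL).fetchall()
for row in rows:
    print(row)
```

Starting from the generated list and using LEFT JOIN is equivalent to the RIGHT JOIN shown earlier with the two sides swapped.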

answered 2018-03-27T12:42:16.660