I have a SQL table, Employee, that describes how much each user likes particular metals. The table looks like this:

    Employee_number  Rank_1  Rank_2    Rank_3    Rank_4    Rank_5
                  1  Gold    null      null      null      null
                  2  bronze  Gold      null      null      null
                  3  Gold    platinum  null      null      null
                  4  Gold    copper    null      null      null
                  5  Gold    bronze    platinum  null      null
                  6  Gold    bronze    platinum  null      null
                  7  Gold    platinum  Silver    null      null
                  8  Gold    platinum  Silver    null      null
                  9  Gold    platinum  business  null      null
                 10  null    null      null      null      null
                 11  Silver  bronze    business  platinum  Gold

The Employee_number field is unique.

There is also a table that describes the general ranking of the metals. It looks like this:

Metal     Rank
Gold        1
platinum    2
silver      3
copper      4
bronze      5

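What I want to do is, whenever an employee has null values, fill in default metals according to the general ranking.

For example, for employee 10 all values are null, so it is easy: his Rank_1 metal is Gold, Rank_2 is platinum, Rank_3 is silver, Rank_4 is copper, and Rank_5 is bronze.

Employee 1 already has a Rank_1 metal but no other ranks, so Rank_2 should be filled with platinum, Rank_3 with silver, Rank_4 with copper, and Rank_5 with bronze.

For employee 2, the first metal is bronze and the second is Gold, so Rank_3 becomes platinum, Rank_4 silver, and Rank_5 copper.

Similarly, employee 6 has three ranks filled in and needs ranks 4 and 5: Rank_4 becomes silver and Rank_5 copper.

Does anyone have a suggestion on how to do this in SQL? I am using BigQuery.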

1 Answer

Below is for BigQuery Standard SQL. Hopefully you can adapt it to your real use case.

#standardSQL
WITH metals AS (
  -- General ranking of the metals, inlined here for the example
  SELECT 'Gold' Metal, 1 RANK UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5
)
-- Pivot the first five array elements back into Rank_1 .. Rank_5
SELECT Employee_number,
  MAX(IF(pos=0, Metal, NULL)) Rank_1,
  MAX(IF(pos=1, Metal, NULL)) Rank_2,
  MAX(IF(pos=2, Metal, NULL)) Rank_3,
  MAX(IF(pos=3, Metal, NULL)) Rank_4,
  MAX(IF(pos=4, Metal, NULL)) Rank_5
FROM (
  SELECT Employee_number,
    ARRAY_CONCAT(
      -- Array 1: the employee's own non-NULL metals, in their original order
      ARRAY(SELECT Metal FROM (
          SELECT 1 a, Rank_1 Metal UNION ALL SELECT 2, Rank_2 UNION ALL
          SELECT 3, Rank_3 UNION ALL SELECT 4, Rank_4 UNION ALL
          SELECT 5, Rank_5 )
        WHERE NOT Metal IS NULL
        ORDER BY a
      -- Array 2: metals the employee has not used yet (case-insensitive), by general rank
      ), ARRAY(SELECT Metal FROM metals m
        WHERE NOT LOWER(Metal) IN (
          SELECT x FROM UNNEST(ARRAY(
            SELECT LOWER(b) FROM (
              SELECT Rank_1 b UNION ALL SELECT Rank_2 UNION ALL
              SELECT Rank_3 UNION ALL SELECT Rank_4 UNION ALL
              SELECT Rank_5 )
            WHERE NOT b IS NULL
          )) x
        ) ORDER BY RANK
      )) arr
  FROM `project.dataset.employee`
), UNNEST(arr) Metal WITH OFFSET pos
GROUP BY Employee_number
ORDER BY Employee_number

You can test it with the dummy data from your question, as below:

#standardSQL
WITH `project.dataset.employee` AS (
  SELECT 1 Employee_number, 'Gold' Rank_1, NULL Rank_2, NULL Rank_3, NULL Rank_4, NULL Rank_5 UNION ALL
  SELECT 2, 'bronze', 'Gold', NULL, NULL, NULL UNION ALL
  SELECT 3, 'Gold', 'platinum', NULL, NULL, NULL UNION ALL
  SELECT 4, 'Gold', 'copper', NULL, NULL, NULL UNION ALL
  SELECT 5, 'Gold', 'bronze', 'platinum', NULL, NULL UNION ALL
  SELECT 6, 'Gold', 'bronze', 'platinum', NULL, NULL UNION ALL
  SELECT 7, 'Gold', 'platinum', 'Silver', NULL, NULL UNION ALL
  SELECT 8, 'Gold', 'platinum', 'Silver', NULL, NULL UNION ALL
  SELECT 9, 'Gold', 'platinum', 'business', NULL, NULL UNION ALL
  SELECT 10, NULL, NULL, NULL, NULL, NULL UNION ALL
  SELECT 11, 'Silver', 'bronze', 'business', 'platinum',  'Gold' 
), metals AS (
  SELECT 'Gold' Metal, 1 RANK UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5 
)
SELECT Employee_number, 
  MAX(IF(pos=0, Metal, NULL)) Rank_1,
  MAX(IF(pos=1, Metal, NULL)) Rank_2,
  MAX(IF(pos=2, Metal, NULL)) Rank_3,
  MAX(IF(pos=3, Metal, NULL)) Rank_4,
  MAX(IF(pos=4, Metal, NULL)) Rank_5
FROM (
  SELECT Employee_number,
    ARRAY_CONCAT(
      ARRAY(SELECT Metal FROM (
          SELECT 1 a, Rank_1 Metal UNION ALL SELECT 2, Rank_2 UNION ALL 
          SELECT 3, Rank_3 UNION ALL SELECT 4, Rank_4 UNION ALL 
          SELECT 5, Rank_5 )
        WHERE NOT Metal IS NULL
        ORDER BY a
      ), ARRAY(SELECT Metal FROM metals m
        WHERE NOT LOWER(Metal) IN (
          SELECT x FROM UNNEST(ARRAY(
            SELECT LOWER(b) FROM (
              SELECT Rank_1 b UNION ALL SELECT Rank_2 UNION ALL
              SELECT Rank_3 UNION ALL SELECT Rank_4 UNION ALL
              SELECT Rank_5 )
            WHERE NOT b IS NULL
          )) x
        ) ORDER BY RANK
      )) arr
  FROM `project.dataset.employee`
), UNNEST(arr) Metal WITH OFFSET pos  
GROUP BY Employee_number
ORDER BY Employee_number      

Result

Row Employee_number Rank_1  Rank_2      Rank_3      Rank_4      Rank_5   
1   1               Gold    platinum    silver      copper      bronze   
2   2               bronze  Gold        platinum    silver      copper   
3   3               Gold    platinum    silver      copper      bronze   
4   4               Gold    copper      platinum    silver      bronze   
5   5               Gold    bronze      platinum    silver      copper   
6   6               Gold    bronze      platinum    silver      copper   
7   7               Gold    platinum    Silver      copper      bronze   
8   8               Gold    platinum    Silver      copper      bronze   
9   9               Gold    platinum    business    silver      copper   
10  10              Gold    platinum    silver      copper      bronze   
11  11              Silver  bronze      business    platinum    Gold     

Note: the above solution assumes that filled and NULL metals are not interleaved, which leaves three options:

1. all Rank fields filled already with Metal  
2. all Rank fields are NULL
3. first 1 or more fields filled with Metal and rest are NULLs 
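
To see whether real data actually satisfies this assumption, a check along the following lines may help. It is only a sketch: it reuses the `project.dataset.employee` name from the query above and lists employees where a NULL rank is followed by a filled one.

```sql
#standardSQL
-- Employees whose ranks have a gap: some Rank_N is NULL while a later rank is filled.
-- An empty result means every row matches one of the three options above.
SELECT Employee_number
FROM `project.dataset.employee`
WHERE (Rank_1 IS NULL AND COALESCE(Rank_2, Rank_3, Rank_4, Rank_5) IS NOT NULL)
   OR (Rank_2 IS NULL AND COALESCE(Rank_3, Rank_4, Rank_5) IS NOT NULL)
   OR (Rank_3 IS NULL AND COALESCE(Rank_4, Rank_5) IS NOT NULL)
   OR (Rank_4 IS NULL AND Rank_5 IS NOT NULL)
ORDER BY Employee_number
```

For rows with such a gap, the main query still runs, but the filled metals are shifted left to close the gap before the defaults are appended.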

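That said, the first array is built from the filled fields; the second array is built from the remaining metals in the metals table; the two arrays are then concatenated, and the first 5 elements are used to recreate the original table.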
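
As an illustration of those two arrays, here is a small sketch for employee 4 from the sample data. The array literal `filled` stands in for the first array, and the column name `Rnk` and the aliases `remaining` and `arr` are chosen only for this example.

```sql
#standardSQL
WITH metals AS (
  SELECT 'Gold' Metal, 1 Rnk UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5
), employee AS (
  -- Employee 4 has Rank_1 = 'Gold' and Rank_2 = 'copper'; the rest are NULL
  SELECT 4 Employee_number, ['Gold', 'copper'] filled
)
SELECT Employee_number, filled, remaining,
  ARRAY_CONCAT(filled, remaining) arr
FROM (
  SELECT Employee_number, filled,
    -- Metals not already chosen by the employee, in general rank order
    ARRAY(
      SELECT Metal FROM metals
      WHERE LOWER(Metal) NOT IN (SELECT LOWER(f) FROM UNNEST(filled) f)
      ORDER BY Rnk
    ) remaining
  FROM employee
)
```

Here `filled` is [Gold, copper] and `remaining` is [platinum, silver, bronze], so the concatenated array matches row 4 of the result above.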

Hope this is not too confusing.

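P.S. The above solution can be extended relatively easily to the case where NULLs and filled metals are mixed, but that looks to be outside the scope of the question :o)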
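
One possible shape of such an extension is sketched below. It is not part of the original answer and rests on an assumption about the desired behaviour: filled ranks stay where they are, and each NULL slot takes the next unused metal in general rank order. Employee 12 is a hypothetical row, and the names `Rnk`, `slots`, `spare`, `gap_no` and `arr` are made up for the example.

```sql
#standardSQL
WITH employee AS (
  -- Hypothetical row with a gap between filled ranks
  SELECT 12 Employee_number, 'Gold' Rank_1, CAST(NULL AS STRING) Rank_2,
    'bronze' Rank_3, CAST(NULL AS STRING) Rank_4, CAST(NULL AS STRING) Rank_5
), metals AS (
  SELECT 'Gold' Metal, 1 Rnk UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5
), slots AS (
  -- One row per rank slot; for a NULL slot, gap_no is its 0-based index
  -- among that employee's NULL slots
  SELECT Employee_number, pos, Metal,
    COUNTIF(Metal IS NULL) OVER (PARTITION BY Employee_number ORDER BY pos) - 1 AS gap_no
  FROM employee, UNNEST([Rank_1, Rank_2, Rank_3, Rank_4, Rank_5]) Metal WITH OFFSET pos
), spare AS (
  -- Metals the employee has not used, in general rank order
  SELECT Employee_number,
    ARRAY(
      SELECT m.Metal FROM metals m
      WHERE LOWER(m.Metal) NOT IN (
        SELECT LOWER(x) FROM UNNEST([Rank_1, Rank_2, Rank_3, Rank_4, Rank_5]) x
        WHERE x IS NOT NULL)
      ORDER BY m.Rnk
    ) AS arr
  FROM employee
)
-- Keep filled slots as they are; fill the n-th NULL slot with the n-th spare metal
SELECT Employee_number,
  MAX(IF(pos=0, IFNULL(Metal, arr[SAFE_OFFSET(gap_no)]), NULL)) Rank_1,
  MAX(IF(pos=1, IFNULL(Metal, arr[SAFE_OFFSET(gap_no)]), NULL)) Rank_2,
  MAX(IF(pos=2, IFNULL(Metal, arr[SAFE_OFFSET(gap_no)]), NULL)) Rank_3,
  MAX(IF(pos=3, IFNULL(Metal, arr[SAFE_OFFSET(gap_no)]), NULL)) Rank_4,
  MAX(IF(pos=4, IFNULL(Metal, arr[SAFE_OFFSET(gap_no)]), NULL)) Rank_5
FROM slots JOIN spare USING (Employee_number)
GROUP BY Employee_number
ORDER BY Employee_number
```

Under that assumption, the hypothetical row Gold, NULL, bronze, NULL, NULL would be filled as Gold, platinum, bronze, silver, copper. If the intended behaviour for mixed rows is different (for example, compacting the filled metals first), the original query already does that.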

Answered 2018-03-27T19:09:32.730