
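I have a BigQuery table with an array column that contains multiple (1 to 4) key:value pairs, separated by a pipe "|". I want to extract the key:value pairs and add additional columns, with the "key" as the column header and the "value" as, well, the value/entry.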

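However, although the "keys" are uniform, they are not all placed in the same order, so splitting by position doesn't quite work. I've looked around and explored "JSON_EXTRACT_SCALAR" and "UNNEST" (from this question/answer: Get key value pairs from array object in sql BigQuery), but couldn't get the intended results. I've also explored using "OFFSET", but couldn't figure out how to put it all together.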

Here is what the data fields (the arrays) look like:

Row campaignLabels  
1   Segment: Rivers Non-Brand | Strategy: All Else | Category: Non-Brand | CN:Pause_5-29-19
2   Segment: Rivers Non-Brand | Category: Non-Brand | Strategy: All Else | CN:Pause_5-29-19
3   Category: Upper Funnel | Strategy: All Else
4   Strategy: All Else | Segment: Rivers Brand | Category: Brand
5   Strategy: All Else | Category: Brand | Segment: Rivers Brand
6   Segment: Rivers Non-Brand | Category: Non-Brand | Strategy: All Else
7   Strategy: All Else | Segment: Viking Other Brand | Category: Brand
8   Strategy: All Else | Category: Brand | Segment: Rivers Brand
9   Strategy: All Else | Category: Brand | Segment: Rivers Brand
10  Strategy: All Else | Category: Brand | Segment: Viking Other Brand

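The ideal output would be to query the same table, pull certain columns, and add columns with "Strategy", "Category", and "Segment" as the column labels, with the corresponding values as the returned values.

Help!

A few attempts gave me partial, but not the desired, results: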

SELECT 
  DISTINCT(SUBSTR(Part1, 10)) AS Strategy
FROM (
  SELECT
    Labels[OFFSET(0)] AS Part1,
    Labels[OFFSET(1)] AS Part2,
    Labels[SAFE_OFFSET(2)] AS Part3,
    Labels[SAFE_OFFSET(3)] AS Part4
  FROM (
    SELECT
      SPLIT(campaignLabels,"| ") AS Labels
    FROM
      `table_A` )
  )
WHERE Part1 LIKE "Strategy:%"

1 Answer


Below is for BigQuery Standard SQL

#standardSQL
select campaignLabels, 
  ( select as struct
      max(if(key = 'Segment', value, null)) as Segment,
      max(if(key = 'Strategy', value, null)) as Strategy,
      max(if(key = 'Category', value, null)) as Category
    from (
      select as struct kv[offset(0)] as key, trim(kv[offset(1)]) as value
      from t.labels label, 
      unnest([struct(split(label, ':') as kv)])
    )
  ).*
from `project.dataset.table`, 
unnest([struct(split(campaignLabels, ' | ') as labels)]) t    
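Because the keys can appear in any order, another way to look each one up by name rather than by position is a regular expression. The following is only a sketch of that alternative, not the approach used in the query above. It assumes each key occurs at most once per row and that no key name is the tail of another (a hypothetical "Subcategory:" would also match the "Category:" pattern). The `sample` CTE is an illustrative stand-in built from three rows of the question's data; replace it with the real table.

```sql
#standardSQL
-- `sample` is a hypothetical stand-in for the real table,
-- populated with three rows copied from the question.
with sample as (
  select 'Segment: Rivers Non-Brand | Strategy: All Else | Category: Non-Brand | CN:Pause_5-29-19' as campaignLabels
  union all
  select 'Category: Upper Funnel | Strategy: All Else'
  union all
  select 'Strategy: All Else | Segment: Rivers Brand | Category: Brand'
)
select
  campaignLabels,
  -- Capture everything after "Key:" up to the next pipe (or the end of the string),
  -- then trim the surrounding spaces. A missing key yields NULL.
  trim(regexp_extract(campaignLabels, r'Segment:([^|]*)')) as Segment,
  trim(regexp_extract(campaignLabels, r'Strategy:([^|]*)')) as Strategy,
  trim(regexp_extract(campaignLabels, r'Category:([^|]*)')) as Category
from sample
```

This version avoids the nested subquery, but it repeats the pattern once per key, so the split-and-pivot query above may scale better if the list of keys grows.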

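If applied to the sample data from your question, the output has one row per campaignLabels value, with Segment, Strategy, and Category as separate columns (the result screenshot is not reproduced here).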

Answered 2020-11-14T20:02:11.130