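I currently have the following query, which works perfectly, but I'd like to know whether it can be optimized (perhaps by avoiding the UNNEST followed by a GROUP BY and doing the transformation in a single step).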

with src as (
    select 1 as row_key, "key_A:value_A,key_B:value_B,key_C:value_C" as field_raw
), tmp as (
    select
        row_key
        , STRUCT(
            split(field_items, ':')[offset(0)] as key
            , split(field_items, ':')[offset(1)] as value
        ) AS field_items
    from src
    , unnest(split(field_raw, ',')) field_items
)
select
    row_key
    , ARRAY_AGG(field_items) as field_items
from tmp
group by row_key

Input:

row_key  field_raw
1        key_A:value_A,key_B:value_B,key_C:value_C

Expected output:

row_key  field_items.key  field_items.value
1        key_A            value_A
         key_B            value_B
         key_C            value_C

Thanks for the help :)

1 Answer

Consider the refactored approach below:

select row_key, 
  array(select as struct
      split(kv, ':')[offset(0)] as key, 
      split(kv, ':')[offset(1)] as value
    from t.arr as kv 
  ) as field_items
from src, 
unnest([struct(regexp_extract_all(field_raw, r'\w+:\w+') as arr)]) t    

If applied to the sample data in your question, the output is:

[image: screenshot of the query result]
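For anyone who wants to run this standalone, here is a minimal self-contained sketch of the same single-step idea. It makes a few assumptions that are not in the query above: it inlines the `src` CTE from the question, splits on ',' directly inside the `array(...)` subquery instead of using `regexp_extract_all`, uses `safe_offset` so an item without a ':' yields a NULL value rather than an error, and adds `with offset` purely to keep the items in their original order.

```sql
-- Sample data copied from the question.
with src as (
    select 1 as row_key, "key_A:value_A,key_B:value_B,key_C:value_C" as field_raw
)
select
    row_key
    -- Build the array of structs per row in one step, with no GROUP BY.
    , array(
        select as struct
            -- safe_offset returns NULL instead of raising an error
            -- when an item has no ':' separator (assumed to be desirable).
            split(kv, ':')[safe_offset(0)] as key
            , split(kv, ':')[safe_offset(1)] as value
        from unnest(split(field_raw, ',')) as kv with offset as pos
        -- Keep the items in the order they appear in field_raw.
        order by pos
    ) as field_items
from src
```

Note that the `r'\w+:\w+'` pattern used above only matches keys and values made of word characters, whereas splitting on ',' keeps any characters but assumes values never contain a comma. Which trade-off fits better depends on the real data.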

Answered 2021-12-08T17:50:14.087