I found an example of exploding a map on the Databricks blog:
// input
{
"a": {
"b": 1,
"c": 2
}
}
Python: events.select(explode("a").alias("x", "y"))
Scala: events.select(explode('a) as Seq("x", "y"))
SQL: select explode(a) as (x, y) from events
// output
[{ "x": "b", "y": 1 }, { "x": "c", "y": 2 }]
However, I can't see a way to turn my map into an array in which the keys are flattened, and then explode that:
// input
{
"id": 0,
"a": {
"b": {"d": 1, "e": 2},
"c": {"d": 3, "e": 4}
}
}
// Schema
struct<id:bigint,a:map<string,struct<d:bigint,e:bigint>>>
root
|-- id: long (nullable = true)
|-- a: map (nullable = true)
| |-- key: string
| |-- value: struct (valueContainsNull = true)
| | |-- d: long (nullable = true)
| | |-- e: long (nullable = true)
// Imagined process
Python: …
Scala: events.select('id, explode('a) as Seq("x", "*")) //? "*" ?
SQL: …
// Desired output
[{ "id": 0, "x": "b", "d": 1, "e": 2 }, { "id": 0, "x": "c", "d": 3, "e": 4 }]
Is there some obvious way to take input like this and produce a table like the following:
id | x | d | e
---|---|---|---
0 | b | 1 | 2
0 | c | 3 | 4
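
To make the intent concrete, here is a minimal plain-Python sketch of the reshaping I'm after, run on the sample record above. It does not use Spark at all, and `explode_map_rows` is a hypothetical helper name made up for illustration. The commented line at the end is only my guess at a Spark counterpart and is untested here.

```python
# Plain-Python model of the desired reshaping (no Spark involved).
# `explode_map_rows` is a hypothetical helper, not part of any Spark API.

def explode_map_rows(record, map_field="a", key_name="x"):
    """Yield one flat dict per map entry, merging the struct's fields in."""
    for key, struct in record[map_field].items():
        # Carry over every top-level field except the map itself (here: id).
        row = {k: v for k, v in record.items() if k != map_field}
        row[key_name] = key   # the map key becomes its own column
        row.update(struct)    # flatten the struct's fields (d, e) into the row
        yield row


event = {"id": 0, "a": {"b": {"d": 1, "e": 2}, "c": {"d": 3, "e": 4}}}
rows = list(explode_map_rows(event))
print(rows)

# Assumed PySpark counterpart (not verified here): explode into a key column
# and a struct column, then expand the struct in a second select:
#   events.select("id", explode("a").alias("x", "v")).select("id", "x", "v.*")
```

Each map entry becomes one row, with the key in `x` and the struct's fields promoted to top-level columns, which matches the desired table.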