google-bigquery - Array fields from DataPrep are recognized as strings in BigQuery

Question

I am trying to use Dataprep to wrangle my data for reporting.

However, fields that should be arrays are recognized as strings in BigQuery.

Sample data:

{"name":"herman","age":34,"property":[{"address":"henry street","state":"vic"},{"address":"mount waverley","state":"vic"}]}
{"name":"Handry","age":61,"property":[{"address":"Balwyn","state":"vic"},{"address":"Clayton","state":"vic"}]}

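Basically, I just want to convert certain fields to uppercase. (This is only a simple example; my real transformations are much more complex.) Here is the wrangle file: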

flatten col: property
unnest col: property keys: 'state','address' markLineage: true
drop col: property
derive value: upper(property_address) as: 'upper_property_address'
drop col: property_address
derive value: upper(property_state) as: 'upper_property_state'
drop col: property_state
nest col: upper_property_state,upper_property_address as: 'column1'
derive value: list(column1, 1000) group: name as: 'column2'
drop col: column1,upper_property_address,upper_property_state
deduplicate
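
For readers unfamiliar with Wrangle, the intent of the recipe can be sketched in plain Python. This is only an illustration of the desired result, not what Dataprep executes internally; the function name `uppercase_properties` is hypothetical, and the output keys mirror the column names used in the recipe.

```python
import json


def uppercase_properties(line):
    """Parse one input record and return it with upper-cased property fields.

    Hypothetical helper: it mimics the intended effect of the recipe
    (flatten, upper-case, re-nest, group by name) for a single record.
    """
    record = json.loads(line)
    column2 = [
        {
            "upper_property_state": prop["state"].upper(),
            "upper_property_address": prop["address"].upper(),
        }
        for prop in record["property"]
    ]
    # column2 stays a real list of objects here, which is the shape
    # the question expects to see in BigQuery.
    return {"name": record["name"], "age": record["age"], "column2": column2}


sample = (
    '{"name":"Handry","age":61,"property":'
    '[{"address":"Balwyn","state":"vic"},{"address":"Clayton","state":"vic"}]}'
)
result = uppercase_properties(sample)
print(json.dumps(result))
```

The point of the sketch is that `column2` should be an array of objects, not a string containing serialized JSON.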

This ends up in BigQuery (in the table) as:

{"age":"61","name":"Handry","column2":"[{\"upper_property_state\":\"VIC\",\"upper_property_address\":\"BALWYN\"},{\"upper_property_state\":\"VIC\",\"upper_property_address\":\"CLAYTON\"}]"}
{"age":"34","name":"herman","column2":"[{\"upper_property_state\":\"VIC\",\"upper_property_address\":\"HENRY STREET\"},{\"upper_property_state\":\"VIC\",\"upper_property_address\":\"MOUNT WAVERLEY\"}]"}
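
The following Python sketch makes the problem visible using one of the rows above: `column2` arrives as a JSON-encoded string rather than an array, and it takes a second `json.loads` call to recover the array. The helper name `restore_array` is hypothetical, and this is a possible post-processing step outside Dataprep under that assumption, not a Dataprep or BigQuery feature.

```python
import json

# One exported row, copied from the output above (raw string keeps the
# backslash escapes exactly as they appear in the table export).
raw_row = r'{"age":"61","name":"Handry","column2":"[{\"upper_property_state\":\"VIC\",\"upper_property_address\":\"BALWYN\"},{\"upper_property_state\":\"VIC\",\"upper_property_address\":\"CLAYTON\"}]"}'


def restore_array(row_text, field="column2"):
    """Decode a row and replace the JSON-encoded string field with a real list.

    Hypothetical helper for illustration only.
    """
    row = json.loads(row_text)
    if isinstance(row[field], str):
        # The field holds serialized JSON, so decode it a second time.
        row[field] = json.loads(row[field])
    return row


row = json.loads(raw_row)
print(type(row["column2"]).__name__)       # the field is a string, not a list

fixed = restore_array(raw_row)
print(type(fixed["column2"]).__name__)     # now a list of objects
print(fixed["column2"][0]["upper_property_address"])
```

Rows decoded this way could be written back out as newline-delimited JSON and loaded into a table whose schema declares `column2` as a repeated record, but that adds a step outside Dataprep.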

I have also raised this in the Google issue tracker: https://issuetracker.google.com/issues/69773118

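Just wondering, has anyone run into this problem and found a workaround? I know we can query JSON in BigQuery, as described in: How to query json stored as string in bigquery table?

But that makes the queries complicated, and I would like to avoid it.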

