I am trying to put a Hive table on top of a Parquet table that was created from the following JSON content:
{"user_id":"4513","providers":[{"id":"4220","name":"dbmvl","behaviors":{"b1":"gxybq","b2":"ntfmx"}},{"id":"4173","name":"dvjke","behaviors":{"b1":"sizow","b2":"knuuc"}}]}

{"user_id":"3960","providers":[{"id":"1859","name":"ponsv","behaviors":{"b1":"ahfgc","b2":"txpea"}},{"id":"103","name":"uhqqo","behaviors":{"b1":"lktyo","b2":"ituxy"}}]}

{"user_id":"567","providers":[{"id":"9622","name":"crjju","behaviors":{"b1":"rhaqc","b2":"npnot"}},{"id":"6965","name":"fnheh","behaviors":{"b1":"eipse","b2":"nvxqk"}}]}

I am basically using Spark SQL to read the JSON and write out Parquet files.

I am having trouble putting Hive on top of the generated Parquet files. Here is the Hive HQL I have:
create table test (mycol STRUCT<user_id:String, providers:ARRAY<STRUCT<id:String, name:String, behaviors:MAP<String, String>>>>) stored as parquet;
alter table test set location 'hdfs:///tmp/test.parquet';

The statements above execute fine, but when I try to run select * on the table I get an error:
Failed with exception java.io.IOException:java.lang.IllegalStateException: Column mycol at index 0 does not exist in {providers=providers, user_id=user_id}

1 Answer

Try changing your query to:

create table test (user_id String, providers ARRAY<STRUCT<id:String, name:String, behaviors:MAP<String, String>>>) stored as parquet;

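When the Parquet file is written, the root JSON object is flattened: its fields (user_id and providers) become the top-level columns of the file, so there is no enclosing struct such as mycol for Hive to find.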
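
As a rough illustration of why the original table definition fails, the sketch below uses only the Python standard library (not Spark or Hive) to list the root keys of one input record and compare them with the columns each table declares. It assumes that the Parquet file's top-level columns are the root keys of the JSON records, which matches the {providers=providers, user_id=user_id} part of the error message. The helper names top_level_columns and missing_columns are hypothetical and exist only for this example.

```python
import json

# One record from the question's input (newline-delimited JSON).
record = (
    '{"user_id":"4513","providers":[{"id":"4220","name":"dbmvl",'
    '"behaviors":{"b1":"gxybq","b2":"ntfmx"}}]}'
)


def top_level_columns(json_line):
    """Return the sorted root keys of one JSON record.

    Assumption: these are the names that end up as top-level
    columns in the Parquet file written from this data.
    """
    return sorted(json.loads(json_line).keys())


def missing_columns(declared, available):
    """Return the declared table columns that the file does not contain."""
    return [name for name in declared if name not in available]


file_columns = top_level_columns(record)

# Table from the question: a single wrapper column named mycol.
print(missing_columns(["mycol"], file_columns))

# Table from the answer: one column per root key.
print(missing_columns(["user_id", "providers"], file_columns))
```

The first call reports mycol as missing, which corresponds to the IllegalStateException; the second reports nothing missing, which is why declaring user_id and providers directly lets the select succeed.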

Answered 2015-03-16T18:57:56.763