5

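Is there a way to do something like the opposite of the explode() function in Apache Hive? Suppose I have a table of the form id int, description string, url string, ...

From this table I want to create a table that looks like id int, json string, where the json column stores all the other columns as JSON: "description":"blah blah", "url":"http:", ...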

2 Answers

11

Hive provides string functions that can be used to combine multiple columns into one column:

SELECT id, CONCAT("(", CONCAT_WS(", ", description, url), ")") AS descriptionAndUrl
FROM originalTable

This is obviously going to get complicated fast when combining many columns into valid JSON. If this is a one-off and you know that all of the JSON strings will have the same properties, you might get away with just CONCAT for your purposes.
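As a rough sketch of the CONCAT-only approach for the two columns named in the question, something like the following could work. It assumes that description and url are never NULL (Hive's CONCAT returns NULL if any argument is NULL) and that neither value contains characters that need JSON escaping, such as double quotes or backslashes.

```sql
-- Sketch only: hand-builds a JSON object from two columns of originalTable.
-- Assumption: description and url are non-NULL and need no JSON escaping.
SELECT id,
       CONCAT('{"description":"', description, '","url":"', url, '"}') AS json
FROM originalTable;
```

If NULLs are possible, wrapping each column in COALESCE(column, '') keeps the whole expression from collapsing to NULL, at the cost of emitting an empty string instead of a JSON null.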

The "right" way to do it would be to write a User Defined Function which takes a list of columns and spits out a JSON string. This will be much more maintainable if you ever need to add columns or do the same thing to other tables.

It's likely someone has already written one you can use, so you should look around. Unfortunately, the [JSON related UDFs provided by Hive](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object) only read from JSON strings; they don't create them.
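Once such a UDF is available as a jar, using it from Hive might look roughly like this. The jar path, the class name com.example.hive.udf.ToJson, and the function name to_json are all hypothetical placeholders, and whether the function accepts a struct (as assumed here) depends entirely on the UDF you find or write.

```sql
-- Hypothetical jar path and class name; substitute those of the UDF you actually use.
ADD JAR /path/to/json-udf.jar;
CREATE TEMPORARY FUNCTION to_json AS 'com.example.hive.udf.ToJson';

-- named_struct is a built-in Hive function that pairs field names with values.
-- Assumption: the hypothetical to_json UDF accepts a struct argument.
SELECT id,
       to_json(named_struct('description', description, 'url', url)) AS json
FROM originalTable;
```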

Answered 2013-04-15T16:39:00.560
1

You can use CONCAT_WS in Hive to concatenate string values:

SELECT CONCAT_WS('-','string1','string2','string3') FROM table

Answered 2017-07-16T07:49:50.770