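My values are still NULL...
The regex is meant to match this pattern: string, space, string, space, date, space, then a string running to the end of the line.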
([^ ]*)\s([^ ]*)\s(\[[0-9][0-9]\/[A-Za-z]{3}\/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} \+0000\])\s(.*$)
The kind of line it should match (randomly generated):
filesystem af68ccf949ebc07c250b37a10fa40912 [20/Aug/2013:19:00:11 +0000] fbec6e8ec3fa6687426f8437cdd8593f346081ca1978057a
It appears to be correct on http://rubular.com/.
Creating the table:
CREATE TABLE example1 (
user STRING,
bucket STRING,
date STRING,
rest STRING )
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^ ]*)\s([^ ]*)\s(\[[0-9][0-9]\/[A-Za-z]{3}\/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} \+0000\])\s(.*$)",
"output.format.string" = "%1$s %2$s %3$s %4$s"
)
STORED AS TEXTFILE
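
One possible cause, offered as a guess rather than a confirmed diagnosis: Hive processes backslash escapes inside quoted string literals before the SerDe receives the property, so `\s` or `\[` may reach RegexSerDe without its backslash, and any row that fails to match comes back as NULL. A minimal sketch with every backslash doubled follows. The table name `example1_escaped` is hypothetical; the columns, SerDe class, and pattern are copied from the statement above.

```sql
-- Assumption: Hive unescapes the string literal, so "\\s" reaches the SerDe as \s.
-- example1_escaped is a hypothetical name used only for this test.
CREATE TABLE example1_escaped (
  user STRING,
  bucket STRING,
  date STRING,
  rest STRING )
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*)\\s([^ ]*)\\s(\\[[0-9][0-9]\\/[A-Za-z]{3}\\/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} \\+0000\\])\\s(.*$)",
  "output.format.string" = "%1$s %2$s %3$s %4$s"
)
STORED AS TEXTFILE;
```

If selecting a few rows from this table still returns NULL in every column, the escaping was probably not the problem and the pattern itself should be rechecked against Java's regex syntax, which may differ from what Rubular accepts.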