I have a table that I loaded data into, as follows:
create table xyzlogTable (dateC string , hours string, minutes string, seconds string, TimeTaken string, Method string, UriQuery string, ProtocolStatus string) row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' with serdeproperties( "input.regex" = "(\\S+)\\t(\\d+):(\\d+):(\\d+)\\t(\\S+)\\t(\\S+)\\t(\\S+)\\t(\\S+)", "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s") stored as textfile;
load data local inpath '/home/hadoop/hive/xyxlogData/' into table xyxlogTable;
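The total row count turned out to be over 3 million. Some queries work fine, while others seem to go into an infinite loop.
After seeing that select and group by queries take a very long time, and sometimes never return a result at all, I decided to partition the table.
But both of the following statements fail: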
create table xyzlogTable (datenonQuery string , hours string, minutes string, seconds string, TimeTaken string, Method string, UriQuery string, ProtocolStatus string) partitioned by (dateC string);
FAILED: Error in metadata: AlreadyExistsException(message:Table xyzlogTable already exists) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Alter table xyzlogTable (datenonQuery string , hours string, minutes string, seconds string, TimeTaken string, Method string, UriQuery string, ProtocolStatus string) partitioned by (dateC string);
FAILED: Parse Error: line 1:12 cannot recognize input 'xyzlogTable' in alter table statement
Any idea what the problem is?