I have data loaded into a table as follows:
create table xyzlogTable (dateC string , hours string, minutes string, seconds string, TimeTaken string, Method string, UriQuery string, ProtocolStatus string) row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' with serdeproperties( "input.regex" = "(\\S+)\\t(\\d+):(\\d+):(\\d+)\\t(\\S+)\\t(\\S+)\\t(\\S+)\\t(\\S+)", "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s") stored as textfile;
load data local inpath '/home/hadoop/hive/xyxlogData/' into table xyxlogTable;
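The total number of rows turned out to be more than 3 million. Some queries work fine, while others appear to get stuck in an infinite loop.
After seeing that SELECT and GROUP BY queries take a long time and sometimes return no results at all, I decided to go for partitioning.
But the following two statements both fail: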
create table xyzlogTable (datenonQuery string , hours string, minutes string, seconds string, TimeTaken string, Method string, UriQuery string, ProtocolStatus string) partitioned by (dateC string);
FAILED: Error in metadata: AlreadyExistsException(message:Table xyzlogTable already exists)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Alter table xyzlogTable (datenonQuery string , hours string, minutes string, seconds string, TimeTaken string, Method string, UriQuery string, ProtocolStatus string) partitioned by (dateC string);
FAILED: Parse Error: line 1:12 cannot recognize input 'xyzlogTable' in alter table statement
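For context, what I am trying to end up with is roughly the following. This is only a sketch of my intent, not something I have working: the table name xyzlogTablePartitioned is a placeholder I made up, and I am assuming that Hive's dynamic partitioning settings are the right way to copy the existing rows across.

```sql
-- Sketch only. xyzlogTablePartitioned is a hypothetical name, chosen so it
-- does not collide with the existing xyzlogTable.
-- The partition column dateC is declared only in PARTITIONED BY,
-- not in the regular column list.
CREATE TABLE xyzlogTablePartitioned (
  hours string, minutes string, seconds string,
  TimeTaken string, Method string, UriQuery string, ProtocolStatus string)
PARTITIONED BY (dateC string);

-- Assumption: dynamic partitioning is acceptable here, so that one partition
-- is created per distinct dateC value found in the source table.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- The dynamic partition column (dateC) has to come last in the SELECT list.
INSERT OVERWRITE TABLE xyzlogTablePartitioned PARTITION (dateC)
SELECT hours, minutes, seconds, TimeTaken, Method, UriQuery, ProtocolStatus, dateC
FROM xyzlogTable;
```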
Any idea what the issue is?