I have hive table created as below:
create table alpha001(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true')
Now i want to drop one of the columns, say 'name'. I tried the following:
- ALTER TABLE alpha001 REPLACE COLUMNS (id int)
Exception thrown: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Replace columns is not supported for table default.alpha001. SerDe may be incompatible.
- ALTER TABLE alpha001 DROP name
Exception thrown : FAILED: ParseException line 1:26 mismatched input 'name' expecting PARTITION near 'DROP' in drop partition statement
Can someone help me with this ? please
Unfortunately, you can't! The only way you can delete column from existing table is by using REPLACE COLUMNS
keyword. But this can be done only for tables with a native SerDe (DynamicSerDe, MetadataTypedColumnsetSerDe, LazySimpleSerDe and ColumnarSerDe).
Your best bet is recreating the schema. Follows the steps.
Check if the table is external. If it isn't, use the following statement to make it external.
alter table alpha001 set tblproperties('EXTERNAL'='TRUE');
Drop the table. Since the table is an external table, you can drop it without dropping the actual table.
- Recreate the table with the new schema. You should be able to access the table with new schema.
Follows a quick sample.
create table alpha001(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
--assuming your table is not EXTERNAL already
alter table alpha001 set tblproperties('EXTERNAL'='TRUE');
insert into alpha001 values(1,"A");
select * from alpha001;
OK
1 A
drop table alpha001;
create table alpha001(id int) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
select * from alpha001;
OK
1
Time tak
Hope that helps!
you cant drop column for hive table.
On using "ALTER TABLE alpha001 DROP name" it gives error because in hive dropping column is not supported but we cant drop partition.So here ,it searches for Partition name near drop which 'name' column is not.
So in order to drop column ,you have to
1.drop table and re-create table with column eliminated(but may result in loss of data if its internal table)
2.you can use REPLACE command(refer this https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/Partition/Column)
You are using the replace command in a wrong way, you need to omit the columns that you want to drop while using replace command.
describe formatted bucketed_table;
OK
col_name data_type comment
col_name data_type comment
id int
firstname string
last_name string
hive>alter table bucketed_table drop column id;
MismatchedTokenException(58!=196)
at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:61 7)
at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(Hiv eParser_IdentifiersParser.java:10839)
at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:49854 )
at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(Hive Parser.java:13140)
at org.apache.hadoop.hive.ql.parse.HiveParser.alterTableStatementSuffix(HiveParser.ja va:9370)
at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatement(HiveParser.java:8803)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:4057)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1786)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1152)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:211)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1265)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1151)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java: 43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
FAILED: ParseException line 1:32 mismatched input 'column' expecting PARTITION near 'drop' in drop partition statement
hive> alter table bucketed_table replace columns(firstname string , lastname string);
OK
Time taken: 0.164 seconds
hive> select * from bucketed_table;
OK
bucketed_table.firstname bucketed_table.lastname
3 nikhil
3 nikhil
1 tutu
1 avi
4 tanvi
10 vivek
1 tutu
1 avi
4 tanvi
10 vivek
5 akshay
5 akshay
Time taken: 0.138 seconds, Fetched: 12 row(s)
hive> describe bucketed_table;
OK
col_name data_type comment
firstname string
lastname string
so from above even if you replace the column , if the bucketing is done on that column it will still persist but if that column doesn't have bucketing on it then it will be removed both from metadata and hdfs.