可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I have orc table in hive I want to drop column from this table

ALTER TABLE table_name drop  col_name;

but I am getting the following exception

Error occurred executing hive query: OK FAILED: ParseException line 1:35 mismatched input 'user_id1' expecting PARTITION near 'drop' in drop partition statement

Can any one help me or provide any idea to do this? Note, I am using hive 0.14

回答1:

You cannot drop column directly from a table using command ALTER TABLE table_name drop col_name;

The only way to drop column is using replace command. Lets say, I have a table emp with id, name and dept column. I want to drop id column of table emp. So provide all those columns which you want to be the part of table in replace columns clause. Below command will drop id column from emp table.

 ALTER TABLE emp REPLACE COLUMNS( name string, dept string);

回答2:

There is also a "dumb" way of achieving the end goal, is to create a new table without the column(s) not wanted. Using Hive's regex matching will make this rather easy.

Here is what I would do:

-- make a copy of the old table
ALTER TABLE table RENAME TO table_to_dump;

-- make the new table without the columns to be deleted
CREATE TABLE table AS
SELECT `(col_to_remove_1|col_to_remove_2)?+.+`
FROM table_to_dump;

-- dump the table 
DROP TABLE table_to_dump;

If the table in question is not too big, this should work just well.

回答3:

suppose you have an external table viz. organization.employee as: (not including TBLPROPERTIES)

hive> show create table organization.employee;
OK
CREATE EXTERNAL TABLE `organization.employee`(
      `employee_id` bigint,
      `employee_name` string,
      `updated_by` string,
      `updated_date` timestamp)
    ROW FORMAT SERDE
      'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    STORED AS INPUTFORMAT
      'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
    OUTPUTFORMAT
      'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
    LOCATION
      'hdfs://getnamenode/apps/hive/warehouse/organization.db/employee'

You want to remove updated_by, updated_date columns from the table. Follow these steps:

create a temp table replica of organization.employee as:

hive> create table organization.employee_temp as select * from organization.employee;

drop the main table organization.employee.

hive> drop table organization.employee;

remove the underlying data from HDFS (need to come out of hive shell)

[nameet@ip-80-108-1-111 myfile]$ hadoop fs -rm hdfs://getnamenode/apps/hive/warehouse/organization.db/employee/*

create the table with removed columns as required:

hive> CREATE EXTERNAL TABLE `organization.employee`(
  `employee_id` bigint,
  `employee_name` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://getnamenode/apps/hive/warehouse/organization.db/employee'

insert the original records back into original table.

hive> insert into organization.employee 
select employee_id, employee_name from organization.employee_temp;

finally drop the temp table created

hive> drop table organization.employee_temp;

回答4:

ALTER TABLE emp REPLACE COLUMNS( name string, dept string);

Above statement can only change the schema of a table, not data. A solution of this problem to copy data in a new table.

Insert <New Table> Select <selective columns> from <Old Table>

回答5:

ALTER TABLE is not yet supported for non-native tables; i.e. what you get with CREATE TABLE when a STORED BY clause is specified.

check this https://cwiki.apache.org/confluence/display/Hive/StorageHandlers

回答6:

Even below query is working for me.

Alter table tbl_name drop col_name

Alter hive table add or drop column

问题:

回答1:

回答2:

回答3:

回答4:

回答5:

回答6:

收藏的人(0)

Alter hive table add or drop column

问题:

回答1:

回答2:

回答3:

回答4:

回答5:

回答6:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮