How to alter Hive partition column name

2019-04-21 23:15发布

I have to change the partition column name (not partition spec), I looked for the commands in hive wiki and some google pages. I can find the options for altering the partition spec,

i.e. For example

In /table/country='US' I can change US to USA, but I want to change country to continent.

I feel like the only option available for changing partition column name is dropping and re-creating the table. Is there is any other option available please help me.

Thanks in advance.

4条回答
小情绪 Triste *
2楼-- · 2019-04-21 23:30

Adding solution here for later:

  • Use case: Change partition column from STRING to INT

    set hive.mapred.mode=norestrict; 
    alter table {table_name} partition column ({column_name} {column_type}); 
    
    e.g. ALTER TABLE employee PARTITION COLUMN dept INT;
    
查看更多
霸刀☆藐视天下
3楼-- · 2019-04-21 23:34

You have alter the partition column using simple swap method.

  • Create a new temp table which is same schema as current table.
  • Move all files in the old table to newly create table location.

    hadoop fs -mv <current_table_name> <temp_table_name>

  • Alter the schema of the original table (Rename or drop the partitions)
  • Recopy/load the temp table data to the original table with appropriate partition values.

    hadoop fs -mv <temp_table_name> <current_table_name>

  • msck repair the the original table & drop the temp_table.

NOTE : mv command move the file from one location to another with reducing the copy time. alternately we can use LOAD DATA INPATH for copy the data to the original table.

查看更多
你好瞎i
4楼-- · 2019-04-21 23:53

As you said, rename the value for of the partition is very straightforward:

hive> ALTER TABLE test.usage PARTITION (country ='US') RENAME TO PARTITION (date='USA');

I know that this is not what you are looking for. Unfortunately, given that your data is already partitioned by country, the only option you have is to drop the table, remove the data (supposing your table is external) from the HDFS and reinsert the data using continent as partition.

What I would do in your case is to have multiple partition levels, so that your folder structure will look like that:

/path/to/the/data/continent='america'/country='usa'
/path/to/the/data/continent='america'/country='mexico'
/path/to/the/data/continent='europe'/country='spain'
/path/to/the/data/continent='europe'/country='italy'
...

That way you can query the data for different levels of granularity (in this case continent and country).

查看更多
你好瞎i
5楼-- · 2019-04-21 23:55

You can change column name in metadata by following: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment

But as the document says, it only changes the metadata. Hive partitions are implemented as directories with the naming pattern columnName=spec. So you also need to change the names of those directories on HDFS by using "hadoop fs" command.

查看更多
登录 后发表回答