How to partition a Hive Table using range of value

2020-03-04 07:06发布

问题:

I have a Hive Table with 2 columns.Employee ID and Salary.

Data is something like given below.

Employee ID Salary
1   10000.08
2   20078.67
3   20056.45
4   30000.76
5   10045.14
6   43567.76

I want to create Partitions based on Salary Column.For Example Partition for salary range 10000 to 20000, 20001 to 30000.

How do i achieve this.

回答1:

Hive does not support range partitioning, but you can calculate ranges during data load.

  1. Create table partitioned by salary_range:

    create table your_table
    (
     employee_id bigint,
     salary double
    )
    partitioned by (salary_range bigint)
    
  2. insert using case for salary range calculation:

    insert overwrite table your_table partition (salary_range)   
    select employee_id, salary,  
           case 
               when salary between 10000 and 20000 then 20000
               when salary between 20001 and 30000 then 30000 
               ...
               else ...
           end as salary_range 
    from some_table;
    


标签: hive hiveql