I have a to insert data into a target table where all columns should be populated from different source tables except the surrogate key column; which should be maximum value of the target table plus auto increment value starting 1. I can generate auto increment value by using row_number() function, but in the same query how should I get the max value of surrogate key from target table. Is there any concept in HIVE where I can select the max value of surrogate key and save it in a temporary variable? Or is there any other simple way to achieve this result?
相关问题
-
hive: cast array
> into map - Find function in HIVE
- Hive Tez reducers are running super slow
- Set parquet snappy output file size is hive?
- Hive 'cannot alter table' error
相关文章
- 在hive sql里怎么把"2020-10-26T08:41:19.000Z"这个字符串转换成年月日
- SQL query Frequency Distribution matrix for produc
- Cloudera 5.6: Parquet does not support date. See H
- converting to timestamp with time zone failed on A
- Hive error: parseexception missing EOF
- ClassNotFoundException: org.apache.spark.SparkConf
- How to get previous day date in Hive
- Hive's hour() function returns 12 hour clock v
Here are two approaches which worked for me for the above problem. ( explained with example)
Approach 1: getting the max and setting to hive commands through ${hiveconf} variable using shell script
Approach 2: using row_sequence(), max() and join operations
My Environment:
Steps: (note: step 1 and step 2 are common for both approaches. Starting from step 3 , it differs for both)
Step 1: create source and target tables
source
target
Step 2: load data into source tables
Sample Input:
source_table1.txt
source_table2.txt
source_table3.txt
Approach 1:
Step 3: create a shell script hive_auto_increment.sh
Step 4: run the shell script
Approach 2:
Step 3: Add Jar
Step 4: register row_sequence function with help of hive contrib jar
Step 5: load the source_table1 to target_table
Step 6: load the other sources to target_table
output:
create table autoincrement1 ( id int, name string)
insert into autoincrement1
select if(isnull(max(id)) ,0 , max(id) ) +1, 'sagar' from autoincrement1