I have 2 tables in Hive.
table1 contains
timelocation
2015-03-04 15:00 Chicago
2015-03-04 15:00 Denver
2015-03-04 15:00 Honolulu
table2 contains
ID Description
America/Chicago CENTRAL STANDARD TIME
America/Denver MOUNTAIN STANDARD TIME
Pacific/Honolulu HAWAII-ALEUTIAN STANDARD TIME
For a record present in table1 say "2015-03-04 15:00 Chicago" , I need to do a lookup in table2 for the corresponding Chicago record. It should read the ID & Description for chicago and return the current Central Standard Time for Chicago i.e "2015-05-04 09:11".
Similarly for Denver it must return Mountain Standard Time & for Honolulu it must return HAWAII-ALEUTIAN STANDARD TIME.
Expected output is
timelocation
2015-05-04 09:11
2015-05-04 08:11
2015-05-04 04:11
How can I do it?
I don't really want to write this query for you but hopefully this will point you in the right direction. You need to join
tbl1 <=> tbl2
. Extract the city from the respective columns in each table;split()
function will help you. Then there is a nifty function you can find here calledfrom_utc_timestamp()
that takes a time stamp (assumed to be UTC) and converts it to a given time zone. You'll also need to convert the columnDescription
to its time zone abbreviation. You can find those here. Good Luck!Try this to_utc_timestamp('2015-01-01 00:00:00','PST')
returns 2015-01-01 08:00:00
You need to join the table1 and table2 using the following query:
Here output of the regex function on let's say
America/Chicago
will be/Chicago
, therefore I am doing a substring on the output to extractChicago
and match to the Substring oftimelocation
column.For more info about the hive regex function
regexp_extract(string subject, string pattern, int index)
check the @Hive Doc