I have 2 tables in Hive.
table1 contains
timelocation
2015-03-04 15:00 Chicago
2015-03-04 15:00 Denver
2015-03-04 15:00 Honolulu
table2 contains
ID Description
America/Chicago CENTRAL STANDARD TIME
America/Denver MOUNTAIN STANDARD TIME
Pacific/Honolulu HAWAII-ALEUTIAN STANDARD TIME
For a record present in table1 say "2015-03-04 15:00 Chicago" , I need to do a lookup in table2 for the corresponding Chicago record. It should read the ID & Description for chicago and return the current Central Standard Time for Chicago i.e "2015-05-04 09:11".
Similarly for Denver it must return Mountain Standard Time & for Honolulu it must return HAWAII-ALEUTIAN STANDARD TIME.
Expected output is
timelocation
2015-05-04 09:11
2015-05-04 08:11
2015-05-04 04:11
How can I do it?
I don't really want to write this query for you but hopefully this will point you in the right direction. You need to join tbl1 <=> tbl2
. Extract the city from the respective columns in each table; split()
function will help you. Then there is a nifty function you can find here called from_utc_timestamp()
that takes a time stamp (assumed to be UTC) and converts it to a given time zone. You'll also need to convert the column Description
to its time zone abbreviation. You can find those here. Good Luck!
You need to join the table1 and table2 using the following query:
SELECT T1.timelocation,T2.ID, T2.Description, to_utc_timestamp(SUBSTRING(T1.timelocation,1,16),T2.ID) AS newtime
FROM table 1 T1 INNER JOIN table2 T2 ON SUBSTRING(T1.timelocation,17)= SUBSTRING(regexp_extract(T2.ID,'/\w*',2),2)
Here output of the regex function on let's say America/Chicago
will be /Chicago
, therefore I am doing a substring on the output to extract Chicago
and match to the Substring of timelocation
column.
For more info about the hive regex function regexp_extract(string subject, string pattern, int index)
check the @Hive Doc
Try this
to_utc_timestamp('2015-01-01 00:00:00','PST')
returns 2015-01-01 08:00:00