I'm putting together a query that will get refreshed daily that needs to pull records from the last ten dates.
The tables I'm accessing have a 'xxdatetime' column with the unix time stamp and an 'eventdate' column with the date in a yyyy-mm-dd.
In Impala, the answer was easy:
where eventdate > to_date(days_sub(now(), 10))
I used a variation of it in Hive that failed because I guess it was scanning the whole table and the tables are MASSIVE:
where datediff(cast(current_timestamp() as string), eventdate)=10
Is there a light weight way in Hive SQL to filter the xxdatetime or eventdate columns by 'today - 10 days'?
Eventdate is my indexed column, date_sub is (string, days to subtract) and current_date is the current Hive yyyy mm dd.
I'm not sure if this is super light weight but it works!