My dataset stores dates as Unix timestamps, and I want to convert them to datetimes in Apache Pig. I can use the ToDate() function for this, as described in the documentation. However, I know my Unix timestamps are in GMT / UTC, while converting with ToDate() gives a result in my local timezone. I don't see how to specify the timezone in this function when converting from a Unix timestamp. I don't want to adjust the datetime manually after conversion, because that is a huge pain with daylight saving time. Hopefully someone has a good suggestion; any help will be appreciated.
Here is an example:
ToString( ToDate( (long)'1417145524000'), 'yyyy-MM-dd hh:mm:ss' )
This results in (2014-11-28 04:32:04), which is the time in CET; however, I want (2014-11-28 03:32:04), which is the time in GMT.
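As a cross-check outside Pig, the following minimal Python sketch converts the same millisecond timestamp with the standard library only. It assumes CET can be modelled as a fixed +01:00 offset, which holds for a date in late November; the variable names are illustrative and not part of Pig.

```python
from datetime import datetime, timedelta, timezone

# Timestamp from the question, in milliseconds since the Unix epoch.
millis = 1417145524000

# Interpret the timestamp as an instant in UTC.
utc_time = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)

# Assumption: CET is a fixed +01:00 offset (true in November, no DST).
cet_time = utc_time.astimezone(timezone(timedelta(hours=1)))

FORMAT = "%Y-%m-%d %H:%M:%S"  # 24-hour clock
utc_text = utc_time.strftime(FORMAT)
cet_text = cet_time.strftime(FORMAT)

print(utc_text)  # 2014-11-28 03:32:04
print(cet_text)  # 2014-11-28 04:32:04
```

The two strings differ by exactly one hour, which is the shift the question wants to avoid.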
Just a side note: Pig's ToDate uses offset-only timezones (a fixed number of hours ahead of or behind GMT), not geographical ones. This can cause trouble around daylight saving time changes. Consider computing the time elapsed since midnight, in hours:
2015-03-29 03:00:00+0100 minus 2015-03-29 00:00:00+0100 (fixed offset) is 3 hours
but 2015-03-29 03:00:00+0200 (Europe/Prague) minus 2015-03-29 00:00:00+0100 (Europe/Prague) is 2 hours, because the clocks jump from 02:00 to 03:00 that night.
With Pig's ToDate, you can only achieve the former behavior.
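The arithmetic behind this example can be checked with a small Python sketch (not Pig). It uses only fixed offsets from the standard library and assumes, as is the case for Europe/Prague on 2015-03-29, that the offset is +0100 before the switch and +0200 after it. The names are illustrative.

```python
from datetime import datetime, timedelta, timezone

CET = timezone(timedelta(hours=1))   # +0100, offset before the DST switch
CEST = timezone(timedelta(hours=2))  # +0200, offset after the DST switch

midnight = datetime(2015, 3, 29, 0, 0, 0, tzinfo=CET)

# Offset-only reading: 03:00 is still interpreted at +0100.
three_fixed = datetime(2015, 3, 29, 3, 0, 0, tzinfo=CET)
fixed_offset_hours = (three_fixed - midnight) / timedelta(hours=1)

# DST-aware reading: by 03:00 local time the offset is already +0200.
three_dst = datetime(2015, 3, 29, 3, 0, 0, tzinfo=CEST)
dst_aware_hours = (three_dst - midnight) / timedelta(hours=1)

print(fixed_offset_hours)  # 3.0
print(dst_aware_hours)     # 2.0
```

Subtracting two aware datetimes compares the underlying instants, so the one-hour gap between the two readings comes entirely from the offset applied to the 03:00 wall-clock time.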
This is what you're looking for:
https://pig.apache.org/docs/r0.11.1/func.html#to-date
Timezone strings: http://joda-time.sourceforge.net/timezones.html
After Edwin's comment:
In this specific case you can do this: