Specify timezone in ToDate(unix) in Pig

Posted 2020-03-30 08:13

My dataset contains dates as Unix timestamps, and I want to convert them to datetimes in Apache Pig. I can use the ToDate() function for this, as described here. However, I know my Unix timestamps are in GMT/UTC, while converting with ToDate() produces datetimes in my local timezone. I don't see how to specify the timezone in this function when converting from a Unix timestamp. I don't want to adjust the datetime manually after conversion, because that is a huge pain with daylight saving time. Hopefully someone has a good suggestion; any help is appreciated.

Here is an example:

ToString( ToDate( (long)'1417145524000'), 'yyyy-MM-dd hh:mm:ss' )

results in (2014-11-28 04:32:04), which is the time in CET; however, I want (2014-11-28 03:32:04), the time in GMT.

2 answers
不美不萌又怎样
Reply #2 · 2020-03-30 08:37

Just a side note: Pig's ToDate uses offset-only timezones, that is, a fixed number of hours ahead of or behind GMT, not geographical ones. This can get you into trouble with daylight saving time. Consider computing time differences in hours since midnight: 2015-03-29 03:00:00+0200 minus 2015-03-29 00:00:00+0100 is 4 hours, but 2015-03-29 03:00:00+0200 (Europe/Prague) minus 2015-03-29 00:00:00+0100 (Europe/Prague) is 3 hours.

With Pig's ToDate, you can only achieve the former behavior.

We Are One
Reply #3 · 2020-03-30 08:38

This is what you're looking for:

ToDate(userstring, format, timezone)

https://pig.apache.org/docs/r0.11.1/func.html#to-date

Timezone strings: http://joda-time.sourceforge.net/timezones.html

After Edwin's comment:

In this specific case you can do this:

ToDate(ToString(ToDate((long) ts), 'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss', 'timezone')
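
A different route, not taken from either answer, is to change the default timezone Pig uses when it builds datetimes. The sketch below assumes a Pig version that honors the pig.datetime.default.tz property (it appears, commented out, in the stock pig.properties file). The input file timestamps.txt and the relation and field names are hypothetical.

```pig
-- Assumption: this property sets the zone ToDate() uses for datetimes
-- created from a Unix timestamp. It accepts a Joda timezone id or an offset.
SET pig.datetime.default.tz 'UTC';

-- Hypothetical input: one Unix timestamp in milliseconds per line.
events = LOAD 'timestamps.txt' AS (ts:long);

-- HH is the 24-hour clock; hh would print 12-hour values without AM/PM.
formatted = FOREACH events GENERATE
    ts,
    ToString(ToDate(ts), 'yyyy-MM-dd HH:mm:ss') AS utc_time;

DUMP formatted;
```

If the property takes effect, the timestamp from the question (1417145524000) should come out as 2014-11-28 03:32:04, since that instant is 03:32:04 UTC. Nothing has to be adjusted by hand for daylight saving time, because UTC has none.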