I've had little luck searching for this over a couple days.
If my avro schema for data in a hive table is:
{
"type" : "record",
"name" : "messages",
"namespace" : "com.company.messages",
"fields" : [ {
"name" : "timeStamp",
"type" : "long",
"logicalType" : "timestamp-millis"
}, {
…
and I use presto to query this, I do not get formatted timestamps.
select "timestamp", typeof("timestamp") as type,
current_timestamp as "current_timestamp", typeof(current_timestamp) as current_type
from db.messages limit 1
timestamp type current_timestamp current_type 1497210701839 bigint 2017-06-14 09:32:43.098 Asia/Seoul timestamp with time zone
I thought it would be a non-issue then to convert them to timestamps with millisecond precision, but I'm finding I have no clear way to do that.
select cast("timestamp" as timestamp) from db.messages limit 1
line 1:16: Cannot cast bigint to timestamp
Also they've changed presto's timestamp casting to always assume the source is in seconds. https://issues.apache.org/jira/browse/HIVE-3454
So if I used from_unixtime()
I have to chop off the milliseconds or else it gives me a very distant date:
select from_unixtime("timestamp") as "timestamp" from db.messages limit 1
timestamp +49414-08-06 07:15:35.000
Surely someone else who works with Presto more often knows how to express the conversion properly. (I can't restart the Presto nor Hive servers to force the timezone into UTC either btw).
I didn't find direct conversion from Java timestamp (number of milliseconds since 1970) to timestamp, but one can be done with
to_unixtime
and adding milliseconds as interval:(admittedly cumbersome, but works)
select from_unixtime(cast(event_time as bigint) / 1000000) + parse_duration(cast((cast(event_time as bigint) % 1000) as varchar) || 'ms') from TableName limit 10;