Part of Filename as a column in Hive Table

2019-08-10 20:08发布

I want to get the first part of my filename as a column in my Hive Table

My filename is : 20151102114450.46400_Always_1446482638967.xml

I wrote a query (below query) using regex in Hive of Microsoft Azure to get the first part of it i.e., 20151102114450

But when I run query I am getting the output as 20151102164358

select CAST(regexp_replace(regexp_replace(regexp_replace(CAST(CAST(regexp_replace(split(INPUT__FILE__NAME,'[_]')[2],'.xml','') AS BIGINT) as TimeStamp),':',''),'-',''),' ','') AS BIGINT) as VERSION

Can anyone tell me where I am going wrong and what needs to be corrected ?

1条回答
ゆ 、 Hurt°
2楼-- · 2019-08-10 21:12

I tried this in Cloudera, hopefully it should work in Azure as well.

select from_unixtime(unix_timestamp(regexp_extract('20151102114450.46400_Always_1446482638967.xml','^(.*?)\\.'),'yyyyMMddHHmmss'),'yyyy-MMM-dd HH:mm:ss');

2015-Nov-02 11:44:50
Time taken: 19.644 seconds, Fetched: 1 row(s)

Another option:

select from_unixtime(unix_timestamp(split('20151102114450.46400_Always_1446482638967.xml','\\.')[0],'yyyyMMddHHmmss'),'yyyy-MMM-dd HH:mm:ss')
查看更多
登录 后发表回答