How to record created_at and updated_at timestamps

Posted 2019-01-28 22:11

MySQL can automatically record created_at and updated_at timestamps. Does Hive provide similar mechanisms? If not, what would be the best way to achieve this functionality?

Tags: sql hive hiveql

1 answer

你好瞎i

Reply #2 · 2019-01-28 22:58

Hive does not provide such a mechanism. You can achieve this by calling built-in UDFs in your SELECT: from_unixtime(unix_timestamp()) as created_at. Note that this expression is evaluated in each mapper or reducer, so different rows may get different values. If you need the same value for the whole dataset (for Hive versions before 1.2.0), pass a variable to the script and use it inside as: '${hiveconf:created_at}' as created_at
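A minimal sketch of the variable approach, assuming a script named load.hql and hypothetical tables source_table and target_table with columns id and name (none of these names come from the question). The timestamp is computed once outside Hive and substituted as text before the query runs, so every row gets the same value:

```sql
-- load.hql (hypothetical file name)
-- Assumed invocation from a shell, computing the timestamp once:
--   hive --hiveconf created_at="2019-01-28 22:58:00" -f load.hql

INSERT OVERWRITE TABLE target_table        -- hypothetical target table
SELECT
  s.id,
  s.name,
  -- ${hiveconf:created_at} is replaced with the literal string passed
  -- on the command line, so the value is identical in every mapper
  -- and reducer.
  CAST('${hiveconf:created_at}' AS TIMESTAMP) AS created_at
FROM source_table s;                       -- hypothetical source table
```

The CAST is only needed if the target column is a TIMESTAMP; for a STRING column the quoted variable can be used directly, as in the answer.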

Update: as of Hive 1.2.0, current_timestamp returns the current timestamp at the start of query evaluation, and all calls of current_timestamp within the same query return the same value. The documentation notes that unix_timestamp() gets the current Unix timestamp in seconds, is non-deterministic, prevents proper optimization of queries, and has been deprecated since 2.0 in favour of the CURRENT_TIMESTAMP constant. So CURRENT_TIMESTAMP is not a function evaluated per row; within a query it is a constant. See the docs: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

For Hive queries, CURRENT_TIMESTAMP is preferable when you rewrite tables or partitions, or insert into them, because Hive rewrites whole files rather than individual records, so the created_at timestamp should be the same for every row written by that query.
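As an illustration for Hive 1.2.0 or later, the sketch below stamps every row written by one query with the same value. The tables source_table and target_table, the columns id and name, and the load_date partition are hypothetical and only serve the example:

```sql
-- Requires Hive 1.2.0+, where current_timestamp is fixed at the
-- start of query evaluation.
INSERT OVERWRITE TABLE target_table PARTITION (load_date = '2019-01-28')
SELECT
  s.id,
  s.name,
  -- Same value for all rows, regardless of which mapper or reducer
  -- produces them.
  current_timestamp AS created_at
FROM source_table s;
```

Because the whole partition is rewritten, created_at here records when the partition was loaded, not when an individual record first appeared.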
