Cloudera 5.6: Parquet does not support date. See H

Posted 2020-08-22 07:09

Question:

I am currently using Cloudera 5.6 and trying to create a Parquet-format table in Hive based on another table, but I am running into an error.

CREATE TABLE sfdc_opportunities_sandbox_parquet
LIKE sfdc_opportunities_sandbox
STORED AS PARQUET;

Error Message

Parquet does not support date. See HIVE-6384

I read that Hive 1.2 has a fix for this issue, but Cloudera 5.6 and 5.7 do not come with Hive 1.2. Has anyone found a way around this issue?

Answer 1:

Apart from using another data type such as TIMESTAMP, or another storage format such as ORC, there may be no workaround as long as you are tied to this Hive version and to the Parquet storage format.
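If switching the data type is acceptable, one possible approach is to build the Parquet table with CREATE TABLE ... AS SELECT and cast each DATE column to TIMESTAMP on the way in. This is only a sketch: the column names opportunity_id and close_date are hypothetical, since the schema of sfdc_opportunities_sandbox is not shown in the question, and every real column would have to be listed explicitly.

```sql
-- Sketch of the TIMESTAMP workaround for Hive versions without DATE support in Parquet.
-- ASSUMPTION: opportunity_id and close_date are hypothetical column names;
-- replace them with the actual columns of sfdc_opportunities_sandbox.
CREATE TABLE sfdc_opportunities_sandbox_parquet
STORED AS PARQUET
AS
SELECT
  opportunity_id,
  -- DATE is rejected by the Parquet SerDe here, so store it as TIMESTAMP (midnight of that day)
  CAST(close_date AS TIMESTAMP) AS close_date
FROM sfdc_opportunities_sandbox;

-- When reading, cast back to DATE if the consumer expects one.
SELECT
  opportunity_id,
  CAST(close_date AS DATE) AS close_date
FROM sfdc_opportunities_sandbox_parquet;
```

Casting to STRING instead of TIMESTAMP would also sidestep the error and keeps the yyyy-MM-dd text form, at the cost of losing the temporal type in the stored schema.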

According to Cloudera's CDH 5 Packaging and Tarball Information, the entire CDH 5 branch ships with Apache Parquet 1.5.0 and Apache Hive 1.1.0.

Support for DATE was added to ParquetSerde by HIVE-8119, and it is available as of Hive 1.2.