Can't read data in Presto - can in Hive

Posted 2019-06-24 15:40

I have a Hive database in which I created an external table over Parquet files:

CREATE EXTERNAL TABLE `default.table`(
  `date` date,
  `udid` string,
  `message_token` string)
PARTITIONED BY (
  `dt` date)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3://Bucket/Folder'

I added partitions to this table, but I can't query the data from Presto.
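A minimal sketch of how a partition is typically registered on an external table like this one, assuming a Hive-style `dt=<value>` folder layout under the table location. The partition value and the S3 sub-path are hypothetical placeholders, not values from the original setup:

```sql
-- Hypothetical example: the partition value and S3 sub-path are placeholders,
-- assuming a dt=<value> folder layout under s3://Bucket/Folder.
ALTER TABLE `default`.`table` ADD IF NOT EXISTS
  PARTITION (dt = '2019-06-01')
  LOCATION 's3://Bucket/Folder/dt=2019-06-01';

-- Show the storage location Hive has recorded for that partition.
DESCRIBE FORMATTED `default`.`table` PARTITION (dt = '2019-06-01');
```

Comparing the recorded partition location with the actual S3 layout is one way to check whether both engines are pointed at the same files.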

In Hive: I can see the partitions when using "Show partitions from default.table", and I get the number of rows when using "Select count(*) from default.table".

In Presto: I can see the partitions when using "Show partitions from default.table", but when I query the data itself it looks like there is no data: "select *" returns an empty result and "select count(*)" returns 0.

The cluster runs on AWS EMR, release emr-5.9.0, with Hive 2.3.0 and Presto 0.184 on r3.2xlarge instances.

Does someone know why I get these differences between Hive and Presto? Thanks!
