I am using Confluent to import data from Kafka into Hive, trying to do the same thing as this: Bucket records based on time (kafka-hdfs-connector).
My sink config is like this:
{
  "name": "yangfeiran_hive_sink_9",
  "config": {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "topics": "peoplet_people_1000",
    "name": "yangfeiran_hive_sink_9",
    "tasks.max": "1",
    "hdfs.url": "hdfs://master:8020",
    "flush.size": "3",
    "partitioner.class": "io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner",
    "partition.duration.ms": "300000",
    "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH/'minute'=mm/",
    "locale": "en",
    "logs.dir": "/tmp/yangfeiran",
    "topics.dir": "/tmp/yangfeiran",
    "hive.integration": "true",
    "hive.metastore.uris": "thrift://master:9083",
    "schema.compatibility": "BACKWARD",
    "hive.database": "yangfeiran",
    "timezone": "UTC"
  }
}
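In case it matters, this is roughly how I submit the config to the Connect worker's REST API (a minimal sketch; the worker address http://localhost:8083 and the config file name are assumptions, not my exact setup):

import json
import requests

# Load the connector config shown above (the file name is hypothetical).
with open("yangfeiran_hive_sink_9.json") as f:
    connector = json.load(f)

# POST the whole {"name": ..., "config": ...} document to the Connect REST API.
# The worker is assumed to listen on the default port 8083.
resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    json=connector,
)
resp.raise_for_status()
print(resp.json())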
Everything works fine: I can see that the data is in HDFS and that the table is created in Hive, except when I use "select * from yang" to check whether the data is already queryable in Hive.
It prints the error:
FAILED: SemanticException Unable to determine if hdfs://master:8020/tmp/yangfeiran/peoplet_people_1000 is encrypted: java.lang.IllegalArgumentException: Wrong FS: hdfs://master:8020/tmp/yangfeiran/peoplet_people_1000, expected: hdfs://nsstargate
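For reference, the location that the metastore recorded for the table can be inspected with something like the following rough sketch (it uses PyHive; the HiveServer2 host and port, and the assumption that the connector names the table after the topic, are mine and not verified):

from pyhive import hive

# Connect to HiveServer2 (host/port are assumptions for this sketch).
conn = hive.Connection(host="master", port=10000, database="yangfeiran")
cur = conn.cursor()

# The HDFS sink's Hive integration creates one table per topic;
# the table name here is assumed to match the topic.
cur.execute("DESCRIBE FORMATTED peoplet_people_1000")
for row in cur.fetchall():
    # The 'Location:' line shows which filesystem URI the metastore stored,
    # e.g. hdfs://master:8020/... versus the expected hdfs://nsstargate/...
    if row[0] and "Location" in str(row[0]):
        print(row)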
How can I solve this problem?
Feiran