Confluent Kafka HDFS connector and Hive

Posted 2019-09-06 03:13

Question:

I am using Confluent to import data from Kafka into Hive, and I am trying to do the same thing as in this question: Bucket records based on time (kafka-hdfs-connector).

My sink config looks like this:

{
    "name":"yangfeiran_hive_sink_9",
    "config":{
        "connector.class":"io.confluent.connect.hdfs.HdfsSinkConnector",
        "topics":"peoplet_people_1000",
        "name":"yangfeiran_hive_sink_9",
        "tasks.max":"1",
        "hdfs.url":"hdfs://master:8020",
        "flush.size":"3",
        "partitioner.class":"io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner",
        "partition.duration.ms":"300000",
        "path.format":"'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH/'minute'=mm/",
        "locale":"en",
        "logs.dir":"/tmp/yangfeiran",
        "topics.dir":"/tmp/yangfeiran",
        "hive.integration":"true",
        "hive.metastore.uris":"thrift://master:9083",
        "schema.compatibility":"BACKWARD",
        "hive.database":"yangfeiran",
        "timezone":"UTC"
    }
}
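For reference, here is a rough sketch of the bucketing this config asks for: a timestamp is rounded down to a 5-minute boundary ("partition.duration.ms" = 300000) and then rendered with the "path.format" pattern in UTC. The function name partition_path is hypothetical, and the logic is only an approximation of what TimeBasedPartitioner does, not the connector's actual code.

```python
from datetime import datetime, timezone


def partition_path(ts_ms: int, duration_ms: int = 300_000) -> str:
    """Approximate the directory suffix for a timestamp in epoch milliseconds.

    Hypothetical helper: mirrors the config above (5-minute buckets, UTC),
    not the connector's real implementation.
    """
    # Round the timestamp down to the start of its bucket.
    bucket_ms = ts_ms - ts_ms % duration_ms
    t = datetime.fromtimestamp(bucket_ms // 1000, tz=timezone.utc)
    # Same layout as "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH/'minute'=mm/".
    return f"year={t:%Y}/month={t:%m}/day={t:%d}/hour={t:%H}/minute={t:%M}/"


sample = int(datetime(2019, 9, 6, 3, 13, tzinfo=timezone.utc).timestamp() * 1000)
print(partition_path(sample))
```

With this config, a record handled at 03:13 UTC would be expected to land under the minute=10 directory, because 03:13 falls in the bucket that starts at 03:10.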

Everything else works fine: I can see the data in HDFS and the table is created in Hive. The problem only appears when I run "select * from yang" to check whether the data is available in Hive.

It prints the error:

FAILED: SemanticException Unable to determine if hdfs://master:8020/tmp/yangfeiran/peoplet_people_1000 is encrypted: java.lang.IllegalArgumentException: Wrong FS: hdfs://master:8020/tmp/yangfeiran/peoplet_people_1000, expected: hdfs://nsstargate
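The message reads like a filesystem mismatch: the table location was registered under hdfs://master:8020 (the "hdfs.url" value), while Hive seems to expect the filesystem hdfs://nsstargate. A simplified sketch of that kind of check, comparing only scheme and authority, is shown here. The function name same_filesystem is hypothetical, and Hadoop's real check handles more cases (default ports, for example).

```python
from urllib.parse import urlparse


def same_filesystem(path_uri: str, fs_uri: str) -> bool:
    """Return True when both URIs share the same scheme and authority.

    Hypothetical, simplified stand-in for the path check that produces
    the "Wrong FS" error; not Hadoop's actual implementation.
    """
    p, f = urlparse(path_uri), urlparse(fs_uri)
    return (p.scheme.lower(), p.netloc.lower()) == (f.scheme.lower(), f.netloc.lower())


# Both values are copied from the error message above.
table_location = "hdfs://master:8020/tmp/yangfeiran/peoplet_people_1000"
expected_fs = "hdfs://nsstargate"

print(same_filesystem(table_location, expected_fs))  # prints False
```

If that reading is right, one thing to try (an assumption, not verified against this cluster) is setting "hdfs.url" to hdfs://nsstargate and pointing the connector's "hadoop.conf.dir" at the Hadoop configuration directory so the nameservice can be resolved.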

How can I solve this problem?

Feiran