How to specify KeyValueTextInputFormat Separator i

Posted 2019-01-16 20:57

In the new API (org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat), how do I specify a separator (delimiter) other than tab (the default) to split each line into key and value?

Sample Input :

one,first line
two,second line

Output Required :

Key : one
Value : first line
Key : two
Value : second line

I am specifying KeyValueTextInputFormat as follows:

    Job job = new Job(conf, "Sample");

    job.setInputFormatClass(KeyValueTextInputFormat.class);
    KeyValueTextInputFormat.addInputPath(job, new Path("/home/input.txt"));

This works fine when tab is the separator.

7 Answers
三岁会撩人
#2 · 2019-01-16 21:36

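Set the following in the driver code, before the Job is created (the Job takes its own copy of the Configuration, so changes made to conf afterwards are not picked up):

    conf.set("key.value.separator.in.input.line", ",");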
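To make the effect of the separator setting concrete, here is a minimal plain-Java sketch of how I understand the record reader behind KeyValueTextInputFormat to split each line. It does not use Hadoop at all: the class name SeparatorSketch and the method splitKeyValue are hypothetical names made up for this illustration, and it works on Java strings where Hadoop works on the raw bytes of the line. The assumptions are that the line is split at the first occurrence of the separator only, that only the first character of the configured string is used, and that a line without a separator becomes the key with an empty value.

```java
// Hypothetical illustration only; this is not Hadoop code.
final class SeparatorSketch {

    /** Returns {key, value} for one input line, split at the first separator. */
    public static String[] splitKeyValue(String line, String separatorSetting) {
        // Assumption: only the first character of the configured value is used.
        char separator = separatorSetting.charAt(0);
        int pos = line.indexOf(separator);

        if (pos == -1) {
            // No separator in the line: the whole line is the key, the value is empty.
            return new String[] { line, "" };
        }
        // Everything before the first separator is the key, the rest is the value.
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        String[] sampleInput = { "one,first line", "two,second line" };
        for (String line : sampleInput) {
            String[] kv = splitKeyValue(line, ",");
            System.out.println("Key : " + kv[0]);
            System.out.println("Value : " + kv[1]);
        }
    }
}
```

Run against the sample input from the question, this prints the key/value pairs in the required form. One consequence of splitting at the first separator is that a line such as "a,b,c" gives key "a" and value "b,c". On Hadoop 2.x and later the property is, to my knowledge, named mapreduce.input.keyvaluelinerecordreader.key.value.separator, with the older name above kept as a deprecated alias; check this against the version you run.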