Exception while executing hadoop job remotely

Posted 2019-05-11 15:09

I am trying to execute a Hadoop job on a remote Hadoop cluster. Below is my code.

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://server:9000/");
conf.set("hadoop.job.ugi", "username");

Job job = new Job(conf, "Percentil Ranking");
job.setJarByClass(PercentileDriver.class);
job.setMapperClass(PercentileMapper.class);
job.setReducerClass(PercentileReducer.class);
job.setMapOutputKeyClass(TestKey.class);
job.setMapOutputValueClass(TestData.class);
job.setOutputKeyClass(TestKey.class);
job.setOutputValueClass(BaselineData.class);

job.setOutputFormatClass(SequenceFileOutputFormat.class);

FileInputFormat.addInputPath(job, new Path(inputPath));

FileOutputFormat.setOutputPath(job, new Path(outputPath));

job.waitForCompletion(true);

As soon as the job starts executing, an exception is thrown, before the map phase even begins.

java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:617)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1216)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1197)
at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:92)
at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:373)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:800)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)

The input file does exist and is a comma-separated text file. I am able to execute the job on the Hadoop cluster itself using the hadoop jar command with the same input and output paths, but I cannot run it remotely, even though other jobs run remotely without problems.

Can anyone tell me the solution to this problem?

2 Answers
祖国的老花朵
Answer #2 · 2019-05-11 15:38

It seems conf.set("mapred.job.tracker", "server:9001"); fixed the issue. Thanks for your help.
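For context, the stack trace contains LocalJobRunner, which indicates the job was being submitted to the local runner on the client machine instead of the remote cluster; setting mapred.job.tracker is what routes submission to the remote JobTracker. A minimal sketch of the combined client-side configuration, assuming the hostname and ports from the question:

```java
// Client-side configuration for submitting to a remote Hadoop 1.x cluster.
// "server", 9000, and 9001 are placeholders taken from the question;
// substitute the actual NameNode and JobTracker addresses.
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://server:9000/");  // remote NameNode (HDFS)
conf.set("mapred.job.tracker", "server:9001");       // remote JobTracker (MapReduce)

Job job = new Job(conf, "Percentil Ranking");
```

Without the mapred.job.tracker entry, the client falls back to the local framework, which is why the failure happened before the map phase ever started.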

时光不老,我们不散
Answer #3 · 2019-05-11 15:52

You do this:

conf.set("fs.default.name", "serverurl");

So you are setting the filesystem to the value "serverurl"... which is meaningless.

I'm pretty sure that it works when you simply remove that line from your code.
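If the property is kept rather than removed, it must be a full filesystem URI with a scheme and port, not a bare string. A sketch, assuming the NameNode from the question listens on port 9000:

```java
// Valid: a complete HDFS URI, so the client can resolve the remote filesystem.
conf.set("fs.default.name", "hdfs://server:9000/");

// Invalid: a bare string like "serverurl" is not a filesystem URI,
// and the client cannot connect to a remote NameNode from it.
// conf.set("fs.default.name", "serverurl");
```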

HTH
