NullPointerException with MR2 on Windows

Posted 2019-09-07 19:47

Question:

I have installed Hadoop 2.3.0 on Windows and can run MR jobs successfully. But when I try the streaming sample in C# [with the HadoopSDK's .NET assemblies], the app ends with the following exception:

14/05/16 18:21:06 INFO mapreduce.Job: Task Id : attempt_1400239892040_0003_r_000000_0, Status : FAILED
Error: java.lang.NullPointerException
at org.apache.hadoop.mapred.Task.getFsStatistics(Task.java:347)
at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:478)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:414)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Update:

I drilled down into the problem and found that the exception is raised on the following line:

 matchedStats = getFsStatistics(FileOutputFormat.getOutputPath(job), job);

at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:478)

Here, 'FileOutputFormat.getOutputPath(job)' returns null, and passing that null to getFsStatistics() causes the NullPointerException. Below is the code of the getOutputPath() function.

public static final String OUTDIR = "mapreduce.output.fileoutputformat.outputdir";

public static Path getOutputPath(JobConf conf) {
  String name = conf.get(org.apache.hadoop.mapreduce.lib.output.
      FileOutputFormat.OUTDIR);
  return name == null ? null : new Path(name);
}
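
To make the failure mode concrete, here is a minimal sketch of the same lookup pattern. It is only a stand-in, not Hadoop code: a plain Map replaces JobConf and a String replaces Path so that it runs without Hadoop on the classpath. The class name, the requireOutputPath() helper and the "/tmp/out" value are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class OutputDirLookupSketch {
    // Same property key as the OUTDIR constant quoted above.
    public static final String OUTDIR = "mapreduce.output.fileoutputformat.outputdir";

    // Stand-in for getOutputPath(JobConf): returns null when the key is absent,
    // mirroring the "name == null ? null : new Path(name)" behaviour.
    public static String getOutputPath(Map<String, String> conf) {
        return conf.get(OUTDIR);
    }

    // Hypothetical guard: fail early with a readable message instead of
    // letting a null travel further and surface as a NullPointerException.
    public static String requireOutputPath(Map<String, String> conf) {
        String path = getOutputPath(conf);
        if (path == null) {
            throw new IllegalStateException(OUTDIR + " is not set in the job configuration");
        }
        return path;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Property not set: the lookup yields null.
        System.out.println(getOutputPath(conf));

        // Property set to a hypothetical directory: the guard passes.
        conf.put(OUTDIR, "/tmp/out");
        System.out.println(requireOutputPath(conf));
    }
}
```

A guard like this only makes the missing property visible earlier; it does not say why the property is missing in the task's configuration.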

So do I need to set a value for the property "mapreduce.output.fileoutputformat.outputdir" somewhere in the configuration files to resolve this issue?

Thanks

Answer 1:

The problem was that the Hadoop services had been started by a different user [SYSTEM in my case], while the MapReduce sample was submitted by my local user. As a result, the FileSystem statistics [for the local user] came back as null.

Once I started Hadoop as my local user, the issue was resolved.