Running Hadoop MR jobs without Admin privilege on

Posted 2020-07-26 13:45

Question:

I have installed Hadoop 2.3.0 on Windows and can run MR jobs successfully. But when I try to run MR jobs with normal privileges (without admin privileges), the job fails with the following exception. Here I tried a sample Pig script.

2014-10-15 12:02:32,822 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:kaveen (auth:SIMPLE) cause:java.io.IOException: Split class org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit not found
2014-10-15 12:02:32,823 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Split class org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit not found
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:362)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:403)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassNotFoundException: Class org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1794)
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:360)
    ... 7 more

2014-10-15 12:02:32,827 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2014-10-15 12:02:32,827 WARN [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Output Path is null in abortTask()

Update:

I drilled down into the problem and found that the exception is raised in the following method, "MapTask.getSplitDetails(MapTask.java:363)".

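private <T> T getSplitDetails(Path file, long offset)
    throws IOException {
  // Open the split file and seek to this task's split record.
  FileSystem fs = file.getFileSystem(conf);
  FSDataInputStream inFile = fs.open(file);
  inFile.seek(offset);
  // The record starts with the name of the split class.
  String className = StringInterner.weakIntern(Text.readString(inFile));
  Class<T> cls;
  try {
    // Resolve the class through the job configuration's class loader.
    cls = (Class<T>) conf.getClassByName(className);
  } catch (ClassNotFoundException ce) {
    // This is the IOException reported in the task log above.
    IOException wrap = new IOException("Split class " + className +
                                        " not found");
    wrap.initCause(ce);
    throw wrap;
  }
  // ... rest of the method omitted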
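The wrap-and-rethrow pattern in this excerpt can be reproduced outside Hadoop to check whether a given class is visible on a particular classpath. The following is only a minimal sketch: the class `SplitClassLookup` and its methods are hypothetical names, and it uses the plain JDK `Class.forName` as a stand-in for Hadoop's `Configuration.getClassByName`, so it does not reproduce how the task container builds its classpath.

```java
import java.io.IOException;

public class SplitClassLookup {

    // Stand-in for the lookup in the excerpt: resolve a class by name and
    // wrap a ClassNotFoundException in an IOException with the same message.
    static Class<?> loadSplitClass(String className) throws IOException {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException ce) {
            throw new IOException("Split class " + className + " not found", ce);
        }
    }

    // Convenience wrapper: true if the class can be resolved on this classpath.
    static boolean isLoadable(String className) {
        try {
            loadSplitClass(className);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the log; pass another name as an argument.
        String name = args.length > 0
                ? args[0]
                : "org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit";
        System.out.println(name + " loadable: " + isLoadable(name));
        // Printing the classpath shows which jars this JVM can actually see.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Running it with the same classpath as the failing process would show whether the Pig jar is visible to that JVM at all. If it is not, the cause is likely that the jar never reached the task's working directory, rather than a fault in the lookup code itself.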

But if I start "NodeManager" with admin privileges, the above exception does not occur. I don't know why the MR job fails when I start "NodeManager" with normal privileges.

If anyone knows the reason and a solution for the above problem, please guide me as soon as possible.

Answer 1:

You can change the location of Hadoop's tmp directory using the property below:

<property>
   <name>hadoop.tmp.dir</name>
   <value>/other/tmp</value>
</property>

Your default tmp location is c:\tmp, which requires admin privileges to access. Change the location to a subdirectory you can access without admin privileges and run the MR job again.
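Before editing the configuration (`hadoop.tmp.dir` is normally set in `core-site.xml`), it may help to confirm that the candidate directory is really writable by the non-admin account that starts NodeManager. The following is a minimal sketch using only the JDK; the class `TmpDirCheck` and the method `canWriteTo` are hypothetical names, and the check only tests file creation, not every permission Hadoop might need.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TmpDirCheck {

    // Returns true if the current user can create and delete a file in dir.
    static boolean canWriteTo(String dir) {
        Path path = Paths.get(dir);
        if (!Files.isDirectory(path)) {
            return false; // missing, or not a directory
        }
        try {
            // Create a throwaway file to test real write access, then remove it.
            Path probe = Files.createTempFile(path, "probe", ".tmp");
            Files.delete(probe);
            return true;
        } catch (IOException | SecurityException e) {
            return false; // access denied or another I/O failure
        }
    }

    public static void main(String[] args) {
        // Pass the directory intended for hadoop.tmp.dir as the first argument;
        // without one, the JVM's own temp directory is checked instead.
        String dir = args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir");
        System.out.println(dir + " writable: " + canWriteTo(dir));
    }
}
```

Run it from a normal (non-elevated) command prompt with the directory you plan to use as the argument. If it prints `false`, that directory would likely cause the same failure as the default location.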

Hope it helps.