PipeMapRed.waitOutputThreads(): subprocess failed

Published 2019-07-22 15:08

Question:

I am getting the error below when I run a MapReduce job. My input is a 26 MB SequenceFile.

16/02/28 17:51:22 INFO mapreduce.Job: Task Id : attempt_1456551797554_0004_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)       
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

My stderr output is:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.ipc.Server).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Note: The mapper ran perfectly fine when I executed it locally, but it gives this error when I run the job on Hadoop. The only difference is that locally I used a file in my home directory, while in the Hadoop job I used the same file at an HDFS location.

I have included '#!/usr/bin/env python' at the top of my mapper.py.
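
For context, my mapper follows the usual Streaming pattern of reading records from stdin and writing tab-separated key/value pairs to stdout. The sketch below is only a simplified placeholder to show that structure, not my actual mapper code (the key/value logic here is made up):

#!/usr/bin/env python
# Minimal Hadoop Streaming mapper sketch: read records line by line from
# stdin and emit tab-separated key/value pairs on stdout.
# Placeholder logic only -- the real mapper does more work per record.
import sys

for line in sys.stdin:
    line = line.strip()
    if not line:
        continue
    # Placeholder: emit the first field as the key and a count of 1.
    key = line.split('\t')[0]
    sys.stdout.write('%s\t%s\n' % (key, 1))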