YARN Application exited with exitCode: -1000 Not a

Posted 2020-07-11 04:50

Question:

I am getting:

Application application_1427711869990_0001 failed 2 times due to AM Container for appattempt_1427711869990_0001_000002 exited with exitCode: -1000 due to: Not able to initialize user directories in any of the configured local directories for user kailash
.Failing this attempt.. Failing the application.

I couldn't find anything about this exit code or the associated reason. I am using Hadoop 2.5.0 (Cloudera 5.3.2).

Answer 1:

This is caused by permission problems on some of the YARN local directories. I had switched to LinuxContainerExecutor (in non-secure mode, with nonsecure-mode.local-user set to kailash) and made the corresponding changes. For some unknown reason, however, the NodeManager failed to clean up the per-user local directories, so directories owned by the previous user (yarn, in my case) were still there.

To fix this, I first looked up the value of the property yarn.nodemanager.local-dirs (with Cloudera, use the search option to find this property for the YARN service; otherwise look in yarn-site.xml in the Hadoop conf directory), and then deleted the files and directories under usercache on all the NodeManager nodes. In my case, I used:

rm -rf /yarn/nm/usercache/*
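
Before deleting anything, it can help to see which usercache entries are actually owned by the wrong user. The following is a minimal Python sketch of such a dry run, not part of the original answer: the function names `usercache_entries` and `entries_not_owned_by` are hypothetical, and it assumes you pass in the comma-separated value of yarn.nodemanager.local-dirs and the owner you expect (for example kailash).

```python
import pwd
from pathlib import Path


def usercache_entries(local_dirs):
    """Yield (path, owner_name) for every entry under <dir>/usercache.

    `local_dirs` is the comma-separated value of yarn.nodemanager.local-dirs.
    """
    for raw in local_dirs.split(","):
        raw = raw.strip()
        if not raw:
            continue  # ignore empty items such as a trailing comma
        usercache = Path(raw) / "usercache"
        if not usercache.is_dir():
            continue
        for entry in sorted(usercache.iterdir()):
            uid = entry.lstat().st_uid
            try:
                owner = pwd.getpwuid(uid).pw_name
            except KeyError:
                owner = str(uid)  # uid has no passwd entry on this node
            yield entry, owner


def entries_not_owned_by(local_dirs, expected_owner):
    """Return the usercache entries whose owner differs from expected_owner."""
    return [(p, o) for p, o in usercache_entries(local_dirs) if o != expected_owner]


# Example (run on each NodeManager node; nothing is deleted):
# for path, owner in entries_not_owned_by("/yarn/nm", "kailash"):
#     print(owner, path)
```

This only reports; the actual cleanup is still the rm -rf command above, run on the nodes that show leftover entries.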


Answer 2:

In case someone can't find the usercache location: if yarn.nodemanager.local-dirs is not configured anywhere, look in the default location, ${hadoop.tmp.dir}/nm-local-dir. If hadoop.tmp.dir is not configured in core-site.xml either, it defaults to /tmp/hadoop-${user.name}, where user.name is the UNIX user running the current Hadoop process. All the configuration files are under $HADOOP_INSTALL/etc/hadoop/ by default.
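
The fallback chain described here can be sketched in Python as below. This is an illustration only, under a few assumptions: the helper names `read_property` and `nm_local_dirs` are hypothetical, the configuration directory holds plain yarn-site.xml and core-site.xml files, and only the two variables mentioned above are expanded (Hadoop's own variable substitution is more general). On a Cloudera Manager cluster the effective configuration may live elsewhere, so treat the result as a hint.

```python
import getpass
import xml.etree.ElementTree as ET
from pathlib import Path


def read_property(site_xml, name):
    """Return the <value> of property `name` in a Hadoop *-site.xml, or None."""
    path = Path(site_xml)
    if not path.is_file():
        return None
    for prop in ET.parse(str(path)).getroot().iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value else None
    return None


def nm_local_dirs(conf_dir, user=None):
    """Resolve the NodeManager local dirs using the defaults described above."""
    conf_dir = Path(conf_dir)
    user = user or getpass.getuser()
    # hadoop.tmp.dir falls back to /tmp/hadoop-${user.name}
    tmp_dir = (read_property(conf_dir / "core-site.xml", "hadoop.tmp.dir")
               or "/tmp/hadoop-${user.name}")
    # yarn.nodemanager.local-dirs falls back to ${hadoop.tmp.dir}/nm-local-dir
    local_dirs = (read_property(conf_dir / "yarn-site.xml", "yarn.nodemanager.local-dirs")
                  or "${hadoop.tmp.dir}/nm-local-dir")
    resolved = local_dirs.replace("${hadoop.tmp.dir}", tmp_dir)
    resolved = resolved.replace("${user.name}", user)
    return [d.strip() for d in resolved.split(",") if d.strip()]


# Example: usercache would then be <dir>/usercache for each returned dir.
# print(nm_local_dirs("/path/to/hadoop/etc/hadoop", user="yarn"))
```

Pass as `user` the UNIX account that runs the NodeManager process, since that is the value ${user.name} takes on the node.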



Answer 3:

You need to run this command (the path comes from my configuration):

rm -rf /dn/yarn/nm/usercache/*

Check your own configuration: in the YARN (MR2 Included) service, look at NodeManager Local Directories.

http://i.imgur.com/BHwhUnB.jpg

Apply this on the data nodes for which YARN reported the error.

Here is an example from my case:

http://i.imgur.com/miNx454.jpg

The ApplicationMaster reported C90BFH04.localdomain:8042, which is data node 4, so I cleaned only the YARN directory on node 4.

After that, everything worked.



Answer 4:

When I tested spark-submit on YARN in cluster mode:

spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi /usr/local/install/spark-2.2.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.0.jar 100

I got the same exitCode: -1000 error:

Application application_1532249549503_0007 failed 2 times due to AM Container for appattempt_1532249549503_0007_000002 exited with exitCode: -1000 Failing this attempt.Diagnostics: java.io.IOException: Resource file:/usr/local/install/spark-2.2.0-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.2.0.jar changed on src filesystem (expected 1531576498000, was 1531576511000

Finally, I fixed the error by setting the property fs.defaultFS in $HADOOP_HOME/etc/hadoop/core-site.xml.
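
The file: scheme in the diagnostic above suggests the jar was resolved against the local filesystem, and Hadoop's built-in default for fs.defaultFS is file:///. As a rough check, the Python sketch below reads core-site.xml and reports whether the default filesystem is still local. The function names `default_fs` and `uses_local_filesystem` are hypothetical, and the sketch assumes a plain core-site.xml with no overrides from other sources.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

HADOOP_DEFAULT_FS = "file:///"  # value Hadoop uses when fs.defaultFS is unset


def default_fs(core_site_xml):
    """Return fs.defaultFS from core-site.xml, or Hadoop's default if unset."""
    try:
        root = ET.parse(core_site_xml).getroot()
    except FileNotFoundError:
        return HADOOP_DEFAULT_FS
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.defaultFS":
            value = (prop.findtext("value") or "").strip()
            if value:
                return value
    return HADOOP_DEFAULT_FS


def uses_local_filesystem(core_site_xml):
    """True if the default filesystem resolves to the local file: scheme."""
    return urlparse(default_fs(core_site_xml)).scheme == "file"


# Example:
# print(default_fs("/path/to/hadoop/etc/hadoop/core-site.xml"))
```

If this reports a local default on the machine where spark-submit runs, that is consistent with the fix described in this answer; the right value for fs.defaultFS depends on your cluster's NameNode address.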



Tags: Cloudera yarn