Yarn mini-cluster container log directories don

Published 2019-07-16 08:56

Question:

I have set up a YARN MapReduce mini-cluster with 1 node manager, 4 local and 4 log directories, and so on, based on Hadoop 2.3.0 from CDH 5.1.0. It is more or less working. What I failed to achieve is syslog logging from containers. I see the container log directories and the stdout and stderr files, but no syslog with the MapReduce container logging. The corresponding stderr warns that log4j is not configured and contains nothing else:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
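For context, here is a minimal sketch of how such a mini-cluster is typically constructed; this is assumed test-harness code I am adding for illustration, not taken from my project, with the MiniYARNCluster constructor arguments (1 node manager, 4 local and 4 log directories) mirroring the setup described above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

public class MiniClusterSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new YarnConfiguration();

        // 1 node manager, 4 local directories, 4 log directories,
        // matching the setup described in the question.
        MiniYARNCluster cluster = new MiniYARNCluster("mr-logging-test", 1, 4, 4);
        cluster.init(conf);
        cluster.start();

        // ... submit MapReduce jobs against cluster.getConfig() here;
        // container stdout/stderr/syslog should appear under the log dirs ...

        cluster.stop();
    }
}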

How can I add normal logging to my containers? To repeat, this is a YARN mini-cluster.

Any advice or useful pointers?

Just to reduce the number of already-tried approaches offered as answers:

  • Yes, I'm sure the logging directories are correct, and I can see the correlation between the container log directories and my applications.
  • Yes, MapReduce jobs work. At least those that are expected to work.
  • The mini-cluster's own logging works normally and in accordance with my setup. The problem concerns only the containers.
  • Lower layers like the DFS cluster work normally. I even have HBase and ZK mini-clusters here and they work OK. I just need logging to debug MapReduce jobs.

Answer 1:

OK, it finally turned out to be about the classpath, client configuration, and packaging.

  1. The client configuration SHALL include a proper classpath for YARN applications. In my case I added the following lines to yarn-site.xml (please note the $HADOOP_COMMON_HOME substitution; a programmatic equivalent is sketched after this list):
<property>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*</value>
</property>
  2. I added the following variable definition to the mini-cluster start-up script (it is worth noting that I keep all mini-cluster server-side JARs in ./lib relative to the mini-cluster startup script):

BASE_PATH=$(pwd)
export HADOOP_COMMON_HOME=${BASE_PATH}
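For a mini-cluster that is configured in test code rather than through an on-disk yarn-site.xml, the same property from point 1 can be set programmatically before the cluster is initialized. A minimal sketch I am adding here, assuming the standard YarnConfiguration constant for the key name:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ApplicationClasspathConfig {
    // Equivalent of the yarn-site.xml snippet from point 1, applied to the
    // Configuration that is passed to the mini-cluster's init().
    public static Configuration withApplicationClasspath() {
        Configuration conf = new YarnConfiguration();
        conf.set(YarnConfiguration.YARN_APPLICATION_CLASSPATH,   // "yarn.application.classpath"
                "$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*");
        return conf;
    }
}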

The root cause of the broken logging was that the client MapReduce job is started inside a new JVM on YARN without any knowledge of where to locate hadoop-yarn-server-nodemanager.jar, which contains the container-log4j.properties file, which in turn provides the default container logging configuration. Now everything works fine.
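A quick way to check whether that default logging configuration is actually reachable is to resolve it as a classpath resource. This is only a sanity check I am adding here, and it assumes it is run with the same ./lib JARs that $HADOOP_COMMON_HOME points to on its classpath:

import java.net.URL;

public class ContainerLog4jCheck {
    public static void main(String[] args) {
        // container-log4j.properties ships inside hadoop-yarn-server-nodemanager.jar.
        // If this prints null, a container JVM started from the same classpath will
        // fall back to the "No appenders could be found" warnings shown above.
        URL res = ContainerLog4jCheck.class.getClassLoader()
                .getResource("container-log4j.properties");
        System.out.println("container-log4j.properties resolved to: " + res);
    }
}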