I have enabled logs in the XML file yarn-site.xml, and I restarted YARN by doing:
sudo service hadoop-yarn-resourcemanager restart
sudo service hadoop-yarn-nodemanager restart
I ran my application, and then I see the application ID in yarn application -list. So, I do this:
yarn logs -applicationId <application ID>
and I get the following:
hdfs://<ip address>/var/log/hadoop-yarn/path/to/application/ does not have any log files
Do I need to change some other configuration? Or am I accessing the logs the wrong way?
Thank you.

yarn application -list will list only the applications that are in the SUBMITTED, ACCEPTED, or RUNNING state.
Log aggregation collects each container's logs and moves them to the directory configured in yarn.nodemanager.remote-app-log-dir only after the application has completed. Refer to the description of the yarn.log-aggregation-enable property.
So the applicationId listed by the command has not completed yet, and its logs have not been collected. That explains the response you get when trying to access the logs of a running application.
You can try the same command
yarn logs -applicationId <application ID>
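to view the logs once the application has completed.
To list all the FINISHED applications, or all applications regardless of state, filter the output of yarn application -list by application state using its -appStates option.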
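The two listings can be sketched as small shell functions, assuming the -appStates filter of yarn application -list. The function names list_finished_apps and list_all_apps are hypothetical and only there for illustration:

```shell
# Hypothetical helper names; the yarn commands inside are the actual point.

list_finished_apps() {
    # Only applications that have already completed
    yarn application -list -appStates FINISHED
}

list_all_apps() {
    # Every application, regardless of its current state
    yarn application -list -appStates ALL
}
```

Running list_finished_apps should show the application once it has completed, at which point yarn logs can be tried again.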

Enable Log Aggregation
Log aggregation is enabled in the yarn-site.xml file. The yarn.log-aggregation-enable property enables log aggregation for running applications.

It was probably saved with another appOwner. You can try to specify the application owner in your command:
yarn logs -appOwner .. -applicationId ..
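As a rough sketch, the same lookup can be wrapped so the owner is optional. The name fetch_app_logs is hypothetical; when no owner is passed, yarn logs assumes the current user is the application owner:

```shell
# fetch_app_logs is a hypothetical helper name.
# Usage: fetch_app_logs <application ID> [application owner]
fetch_app_logs() {
    app_id="$1"
    app_owner="${2:-}"
    if [ -n "$app_owner" ]; then
        # Look up the logs aggregated under the given user
        yarn logs -applicationId "$app_id" -appOwner "$app_owner"
    else
        # No owner given: yarn logs assumes the current user
        yarn logs -applicationId "$app_id"
    fi
}
```

If the plain call reports no log files, retrying with the user that actually submitted the job is a cheap check.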

In version 2.3.2 of HDP (Hortonworks Data Platform) and higher, you can get log aggregation to occur hourly on running jobs by configuring a rolling aggregation interval in yarn-site.xml.
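A minimal sketch of such a setting, assuming the property meant here is yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds from yarn-default.xml, with 3600 seconds giving the hourly interval:

```xml
<!-- Assumed property: how often NodeManagers upload logs of running applications -->
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```

The NodeManagers would need a restart to pick this up, as with the restart shown in the question.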
See this for further details: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_yarn_resource_mgt/content/ref-375ff479-e530-46d8-9f96-8b52dadb5183.1.html