I'm new to Spark. I can now run Spark 0.9.1 on YARN (2.0.0-cdh4.2.1), but there are no logs after execution.
I use the following command to run a Spark example, but the logs are not found in the history server as they would be for a normal MapReduce job.
SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-0.9.1-hadoop2.0.0-cdh4.2.1.jar \
./bin/spark-class org.apache.spark.deploy.yarn.Client --jar ./spark-example-1.0.0.jar \
--class SimpleApp --args yarn-standalone --num-workers 3 --master-memory 1g \
--worker-memory 1g --worker-cores 1
Where can I find the logs/stderr/stdout?
Is there somewhere to set the configuration? I did find console output saying:
14/04/14 18:51:52 INFO Client: Command for the ApplicationMaster: $JAVA_HOME/bin/java -server -Xmx640m -Djava.io.tmpdir=$PWD/tmp org.apache.spark.deploy.yarn.ApplicationMaster --class SimpleApp --jar ./spark-example-1.0.0.jar --args 'yarn-standalone' --worker-memory 1024 --worker-cores 1 --num-workers 3 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
In this line, notice 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
Where can LOG_DIR be set?
None of the answers makes it crystal clear where to look for the logs (although they do in pieces), so I am putting it together.
If log aggregation is turned on (with yarn.log-aggregation-enable in yarn-site.xml), then do this:
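For example (the application id is illustrative; use the one printed when you submit the job or shown in the ResourceManager UI):
yarn logs -applicationId application_1474886780074_XXXX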
However, if this is not turned on, then one needs to go to the Data-Node machine and look at the local container log directory there:
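A typical location (assuming yarn.nodemanager.log-dirs is left at its default, which usually resolves to $HADOOP_HOME/logs/userlogs) is:
$HADOOP_HOME/logs/userlogs/application_1474886780074_XXXX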
where application_1474886780074_XXXX is the application id.
A good reference for this question is the official "Running Spark on YARN" documentation; see the section "Debugging your Application". It gives a decent explanation with all the required examples.
The only thing you need to do to get a correctly working history server for Spark is to close the Spark context in your application. Otherwise, the history server does not see your application as COMPLETE and does not show anything for it (the history UI is accessible, but your application is not visible there).
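A minimal sketch, assuming a Scala application like the SimpleApp from the question (the class name and job body are illustrative):

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SimpleApp"))
    try {
      // ... actual job logic; a trivial action as a placeholder
      println(sc.parallelize(1 to 100).count())
    } finally {
      // Stopping the context lets the history server mark the application as COMPLETE
      sc.stop()
    }
  }
}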
You can access logs through the command
yarn logs -applicationId <application ID> [OPTIONS]
General options are:
appOwner <Application Owner> - AppOwner (assumed to be current user if not specified)
containerId <Container ID> - ContainerId (must be specified if node address is specified)
nodeAddress <Node Address> - NodeAddress in the format nodename:port (must be specified if container id is specified)
Examples:
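For instance (the application id and user name below are illustrative):
yarn logs -applicationId application_1474886780074_0001
yarn logs -applicationId application_1474886780074_0001 -appOwner someuser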
It logs to the container log directory on each node.
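A typical layout (the exact path depends on your yarn.nodemanager.log-dirs setting; the structure below is an assumption based on the default NodeManager log layout) is:
<yarn.nodemanager.log-dirs>/<application id>/<container id>/stdout
<yarn.nodemanager.log-dirs>/<application id>/<container id>/stderr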
The logs are on every node that your Spark job runs on.