Application report for application_ (state: ACCEPT

Posted 2019-01-13 15:06

I am running a Kinesis plus Spark application, following https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html

I submit it with the following command on an EC2 instance:

 ./spark/bin/spark-submit --class org.apache.spark.examples.streaming.myclassname --master yarn-cluster --num-executors 2 --driver-memory 1g --executor-memory 1g --executor-cores 1  /home/hadoop/test.jar 

I have installed Spark on EMR.

EMR details:
Master instance group - 1   Running   MASTER   m1.medium   1
Core instance group - 2     Running   CORE     m1.medium

I get the INFO output below, and it never ends.

15/06/14 11:33:23 INFO yarn.Client: Requesting a new application from cluster with 2 NodeManagers
15/06/14 11:33:23 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (2048 MB per container)
15/06/14 11:33:23 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/06/14 11:33:23 INFO yarn.Client: Setting up container launch context for our AM
15/06/14 11:33:23 INFO yarn.Client: Preparing resources for our AM container
15/06/14 11:33:24 INFO yarn.Client: Uploading resource file:/home/hadoop/.versions/spark-1.3.1.e/lib/spark-assembly-1.3.1-hadoop2.4.0.jar -> hdfs://172.31.13.68:9000/user/hadoop/.sparkStaging/application_1434263747091_0023/spark-assembly-1.3.1-hadoop2.4.0.jar
15/06/14 11:33:29 INFO yarn.Client: Uploading resource file:/home/hadoop/test.jar -> hdfs://172.31.13.68:9000/user/hadoop/.sparkStaging/application_1434263747091_0023/test.jar
15/06/14 11:33:31 INFO yarn.Client: Setting up the launch environment for our AM container
15/06/14 11:33:31 INFO spark.SecurityManager: Changing view acls to: hadoop
15/06/14 11:33:31 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/06/14 11:33:31 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/06/14 11:33:31 INFO yarn.Client: Submitting application 23 to ResourceManager
15/06/14 11:33:31 INFO impl.YarnClientImpl: Submitted application application_1434263747091_0023
15/06/14 11:33:32 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:32 INFO yarn.Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1434281611893
         final status: UNDEFINED
         tracking URL: http://172.31.13.68:9046/proxy/application_1434263747091_0023/
         user: hadoop
15/06/14 11:33:33 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:34 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:35 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:36 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:37 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:38 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:39 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:40 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)
15/06/14 11:33:41 INFO yarn.Client: Application report for application_1434263747091_0023 (state: ACCEPTED)

Could somebody please let me know why it's not working?

12 answers
Animai°情兽
#2 · 2019-01-13 15:15

There are three things we can try to fix this issue.

  1. Check for Spark processes on your machine and kill them.

Run

ps aux | grep spark

Take the process IDs of all the Spark processes and kill them, like

sudo kill -9 4567 7865

  2. Check the number of Spark applications running on your cluster.

To check this, run

yarn application -list

You will get output similar to this:

Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
                Application-Id      Application-Name        Application-Type          User       Queue               State         Final-State         Progress                        Tracking-URL
application_1496703976885_00567       ta da                SPARK        cloudera       default             RUNNING           UNDEFINED              20%             http://10.0.52.156:9090

Check the application IDs. If more than one or two applications are listed, kill the ones you do not need. I am not 100% sure about this, but in my experience a cluster starts complaining once more than two Spark applications run on it at the same time. To kill an application, run:

yarn application -kill application_1496703976885_00567

  3. Check your Spark config parameters. For example, if your Spark application asks for too much executor memory or driver memory, or for too many executors, that may also cause this issue. Reduce any of them and run your Spark application again; that might resolve it.
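As a rough check on the third point, the memory arithmetic can be sketched in Python. The 1024 MB heaps and the 384 MB overhead come from the question's command and log; the per-node memory figures, the function names, and the first-fit placement are assumptions made for illustration, and real YARN also rounds each request up to its minimum allocation size.

```python
def container_mb(heap_mb, overhead_mb=384):
    """Memory YARN must grant for one JVM: heap plus off-heap overhead."""
    return heap_mb + overhead_mb


def fits(node_mb_list, requests_mb):
    """Greedy first-fit placement of container requests onto nodes.

    Returns True if every request found a node with enough free memory.
    This is a simplification, not the actual YARN scheduling algorithm.
    """
    free = list(node_mb_list)
    for req in sorted(requests_mb, reverse=True):
        for i, avail in enumerate(free):
            if avail >= req:
                free[i] = avail - req
                break
        else:
            # No node had room for this container.
            return False
    return True


# One application master (driver in yarn-cluster mode) plus two executors,
# each with --driver-memory / --executor-memory of 1g.
requests = [container_mb(1024)] * 3

# Hypothetical node sizes: the log only reports a 2048 MB per-container maximum.
small_cluster = [2048, 2048]
larger_cluster = [4096, 4096]

print(fits(small_cluster, requests))
print(fits(larger_cluster, requests))
```

Under these assumed node sizes, two 2048 MB nodes can hold only two of the three 1408 MB containers, so lowering the memory settings or the executor count is a plausible thing to try.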
叛逆
#3 · 2019-01-13 15:19

I got this error in this situation:

  1. MASTER=yarn (or yarn-client)
  2. spark-submit runs on a computer outside of the cluster and there is no route from the cluster to it because it's hidden by a router

Logs for container_1453825604297_0001_02_000001 (from ResourceManager web UI):

16/01/26 08:30:38 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
16/01/26 08:31:41 ERROR yarn.ApplicationMaster: Failed to connect to driver at 192.168.1.180:33074, retrying ...
16/01/26 08:32:44 ERROR yarn.ApplicationMaster: Failed to connect to driver at 192.168.1.180:33074, retrying ...
16/01/26 08:32:45 ERROR yarn.ApplicationMaster: Uncaught exception: 
org.apache.spark.SparkException: Failed to connect to driver!
    at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:484) 

I worked around it by using yarn cluster mode: MASTER=yarn-cluster.

On another computer that is configured in a similar way, but whose IP is reachable from the cluster, both yarn-client and yarn-cluster work.

Others may encounter this error for different reasons. My point is that checking the error logs (which in this case are not shown in the terminal, only in the ResourceManager web UI) almost always helps.

男人必须洒脱
#4 · 2019-01-13 15:23

I had the same problem on a local Hadoop cluster with Spark 1.4 and YARN, trying to run spark-shell. It had more than enough resources.

What helped was running the same thing from an interactive LSF job on the cluster. So perhaps there were some network limitations on running YARN from the head node...

Anthone
#5 · 2019-01-13 15:27

I had this exact problem when multiple users were trying to run on our cluster at once. The fix was to change a scheduler setting.

In the file /etc/hadoop/conf/capacity-scheduler.xml we changed the property yarn.scheduler.capacity.maximum-am-resource-percent from 0.1 to 0.5.

Changing this setting increases the fraction of the resources that is made available to be allocated to application masters, increasing the number of masters possible to run at once and hence increasing the number of possible concurrent applications.
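To see why the change from 0.1 to 0.5 matters, here is a minimal Python sketch of the arithmetic. The 16384 MB cluster size is a hypothetical figure, the 1408 MB application master size is taken from the log in the question, and the function name is made up for illustration. The model ignores queue capacities, per-user limits, and allocation rounding, and it assumes the scheduler always lets at least one application master start.

```python
import math


def max_concurrent_ams(cluster_mb, am_percent, am_container_mb):
    """Estimate how many application masters fit in the AM memory budget.

    Simplified model: budget = cluster memory * maximum-am-resource-percent.
    Assumes at least one AM is always allowed to start.
    """
    am_budget_mb = cluster_mb * am_percent
    return max(1, math.floor(am_budget_mb / am_container_mb))


CLUSTER_MB = 16384  # hypothetical total YARN memory
AM_MB = 1408        # AM container size reported in the question's log

before = max_concurrent_ams(CLUSTER_MB, 0.1, AM_MB)
after = max_concurrent_ams(CLUSTER_MB, 0.5, AM_MB)
print(before, after)
```

With these assumed numbers the budget grows from about 1.6 GB to 8 GB, so the estimate goes from one concurrent application to five.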

我命由我不由天
#6 · 2019-01-13 15:35

I hit the same problem on an MS Azure HDInsight Spark cluster.
I finally found out that the issue was that the cluster couldn't talk back to the driver. I assume you used client mode when submitting the job, since you can provide this debug log.

The reason is that Spark executors have to talk to the driver program, and the TCP connection has to be bi-directional. So if your driver program is running in a VM (an EC2 instance) that is not reachable via hostname or IP (which you have to specify in the Spark conf; it defaults to the hostname), your status will stay ACCEPTED forever.
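One way to test the reachability described above is a plain TCP connection attempt, run from a cluster node against the driver's address. This is a minimal sketch, not part of Spark: the function name is made up, and the host and port in the usage comment are placeholders you would replace with the driver address shown in your own logs.

```python
import socket


def can_reach(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hostnames.
        return False


# Example (placeholder address, run on a cluster node):
# print(can_reach("driver-host.example", 33074))
```

A False result from the cluster side, while the same check succeeds from the driver machine itself, would point to the routing or firewall problem this answer describes.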

等我变得足够好
#7 · 2019-01-13 15:35

When running with yarn-cluster, all the application logging and stdout are located in the assigned YARN application master and do not appear in the spark-submit output. Also, because this is a streaming application, it usually does not exit. Check the Hadoop ResourceManager web interface and look at the Spark web UI and logs, which are available from the Hadoop UI.
