Spark UI showing 0 cores even when setting cores in the application


Question:

I am having a strange issue running an application against the Spark master URL: the UI reports a STATE of WAITING indefinitely, and 0 cores show up under the RUNNING APPLICATIONS table no matter what I configure the core count to be.

I've configured my app with the following settings: spark.cores.max = 2, spark.deploy.defaultCores = 2, and memory set to 3 GB. The machine is an enterprise-class server with over 24 cores.

    SparkConf conf = new SparkConf()
            .setAppName(Properties.getString("SparkAppName"))
            .setMaster(Properties.getString("SparkMasterUrl"))
            .set("spark.executor.memory", Properties.getString("SparkExecMem"))
            .set("spark.cores.max", Properties.getString("SparkCores"))
            .set("spark.driver.memory", Properties.getString("SparkDriverMem"))
            .set("spark.eventLog.enabled", "true")
            .set("spark.deploy.defaultCores", Properties.getString("SparkDefaultCores"));

    // Set Spark context
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(5000));

The Spark Web UI shows zero cores used and an indefinite wait with no tasks running. The application also uses no memory or cores whatsoever at runtime and goes to a status of WAITING immediately after starting.

spark-defaults.conf
spark.yarn.max_executor.failures         3
spark.yarn.applicationMaster.waitTries   10
spark.history.kerberos.keytab    none
spark.yarn.preserve.staging.files        False
spark.yarn.submit.file.replication       3
spark.history.kerberos.principal         none
spark.yarn.historyServer.address         {removed}.{removed}.com:18080
spark.yarn.scheduler.heartbeat.interval-ms       5000
spark.yarn.queue         default
spark.yarn.containerLauncherMaxThreads   25
spark.yarn.driver.memoryOverhead         384
spark.history.ui.port    18080
spark.yarn.services      org.apache.spark.deploy.yarn.history.YarnHistoryService
spark.yarn.max.executor.failures         3
spark.driver.extraJavaOptions     -Dhdp.version=2.2.6.0-2800
spark.history.provider   org.apache.spark.deploy.yarn.history.YarnHistoryProvider
spark.yarn.am.extraJavaOptions    -Dhdp.version=2.2.6.0-2800
spark.yarn.executor.memoryOverhead       384

Submit script

spark-submit --class {removed}.{removed}.{removed}.sentiment.MainApp --deploy-mode client /path/to/jar

EDITED 2/3/2016: After running with --master yarn-cluster, I am receiving the error below in the YARN logs. I have also included my updated submit configuration.

Submit Configuration

spark-submit --class com.removed.removed.sentiment.MainApp \
--master yarn-cluster --supervise \
/data04/dev/removed/spark/twitternpi/npi.sentiment-1.0-SNAPSHOT-shaded.jar \
--jars /usr/hdp/2.2.6.0-2800/spark/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.2.6.0-2800/spark/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.2.6.0-2800/spark/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/2.2.6.0-2800/spark/lib/spark-1.2.1.2.2.6.0-2800-yarn-shuffle.jar,/usr/hdp/2.2.6.0-2800/spark/lib/spark-assembly-1.2.1.2.2.6.0-2800-hadoop2.6.0.2.2.6.0-2800.jar

Error Message

   ClassLoaderResolver for class "" gave error on creation : {1}
org.datanucleus.exceptions.NucleusUserException: ClassLoaderResolver for class "" gave error on creation : {1}
    at org.datanucleus.NucleusContext.getClassLoaderResolver(NucleusContext.java:1087)
    at org.datanucleus.PersistenceConfiguration.validatePropertyValue(PersistenceConfiguration.java:797)
    at org.datanucleus.PersistenceConfiguration.setProperty(PersistenceConfiguration.java:714)
    at org.datanucleus.PersistenceConfiguration.setPersistenceProperties(PersistenceConfiguration.java:693)
    at org.datanucleus.NucleusContext.<init>(NucleusContext.java:273)
    at org.datanucleus.NucleusContext.<init>(NucleusContext.java:247)
    at org.datanucleus.NucleusContext.<init>(NucleusContext.java:225)

Answer 1:

I ran into this problem when the executor memory requested via spark.executor.memory in spark-defaults.conf was larger than the memory available on the AWS node. But since you set only 3.0 GB as your memory, I think there may be other causes in your case.
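A quick sanity check along these lines is to make sure the memory each executor asks for, plus the YARN memory overhead, actually fits on a single node. A minimal sketch (the class name and the concrete values are illustrative assumptions, not taken from the question):

    import org.apache.spark.SparkConf;

    public class MemoryFitSketch {
        public static void main(String[] args) {
            // Keep executor memory plus YARN overhead below what a single worker
            // node can actually grant; "2g" and "384" are illustrative values.
            SparkConf conf = new SparkConf()
                    .setAppName("memory-fit-sketch")                   // hypothetical app name
                    .set("spark.executor.memory", "2g")                // heap requested per executor
                    .set("spark.yarn.executor.memoryOverhead", "384"); // off-heap overhead added on top
            System.out.println(conf.toDebugString());
        }
    }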



Answer 2:

If you're running on YARN, you need to tell your application to use YARN. Add --master yarn-cluster to your spark-submit command:

spark-submit --class your_class --master yarn-cluster /path/to/jar

EDIT:

spark.cores.max applies only to Mesos and standalone mode. Try setting this instead:

.set("spark.executor.cores","2")

and at submit time add this to the spark-submit command:

--num-executors=2

I am curious, though, as it should default to 1 core per executor. Are the worker nodes registered with YARN for Spark? Have you successfully used Spark at all on this cluster in yarn-client or yarn-cluster mode?
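Putting these suggestions together, a minimal driver-side sketch for YARN might look like the following; spark.executor.cores and spark.executor.memory are real Spark settings, but the concrete values and the class name are assumptions for illustration:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class YarnCoresSketch {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("yarn-cores-sketch")     // hypothetical app name
                    .set("spark.executor.cores", "2")    // cores per executor on YARN
                    .set("spark.executor.memory", "3g"); // matches the 3 GB mentioned in the question
            // No setMaster() here: pass --master yarn-cluster and --num-executors 2
            // on the spark-submit command line instead, as suggested above.
            JavaSparkContext sc = new JavaSparkContext(conf);
            sc.stop();
        }
    }

Submitted with something like spark-submit --class YarnCoresSketch --master yarn-cluster --num-executors 2 app.jar, the application should request two executors with two cores each.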



Answer 3:

Please check the maximum number of cores that can be allocated to containers in the YARN configuration (yarn-site.xml). In enterprise clusters, YARN queues are also sometimes set up to distribute resources evenly across projects, which can limit what a single application receives.
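The relevant yarn-site.xml properties are yarn.scheduler.maximum-allocation-vcores (the per-container cap) and yarn.nodemanager.resource.cpu-vcores (what each NodeManager advertises). A minimal sketch that prints them using Hadoop's YarnConfiguration, assuming yarn-site.xml is on the classpath; the class name is hypothetical:

    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class YarnVcoreCheck {
        public static void main(String[] args) {
            // YarnConfiguration loads yarn-default.xml and yarn-site.xml from the classpath.
            YarnConfiguration yarnConf = new YarnConfiguration();
            // Cap on vcores a single container (e.g. a Spark executor) may request.
            System.out.println("yarn.scheduler.maximum-allocation-vcores = "
                    + yarnConf.getInt("yarn.scheduler.maximum-allocation-vcores", -1));
            // Vcores each NodeManager makes available for containers.
            System.out.println("yarn.nodemanager.resource.cpu-vcores = "
                    + yarnConf.getInt("yarn.nodemanager.resource.cpu-vcores", -1));
        }
    }

If these limits, or the queue's share, are smaller than what the job requests, the application can sit in an accepted/waiting state without ever being granted cores.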