I have a Spark job that reads its settings from a configuration file, which is a Typesafe Config file.
The code that reads the config looks like this:
ConfigFactory.load().getConfig("com.mycompany")
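For context, here is a simplified sketch of what my Main does (the key name some.setting is just a placeholder, not one of my real settings):

import com.typesafe.config.ConfigFactory

object Main {
  def main(args: Array[String]): Unit = {
    // Loads application.conf from the classpath (or the file named by
    // -Dconfig.file, if set) and descends into the "com.mycompany" block.
    val config = ConfigFactory.load().getConfig("com.mycompany")
    // "some.setting" stands in for whatever keys my job actually needs.
    val someSetting = config.getString("some.setting")
    println(s"some.setting = $someSetting")
  }
}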
I don't assemble application.conf into my uber jar, since I want to pass it in as an external file instead.
The content of the external application.conf I want to use looks like this:
com.mycompany {
  // configurations my program needs
}
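For illustration, with a concrete (made-up) setting it would look like this, which the snippet above would read via getString("some.setting"):

com.mycompany {
  some.setting = "value"
}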
This application.conf file lives on my local machine's file system (not on HDFS).
I'm using Spark 1.6.1 with YARN.
This is what my spark-submit command looks like:
LOG4J_FULL_PATH=/log4j-path
ROOT_DIR=/application.conf-path
/opt/deploy/spark/bin/spark-submit \
--class com.mycompany.Main \
--master yarn \
--deploy-mode cluster \
--files $ROOT_DIR/application.conf \
--files $LOG4J_FULL_PATH/log4j.xml \
--conf spark.executor.extraClassPath="-Dconfig.file=file:application.conf" \
--driver-class-path $ROOT_DIR/application.conf \
--verbose \
/opt/deploy/lal-ml.jar
The exception I receive is:
2016-11-09 12:32:14 ERROR ApplicationMaster:95 - User class threw exception: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'com'
com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'com'
at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:124)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:147)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:159)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:164)
at com.typesafe.config.impl.SimpleConfig.getObject(SimpleConfig.java:218)
at com.typesafe.config.impl.SimpleConfig.getConfig(SimpleConfig.java:224)
at com.typesafe.config.impl.SimpleConfig.getConfig(SimpleConfig.java:33)
at com.mycompany.Main$.main(Main.scala:36)
at com.mycompany.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
So my question is: does anybody know how I can load an external Typesafe application.conf file that sits on my local machine with spark-submit and YARN?
I tried following some of the solutions in "How to add a typesafe config file which is located on HDFS to spark-submit (cluster-mode)?", in "Typesafe Config in Spark", and also in "How to pass -D parameter or environment variable to Spark job?", but nothing worked.
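For reference, one of the variants I tried based on those answers (shipping the file with a single --files flag and pointing both the driver and executor JVMs at it via -Dconfig.file) also did not work:

/opt/deploy/spark/bin/spark-submit \
--class com.mycompany.Main \
--master yarn \
--deploy-mode cluster \
--files $ROOT_DIR/application.conf,$LOG4J_FULL_PATH/log4j.xml \
--conf spark.driver.extraJavaOptions="-Dconfig.file=application.conf" \
--conf spark.executor.extraJavaOptions="-Dconfig.file=application.conf" \
/opt/deploy/lal-ml.jar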
I'd appreciate any direction toward solving this.
Thanks in advance!