How to override Spark's log4j.properties per driver?

Posted 2019-02-02 08:50

I'm trying to override Spark's default log4j.properties, but haven't had any luck. I tried adding the following to spark-submit:

--conf "spark.executor.extraJavaOptions=Dlog4j.configuration=/tmp/log4j.properties"  
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=/tmp/log4j.properties"

But that didn't seem to work. I also tried the --files option in spark-submit, and that didn't seem to work either. Has anyone set up logging so that each driver uses its own log4j.properties file rather than the default?

I'm using Mesos and Marathon to run the Spark driver. I wasn't sure about the --files option, and I couldn't find any examples of how it's used or what exactly it does.
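For reference, this is a sketch of the kind of spark-submit invocation I was attempting with --files; the master URL, jar, and class name below are placeholders, not my real job:

```shell
# Sketch only: ship a local log4j.properties alongside the job and point
# the executor JVMs at it. Master URL, class, and jar are placeholders.
spark-submit \
  --master mesos://master:5050 \
  --files /tmp/log4j.properties \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.MyJob \
  my-job.jar
```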

I should also mention that, for testing, I manually uploaded my modified log4j.properties file to all of my nodes.

The Spark version is 1.1.0 as of right now.

5 Answers
淡お忘 · answered 2019-02-02 09:08

For the driver/shell you can set this with the --driver-java-options when running spark-shell or spark-submit scripts.

In Spark you cannot set driver JVM options with --conf spark.driver.extraJavaOptions, because that configuration is read after the driver JVM has already started. When you use the spark-submit scripts, --driver-java-options substitutes these options into the launch of the JVM that runs the driver (e.g. java -Dblah MyClass).

Note that the -Dlog4j.configuration property should be a valid URL, so if the file lives somewhere on your file system, use a file: URL. If the resource variable cannot be converted to a URL, for example due to a MalformedURLException, then log4j will search for the resource on the classpath.

For example, to use a custom log4j.properties file:

./spark-shell --driver-java-options "-Dlog4j.configuration=file:///etc/spark/my-conf/log4j.warnonly.properties"
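If you don't already have a "warn only" properties file, the following is a minimal sketch of what one might contain, based on the standard log4j 1.x configuration syntax (the path and appender name are just examples):

```shell
# Write an example warn-only log4j 1.x config to a temp path.
# The path and the "console" appender name are illustrative choices.
cat > /tmp/log4j.warnonly.properties <<'EOF'
# Only WARN and above reach the root logger
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
EOF
```

You would then point -Dlog4j.configuration at that path with a file: URL as shown above.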
萌系小妹纸 · answered 2019-02-02 09:16

I could not get either

--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=/tmp/log4j.properties"

or

--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///tmp/log4j.properties"

to work.

The only thing that works for me is --driver-java-options.

淡お忘 · answered 2019-02-02 09:19

There are multiple ways to achieve this; which one is best depends on your application's needs:

  • By providing extra Java options to the Spark driver and executors, with log4j.properties present at the same path on every node of the cluster (or on the local machine if you're running the job locally), use the command below:

    spark-submit --master local[2] --conf 'spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/tmp/log4j.properties' --conf 'spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/tmp/log4j.properties' --class com.test.spark.application.TestSparkJob target/application-0.0.1-SNAPSHOT-jar-with-dependencies.jar prod

    If log4j.properties is present at the root of your jar's classpath, you can skip the file: prefix, like below:

    --conf 'spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties' --conf 'spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties'

  • By shipping your log4j.properties file to YARN and providing extra Java options to the Spark driver and executors. This way log4j.properties does not need to be present on every node; YARN manages its distribution. Use the command below:

    spark-submit --master local[2] --files /tmp/log4j.properties --conf 'spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties' --conf 'spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties' --class com.test.spark.application.TestSparkJob target/application-0.0.1-SNAPSHOT-jar-with-dependencies.jar prod

  • By changing the Spark conf, i.e. Spark's default log4j.properties file:

    change or update log4j.properties at /etc/spark/conf.dist/log4j.properties

I have tried all of these and they worked for me. I would also suggest going through the "Debugging your Application" section of the Spark docs, which is really helpful: https://spark.apache.org/docs/latest/running-on-yarn.html

SAY GOODBYE · answered 2019-02-02 09:23

I don't believe the spark.driver.extraJavaOptions parameter exists. For spark.executor.extraJavaOptions, it appears you have a typo: the - is missing before Dlog4j. Try this:

--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=/tmp/log4j.properties"
小情绪 Triste * · answered 2019-02-02 09:25

Just a couple of details are off.

The conf flags should look like this:
--conf spark.executor.extraJavaOptions="-Dlog4j.configuration=log4j.properties" --conf spark.driver.extraJavaOptions="-Dlog4j.configuration=/tmp/log4j.properties" --files /tmp/log4j.properties

You'll also need the --files param to upload the log4j.properties file to the cluster, where executors can get to it. Also, the configs as written above assume you're using client mode; in cluster mode both configs would use the same relative path: -Dlog4j.configuration=log4j.properties
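Putting those pieces together, a cluster-mode submission might look like the sketch below; the class name and jar are placeholders, and the key point is that both driver and executors use the bare file name, which resolves relative to the working directory where --files places it:

```shell
# Sketch of a cluster-mode submit: ship the file, reference it by
# relative name in both driver and executor JVM options.
# Class and jar names are placeholders.
spark-submit \
  --deploy-mode cluster \
  --files /tmp/log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.MyApp \
  my-app.jar
```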

P.S. If your logging overrides also require additional dependencies, you may need to provide them as well, e.g. --conf spark.driver.extraClassPath=custom-log4j-appender.jar. See: custom-log4j-appender-in-spark-executor

Good luck
