Override Spark log4j configurations

Posted 2019-08-08 08:37

Question:

I'm running Spark on a YARN cluster with log4j.properties configured so that all logs go to a log file by default. However, for some Spark jobs I want the logs to go to the console, without changing the log4j file or the code of the actual job. What is the best way to achieve this? Thanks, all.

Answer 1:

I know of at least 4 ways to solve this problem.

  1. Modify the log4j.properties file on your Spark machines.

  2. When you submit the job, pass a log4j file to Spark as configuration. For example:

    bin/spark-submit --class com.viaplay.log4jtest.log4jtest \
      --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/Users/feng/SparkLog4j/SparkLog4jTest/target/log4j2.properties" \
      --master "local[*]" \
      /Users/feng/SparkLog4j/SparkLog4jTest/target/SparkLog4jTest-1.0-jar-with-dependencies.jar

  3. Import log4j in your application code and set the logger levels where you create your SparkContext:

    import org.apache.log4j.Logger;
    import org.apache.log4j.Level;

    Logger.getLogger("org").setLevel(Level.INFO);
    Logger.getLogger("akka").setLevel(Level.INFO);

  4. If you use Spark through org.apache.spark.sql.SparkSession, set the log level on its SparkContext:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().getOrCreate()
    spark.sparkContext.setLogLevel("ERROR")
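Of the four options, only the second leaves both the cluster's log4j file and the job code untouched, but as written it sets the option for the driver only. The sketch below also sets `spark.executor.extraJavaOptions`. It assumes log4j 1.x and a properties file that already exists at the same path on every node. The helper name `log4j_submit_cmd` and the file path are hypothetical, and the helper only prints the command so it can be reviewed first.

```shell
# Hypothetical helper: print a spark-submit command that overrides log4j for one job.
#   $1    path to a log4j properties file present on every node (assumption)
#   $2... the usual spark-submit arguments (class, master, jar, ...)
log4j_submit_cmd() {
  conf_file="$1"
  shift
  opt="-Dlog4j.configuration=file:${conf_file}"
  printf '%s %s %s %s\n' \
    "spark-submit" \
    "--conf \"spark.driver.extraJavaOptions=${opt}\"" \
    "--conf \"spark.executor.extraJavaOptions=${opt}\"" \
    "$*"
}

# Print the command for review; paste it into a shell to run it.
log4j_submit_cmd /path/to/console-log4j.properties --class com.foo.Bar my.jar
```

Passing the options per submission keeps the override local to that one job; other jobs keep using the cluster default.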



Answer 2:

Per the documentation: upload a custom log4j.properties using spark-submit, by adding it to the --files list of files to be uploaded with the application.

I just tried with a log4j.properties file on a Yarn cluster and it works just fine.

spark-submit --class com.foo.Bar \
  --master yarn-cluster \
  --files path_to_my_log4j.properties \
  my.jar
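To send the logs to the console, the uploaded file has to define a console appender. Here is a minimal sketch of such a file, assuming log4j 1.x and modeled on the console settings in Spark's bundled log4j.properties.template. The directory name `console-conf` is hypothetical; it only keeps this file apart from the cluster's default one.

```shell
# Keep the per-job config in its own directory (hypothetical name).
mkdir -p console-conf

# Write a console-only log4j 1.x configuration.
cat > console-conf/log4j.properties <<'EOF'
# Log everything at INFO and above to the console (stderr)
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
EOF
```

The file can then be passed to the command above as `--files console-conf/log4j.properties`. In yarn-cluster mode the driver runs inside a YARN container, so its console output ends up in that container's stderr log rather than in the terminal that ran spark-submit.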