Question:
I'm running Spark on a YARN cluster with log4j.properties configured so that all logs go to a log file by default. For some Spark jobs, however, I want the logs to go to the console without changing the log4j file or the code of the job itself. What is the best way to achieve this? Thanks, all.
Answer 1:
I know of at least 4 ways to solve this problem.
1. Modify the log4j.properties file on your Spark machines.
2. When you submit the job, pass a log4j file to spark-submit as configuration, for example:
bin/spark-submit --class com.viaplay.log4jtest.log4jtest --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/Users/feng/SparkLog4j/SparkLog4jTest/target/log4j2.properties" --master local[*] /Users/feng/SparkLog4j/SparkLog4jTest/target/SparkLog4jTest-1.0-jar-with-dependencies.jar
3. Import log4j in your application code:
import org.apache.log4j.Logger; import org.apache.log4j.Level;
Then set the logger levels in the code where you create your SparkContext:
Logger.getLogger("org").setLevel(Level.INFO); Logger.getLogger("akka").setLevel(Level.INFO);
4. If you use spark.sql.SparkSession, set the log level through its SparkContext:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
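For the approach of passing a separate log4j file at submit time, that file could be a console-only configuration, which leaves both the cluster's log4j.properties and the job code untouched. The following is a minimal sketch that writes such a file. The file name log4j-console.properties is hypothetical, and the properties use log4j 1.x syntax, so this assumes a Spark build that still uses log4j 1.x (newer builds based on Log4j 2 need a different format).

```shell
# Write a console-only log4j 1.x configuration.
# The file name is hypothetical; the appender settings follow the layout of
# Spark's own log4j.properties.template for log4j 1.x.
cat > log4j-console.properties <<'EOF'
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
EOF
```

The resulting file can then be referenced from -Dlog4j.configuration in the same way as the file in the spark-submit example above.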
Answer 2:
Per the documentation, you can upload a custom log4j.properties with spark-submit by adding it to the --files list of files to be uploaded with the application.
I just tried this with a log4j.properties file on a YARN cluster and it works just fine.
spark-submit --class com.foo.Bar \
--master yarn-cluster \
--files path_to_my_log4j.properties \
my.jar
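If the uploaded file has a name other than log4j.properties, one common variant is to also point the driver and executor JVMs at it through spark.driver.extraJavaOptions and spark.executor.extraJavaOptions. The sketch below assumes a hypothetical file log4j-console.properties in the current directory, log4j 1.x, and YARN cluster mode, where --files places the file in each container's working directory. It only prints the command by default. Note that in cluster mode "console" output ends up in the YARN container logs rather than in your terminal.

```shell
# Hypothetical file name; --files ships it to every YARN container.
LOG4J_FILE="log4j-console.properties"
JAVA_OPT="-Dlog4j.configuration=${LOG4J_FILE}"

# Dry run by default: print the command instead of submitting it.
# Set DRY_RUN=0 in the environment to actually call spark-submit.
DRY_RUN="${DRY_RUN:-1}"
if [ "$DRY_RUN" = "1" ]; then RUN="echo"; else RUN=""; fi

# "--master yarn --deploy-mode cluster" is the newer spelling of yarn-cluster.
$RUN spark-submit --class com.foo.Bar \
  --master yarn \
  --deploy-mode cluster \
  --files "$LOG4J_FILE" \
  --conf "spark.driver.extraJavaOptions=$JAVA_OPT" \
  --conf "spark.executor.extraJavaOptions=$JAVA_OPT" \
  my.jar
```

In client mode the driver runs on the submitting machine, so the driver option would likely need a file: path to a local copy of the file instead of the bare file name.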