Question:
I'd like to stop the various messages that come up on the spark shell.
I tried to edit the log4j.properties file in order to stop these messages.
Here are the contents of log4j.properties:
# Define the root logger with appender file
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
But the messages are still displayed on the console.
Here are some example messages:
15/01/05 15:11:45 INFO SparkEnv: Registering BlockManagerMaster
15/01/05 15:11:45 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150105151145-b1ba
15/01/05 15:11:45 INFO MemoryStore: MemoryStore started with capacity 0.0 B.
15/01/05 15:11:45 INFO ConnectionManager: Bound socket to port 44728 with id = ConnectionManagerId(192.168.100.85,44728)
15/01/05 15:11:45 INFO BlockManagerMaster: Trying to register BlockManager
15/01/05 15:11:45 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 192.168.100.85:44728 with 0.0 B RAM
15/01/05 15:11:45 INFO BlockManagerMaster: Registered BlockManager
15/01/05 15:11:45 INFO HttpServer: Starting HTTP Server
15/01/05 15:11:45 INFO HttpBroadcast: Broadcast server star
How do I stop these?
Answer 1:
Edit your conf/log4j.properties file and change the following line:
log4j.rootCategory=INFO, console
to
log4j.rootCategory=ERROR, console
Another approach is to start spark-shell and type in the following:
import org.apache.log4j.Logger
import org.apache.log4j.Level
Logger.getLogger(\"org\").setLevel(Level.OFF)
Logger.getLogger(\"akka\").setLevel(Level.OFF)
You won\'t see any logs after that.
Other options for Level include: all, debug, error, fatal, info, off, trace, trace_int, warn.
Details about each can be found in the documentation.
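For example, if you want to keep warnings from Spark itself instead of silencing everything, a level other than OFF can be used. A minimal spark-shell sketch (the choice of WARN and the logger names are just illustrative):
import org.apache.log4j.{Level, Logger}

// Keep WARN and above from Spark's own packages, drop the INFO chatter
Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
// Akka (used for RPC in older Spark versions) is also noisy at INFO
Logger.getLogger("akka").setLevel(Level.WARN)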
Answer 2:
Right after starting spark-shell, type:
sc.setLogLevel("ERROR")
In Spark 2.0:
spark = SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
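The same idea in Scala for Spark 2.x might look like this minimal sketch (assuming the master is supplied by spark-shell or spark-submit; only setLogLevel comes from the answer above, the rest is scaffolding):
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.getOrCreate()
// Overrides the root logger level set by any log4j.properties file
spark.sparkContext.setLogLevel("ERROR")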
Answer 3:
Thanks @AkhlD and @Sachin Janani for suggesting changes in the .conf file.
Following code solved my issue:
1) Added import org.apache.log4j.{Level, Logger} to the import section.
2) Added the following lines after the creation of the spark context object, i.e. after val sc = new SparkContext(conf):
val rootLogger = Logger.getRootLogger()
rootLogger.setLevel(Level.ERROR)
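Putting both steps together, a minimal sketch of a standalone app (the object name and SparkConf settings are placeholders, not part of the answer above):
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object QuietApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("QuietApp") // placeholder app name
    val sc = new SparkContext(conf)

    // Raise the root logger threshold right after the context is created,
    // so only ERROR and FATAL messages reach the console from here on
    val rootLogger = Logger.getRootLogger()
    rootLogger.setLevel(Level.ERROR)

    // ... rest of the job ...

    sc.stop()
  }
}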
Answer 4:
Use the command below to change the log level when submitting an application with spark-submit or spark-sql:
spark-submit \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:<file path>/log4j.xml" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:<file path>/log4j.xml"
Note: replace <file path> with the path where your log4j config file is stored.
log4j.properties:
log4j.rootLogger=ERROR, console
# set the log level for these components
log4j.logger.com.test=DEBUG
log4j.logger.org=ERROR
log4j.logger.org.apache.spark=ERROR
log4j.logger.org.spark-project=ERROR
log4j.logger.org.apache.hadoop=ERROR
log4j.logger.io.netty=ERROR
log4j.logger.org.apache.zookeeper=ERROR
# add a ConsoleAppender to the logger stdout to write to the console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# use a simple message format
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
log4j.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
    <appender name="console" class="org.apache.log4j.ConsoleAppender">
        <param name="Target" value="System.out"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n" />
        </layout>
    </appender>
    <logger name="org.apache.spark">
        <level value="error" />
    </logger>
    <logger name="org.spark-project">
        <level value="error" />
    </logger>
    <logger name="org.apache.hadoop">
        <level value="error" />
    </logger>
    <logger name="io.netty">
        <level value="error" />
    </logger>
    <logger name="org.apache.zookeeper">
        <level value="error" />
    </logger>
    <logger name="org">
        <level value="error" />
    </logger>
    <root>
        <priority value="ERROR" />
        <appender-ref ref="console" />
    </root>
</log4j:configuration>
Switch to a FileAppender in log4j.xml if you want to write logs to a file instead of the console. LOG_DIR is a variable for the logs directory, which you can supply via the same mechanism, e.g. spark-submit --conf "spark.driver.extraJavaOptions=-DLOG_DIR=<log dir path>".
<appender name="file" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="file" value="${LOG_DIR}"/>
    <param name="datePattern" value="'.'yyyy-MM-dd"/>
    <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d [%t] %-5p %c %x - %m%n"/>
    </layout>
</appender>
Another important thing to understand here: when the job is launched in distributed mode (deploy-mode cluster with master yarn or mesos), the log4j configuration file must exist on the driver and worker nodes (log4j.configuration=file:<file path>/log4j.xml), otherwise the log4j init will complain:
log4j:ERROR Could not read configuration file [log4j.properties].
java.io.FileNotFoundException: log4j.properties (No such file or directory)
Hints on solving this problem:
Keep the log4j config file in a distributed file system (HDFS or mesos) and load the external configuration using the log4j PropertyConfigurator,
or use sparkContext addFile to make it available on each node and then use the log4j PropertyConfigurator to reload the configuration, as in the sketch below.
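A rough sketch of the addFile variant, assuming sc is the active SparkContext, the HDFS path is just an example, and rdd stands for whatever RDD the job works on:
import org.apache.log4j.PropertyConfigurator
import org.apache.spark.SparkFiles

// Ship the config file to every node that runs a task for this job
sc.addFile("hdfs:///config/log4j.properties")

// Reload log4j on the executors from the local copy that addFile distributed
// (`rdd` is a placeholder for an RDD from your job)
rdd.foreachPartition { _ =>
  PropertyConfigurator.configure(SparkFiles.get("log4j.properties"))
}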
Answer 5:
You can disable the logs by setting their level to OFF as follows:
Logger.getLogger("org").setLevel(Level.OFF);
Logger.getLogger("akka").setLevel(Level.OFF);
or edit the log4j config file and set the log level to OFF by just changing the following property:
log4j.rootCategory=OFF, console
Answer 6:
I just add this line at the top of all my pyspark scripts, just below the import statements.
SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
Example header of my pyspark scripts:
from pyspark.sql import SparkSession, functions as fs
SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
Answer 7:
The answers above are correct but didn't exactly help me, as there was additional information I required.
I had just set up Spark, so the log4j file still had the '.template' suffix and wasn't being read. I believe logging then defaults to the Spark core logging conf.
So if you are like me and find that the answers above didn't help, then maybe you too have to remove the '.template' suffix from your log4j conf file, after which the above works perfectly!
http://apache-spark-user-list.1001560.n3.nabble.com/disable-log4j-for-spark-shell-td11278.html
Answer 8:
tl;dr
For the Spark Context you may use:
sc.setLogLevel(<logLevel>)
where logLevel can be ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE or WARN.
Details:
Internally, setLogLevel calls org.apache.log4j.Level.toLevel(logLevel), which it then uses to set the level via org.apache.log4j.LogManager.getRootLogger().setLevel(level).
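In other words, sc.setLogLevel("WARN") behaves roughly like this sketch (log4j 1.x API, shown only to illustrate the internals described above):
import org.apache.log4j.{Level, LogManager}

// toLevel parses the string; the root logger then gets that level
val level = Level.toLevel("WARN")
LogManager.getRootLogger().setLevel(level)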
You may directly set the logging level to OFF using:
LogManager.getLogger("org").setLevel(Level.OFF)
You can set up the default logging for the Spark shell in conf/log4j.properties. Use conf/log4j.properties.template as a starting point.
Setting Log Levels in Spark Applications
In standalone Spark applications or in a Spark Shell session, use the following:
import org.apache.log4j.{Level, Logger}
Logger.getLogger(classOf[RackResolver]).getLevel
Logger.getLogger(\"org\").setLevel(Level.OFF)
Logger.getLogger(\"akka\").setLevel(Level.OFF)
Disabling logging (in log4j):
Use the following in conf/log4j.properties to disable logging completely:
log4j.logger.org=OFF
Reference: Mastering Spark by Jacek Laskowski.
Answer 9:
In Python/Spark we can do:
def quiet_logs(sc):
    logger = sc._jvm.org.apache.log4j
    logger.LogManager.getLogger("org").setLevel(logger.Level.ERROR)
    logger.LogManager.getLogger("akka").setLevel(logger.Level.ERROR)
Then, after defining the SparkContext 'sc', call this function with: quiet_logs(sc)
Answer 10:
An interesting idea is to use the RollingAppender as suggested here: http://shzhangji.com/blog/2015/05/31/spark-streaming-logging-configuration/
so that you don't "pollute" the console space, but can still see the results under $YOUR_LOG_PATH_HERE/${dm.logging.name}.log.
log4j.rootLogger=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n
log4j.appender.rolling.maxFileSize=50MB
log4j.appender.rolling.maxBackupIndex=5
log4j.appender.rolling.file=$YOUR_LOG_PATH_HERE/${dm.logging.name}.log
log4j.appender.rolling.encoding=UTF-8
Another method that addresses the cause is to observe what kinds of logging you usually have (coming from different modules and dependencies), set the logging granularity for each, and turn "quiet" the third-party logs that are too verbose.
For instance,
# Silence akka remoting
log4j.logger.Remoting=ERROR
log4j.logger.akka.event.slf4j=ERROR
log4j.logger.org.spark-project.jetty.server=ERROR
log4j.logger.org.apache.spark=ERROR
log4j.logger.com.anjuke.dm=${dm.logging.level}
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
Answer 11:
Simple to do on the command line...
spark2-submit --driver-java-options="-Droot.logger=ERROR,console"
..other options..
Answer 12:
Simply add the param below to your spark-shell or spark-submit command:
--conf "spark.driver.extraJavaOptions=-Dlog4jspark.root.logger=WARN,console"
Check the exact property name (log4jspark.root.logger here) in your log4j.properties file.
Hope this helps, cheers!
Answer 13:
- Adjust conf/log4j.properties as described by others:
log4j.rootCategory=ERROR, console
- Make sure that while executing your spark job you pass the --files flag with the log4j.properties file path
- If it still doesn't work, you might have a jar containing a log4j.properties that is being loaded before your new log4j.properties. Remove that log4j.properties from the jar (if appropriate)
Answer 14:
sparkContext.setLogLevel("OFF")
Answer 15:
In addition to all the above posts, here is what solved the issue for me.
Spark uses slf4j to bind to loggers. If log4j is not the first binding found, you can edit log4j.properties files all you want; the loggers are not even used. For example, this could be a possible SLF4J output:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-simple/1.6.6/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-log4j12/1.7.19/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
So here the SimpleLoggerFactory was used, which does not care about log4j settings.
Excluding the slf4j-simple package from my project via
<dependency>
...
<exclusions>
...
<exclusion>
<artifactId>slf4j-simple</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
</exclusions>
</dependency>
resolved the issue, as now the log4j logger binding is used and any setting in log4j.properties is adhered to.
FYI, my log4j properties file contains (besides the normal configuration):
log4j.rootLogger=WARN, stdout
...
log4j.category.org.apache.spark = WARN
log4j.category.org.apache.parquet.hadoop.ParquetRecordReader = FATAL
log4j.additivity.org.apache.parquet.hadoop.ParquetRecordReader=false
log4j.logger.org.apache.parquet.hadoop.ParquetRecordReader=OFF
Hope this helps!