How to stop INFO messages displaying on spark cons

2019-01-01 06:35发布

I'd like to stop various messages that are coming on spark shell.

I tried to edit the log4j.properties file in order to stop these message.

Here are the contents of log4j.properties

# Define the root logger with appender file
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO

But messages are still getting displayed on the console.

Here are some example messages

15/01/05 15:11:45 INFO SparkEnv: Registering BlockManagerMaster
15/01/05 15:11:45 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150105151145-b1ba
15/01/05 15:11:45 INFO MemoryStore: MemoryStore started with capacity 0.0 B.
15/01/05 15:11:45 INFO ConnectionManager: Bound socket to port 44728 with id = ConnectionManagerId(192.168.100.85,44728)
15/01/05 15:11:45 INFO BlockManagerMaster: Trying to register BlockManager
15/01/05 15:11:45 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 192.168.100.85:44728 with 0.0 B RAM
15/01/05 15:11:45 INFO BlockManagerMaster: Registered BlockManager
15/01/05 15:11:45 INFO HttpServer: Starting HTTP Server
15/01/05 15:11:45 INFO HttpBroadcast: Broadcast server star

How do I stop these?

15条回答
有味是清欢
2楼-- · 2019-01-01 07:25

In Python/Spark we can do:

def quiet_logs( sc ):
  logger = sc._jvm.org.apache.log4j
  logger.LogManager.getLogger("org"). setLevel( logger.Level.ERROR )
  logger.LogManager.getLogger("akka").setLevel( logger.Level.ERROR )

The after defining Sparkcontaxt 'sc' call this function by : quiet_logs( sc )

查看更多
大哥的爱人
3楼-- · 2019-01-01 07:26

I just add this line to all my pyspark scripts on top just below the import statements.

SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")

example header of my pyspark scripts

from pyspark.sql import SparkSession, functions as fs
SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
查看更多
像晚风撩人
4楼-- · 2019-01-01 07:27

Use below command to change log level while submitting application using spark-submit or spark-sql:

spark-submit \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:<file path>/log4j.xml" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:<file path>/log4j.xml"

Note: replace <file path> where log4j config file is stored.

Log4j.properties:

log4j.rootLogger=ERROR, console

# set the log level for these components
log4j.logger.com.test=DEBUG
log4j.logger.org=ERROR
log4j.logger.org.apache.spark=ERROR
log4j.logger.org.spark-project=ERROR
log4j.logger.org.apache.hadoop=ERROR
log4j.logger.io.netty=ERROR
log4j.logger.org.apache.zookeeper=ERROR

# add a ConsoleAppender to the logger stdout to write to the console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# use a simple message format
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

log4j.xml

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">

<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
   <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <param name="Target" value="System.out"/>
    <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n" />
    </layout>
  </appender>
    <logger name="org.apache.spark">
        <level value="error" />
    </logger>
    <logger name="org.spark-project">
        <level value="error" />
    </logger>
    <logger name="org.apache.hadoop">
        <level value="error" />
    </logger>
    <logger name="io.netty">
        <level value="error" />
    </logger>
    <logger name="org.apache.zookeeper">
        <level value="error" />
    </logger>
   <logger name="org">
        <level value="error" />
    </logger>
    <root>
        <priority value ="ERROR" />
        <appender-ref ref="console" />
    </root>
</log4j:configuration>

Switch to FileAppender in log4j.xml if you want to write logs to file instead of console. LOG_DIR is a variable for logs directory which you can supply using spark-submit --conf "spark.driver.extraJavaOptions=-D.

<appender name="file" class="org.apache.log4j.DailyRollingFileAppender">
        <param name="file" value="${LOG_DIR}"/>
        <param name="datePattern" value="'.'yyyy-MM-dd"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d [%t] %-5p %c %x - %m%n"/>
        </layout>
    </appender>

Another important thing to understand here is, when job is launched in distributed mode ( deploy-mode cluster and master as yarn or mesos) the log4j configuration file should exist on driver and worker nodes (log4j.configuration=file:<file path>/log4j.xml) else log4j init will complain-

log4j:ERROR Could not read configuration file [log4j.properties]. java.io.FileNotFoundException: log4j.properties (No such file or directory)

Hint on solving this problem-

Keep log4j config file in distributed file system(HDFS or mesos) and add external configuration using log4j PropertyConfigurator. or use sparkContext addFile to make it available on each node then use log4j PropertyConfigurator to reload configuration.

查看更多
登录 后发表回答