Using log4j2 in a Spark Java application

Posted 2019-04-03 02:25

I'm trying to use the log4j2 logger in my Spark job. Essential requirement: the log4j2 config is located outside the classpath, so I need to specify its location explicitly. When I run my code directly in the IDE without spark-submit, log4j2 works well. However, when I submit the same code to the Spark cluster using spark-submit, it fails to find the log4j2 configuration and falls back to the default old log4j.

Launcher command

${SPARK_HOME}/bin/spark-submit \
--class my.app.JobDriver \
--verbose \
--master 'local[*]' \
--files "log4j2.xml" \
--conf spark.executor.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
--conf spark.driver.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
myapp-SNAPSHOT.jar

Log4j2 dependencies in Maven

<dependencies>
. . . 
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>${log4j2.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>${log4j2.version}</version>
        </dependency>
        <!-- Bridge log4j to log4j2 -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-1.2-api</artifactId>
            <version>${log4j2.version}</version>
        </dependency>
        <!-- Bridge slf4j to log4j2 -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <version>${log4j2.version}</version>
        </dependency>
</dependencies>
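
For completeness, the job obtains its logger through the standard log4j2 API, roughly like this (a minimal sketch; only the class name my.app.JobDriver comes from the launcher command, the rest is illustrative):

package my.app;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class JobDriver {
    // Resolved via the log4j2 API; -Dlog4j.configurationFile should tell
    // log4j2 where to find the config outside the classpath.
    private static final Logger LOG = LogManager.getLogger(JobDriver.class);

    public static void main(String[] args) {
        LOG.info("Starting Spark job");
        // ... Spark setup and job logic ...
    }
}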

Any ideas what I could be missing?

4 Answers
乱世女痞 · 2019-04-03 02:40

If log4j2 is being used in one of your own dependencies, it's quite easy to bypass all configuration files and use programmatic configuration for one or two high-level loggers, if and only if no configuration file is found.

The code below does the trick. Just name the logger after your top-level logger.

// Imports required by this snippet (log4j2 core 2.x builder API)
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.core.LoggerContext;
import org.apache.logging.log4j.core.appender.ConsoleAppender;
import org.apache.logging.log4j.core.config.builder.api.AppenderComponentBuilder;
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilder;
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;
import org.apache.logging.log4j.core.config.builder.api.LayoutComponentBuilder;
import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;

private static boolean configured = false;

private static void buildLog()
{
    try
    {
        final LoggerContext ctx = (LoggerContext) LogManager.getContext(false);
        System.out.println("Configuration found at " + ctx.getConfiguration().toString());

        // DefaultConfiguration means log4j2 found no config file and fell back to its built-in defaults
        if (ctx.getConfiguration().toString().contains(".config.DefaultConfiguration"))
        {
            System.out.println("\n\n\nNo log4j2 config available. Configuring programmatically\n\n");

            ConfigurationBuilder<BuiltConfiguration> builder = ConfigurationBuilderFactory
                    .newConfigurationBuilder();

            builder.setStatusLevel(Level.ERROR);
            builder.setConfigurationName("IkodaLogBuilder");

            // Console appender writing to stdout
            AppenderComponentBuilder appenderBuilder = builder.newAppender("Stdout", "CONSOLE")
                    .addAttribute("target", ConsoleAppender.Target.SYSTEM_OUT);
            appenderBuilder.add(builder.newLayout("PatternLayout").addAttribute("pattern",
                    "%d [%t]  %msg%n%throwable"));
            builder.add(appenderBuilder);

            LayoutComponentBuilder layoutBuilder = builder.newLayout("PatternLayout").addAttribute("pattern",
                    "%d [%t] %-5level: %msg%n");

            // File appender
            appenderBuilder = builder.newAppender("file", "File").addAttribute("fileName", "./logs/ikoda.log")
                    .add(layoutBuilder);
            builder.add(appenderBuilder);

            // Named top-level logger plus a root logger, both writing to file and stdout
            builder.add(builder.newLogger("ikoda", Level.DEBUG)
                    .add(builder.newAppenderRef("file"))
                    .add(builder.newAppenderRef("Stdout"))
                    .addAttribute("additivity", false));

            builder.add(builder.newRootLogger(Level.DEBUG)
                    .add(builder.newAppenderRef("file"))
                    .add(builder.newAppenderRef("Stdout")));

            // Start the context with the programmatic configuration and refresh existing loggers
            ((LoggerContext) LogManager.getContext(false)).start(builder.build());
            ctx.updateLoggers();
        }
        else
        {
            System.out.println("Configuration file found.");
        }
        configured = true;
    }
    catch (Exception e)
    {
        System.out.println("\n\n\n\nFAILED TO CONFIGURE LOG4J2" + e.getMessage());
        configured = true;
    }
}
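
For example (a usage sketch of my own, not part of the original code): call buildLog() once before the first logger is requested, e.g. from a static initializer, and point your loggers at the configured top-level name:

// Hypothetical usage: run the fallback configuration before any logger is created.
static {
    buildLog();
}

// "ikoda" matches the top-level logger configured in buildLog() above.
private static final org.apache.logging.log4j.Logger LOG =
        org.apache.logging.log4j.LogManager.getLogger("ikoda");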
forever°为你锁心 · 2019-04-03 02:48

Try using --driver-java-options:

${SPARK_HOME}/bin/spark-submit \
--class my.app.JobDriver \
--verbose \
--master 'local[*]' \
--files "log4j2.xml" \
--driver-java-options "-Dlog4j.configurationFile=log4j2.xml" \
--jars log4j-api-2.8.jar,log4j-core-2.8.jar,log4j-1.2-api-2.8.jar \
myapp-SNAPSHOT.jar
\"骚年 ilove
4楼-- · 2019-04-03 02:56

Spark falls back to log4j because it probably cannot initialize the logging system during startup (your application code is not yet on the classpath at that point).

If you are permitted to place new files on your cluster nodes, then create a directory on all of them (for example /opt/spark_extras), place all the log4j2 jars there, and add two configuration options to spark-submit:

--conf spark.executor.extraClassPath=/opt/spark_extras/*
--conf spark.driver.extraClassPath=/opt/spark_extras/*

Then the libraries will be added to the classpath.
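
Combined with the launcher command from the question, the full invocation might look like this (a sketch; /opt/spark_extras is the example directory from above):

${SPARK_HOME}/bin/spark-submit \
--class my.app.JobDriver \
--verbose \
--master 'local[*]' \
--files "log4j2.xml" \
--conf spark.executor.extraClassPath=/opt/spark_extras/* \
--conf spark.driver.extraClassPath=/opt/spark_extras/* \
--conf spark.executor.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
--conf spark.driver.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
myapp-SNAPSHOT.jar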

If you have no access to modify files on the cluster, you can try another approach: add all the log4j2 jars to the spark-submit parameters using --jars. According to the documentation, all these libraries will be added to the driver's and executors' classpath, so it should work the same way.

一纸荒年 Trace。 · 2019-04-03 03:00

Apparently, at the moment there is no official support for log4j2 in Spark. Here is a detailed discussion of the subject: https://issues.apache.org/jira/browse/SPARK-6305

On the practical side, that means:

  1. If you have access to the Spark configs and jars and can modify them, you can still use log4j2 after manually adding the log4j2 jars to SPARK_CLASSPATH and providing the log4j2 configuration file to Spark.

  2. If you run on a managed Spark cluster and have no access to the Spark jars/configs, you can still use log4j2, but its use will be limited to code executed on the driver side. Any code run by the executors will use the Spark executors' logger (which is the old log4j). A sketch of this split follows below.
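
As an illustration of point 2, a minimal sketch (my own example, not from the original answer; the app name, logger names, and data are made up):

import org.apache.log4j.Logger;                      // old log4j, bundled with Spark
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.Arrays;

public class LoggingSplitExample {
    public static void main(String[] args) {
        // Driver side: the log4j2 API is usable here, given the jars and config are visible to the driver.
        org.apache.logging.log4j.LogManager.getLogger(LoggingSplitExample.class)
                .info("Running on the driver: log4j2 works here.");

        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("logging-split"));
        sc.parallelize(Arrays.asList(1, 2, 3)).foreach(n -> {
            // Executor side: this closure runs in the executor JVMs, where only
            // Spark's bundled logging (old log4j) is reliably available.
            Logger.getLogger("executor").info("Processing " + n);
        });
        sc.stop();
    }
}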
