Connecting Log4j to ipython notebook stderr in a j

Posted 2019-08-01 19:40

I have a project that uses Java, Scala and Apache Spark to do distributed computations on genomic data. Using py4j and mimicking the PySpark model, we expose a Python API that calls into the JVM. Our goal has been to bring this model into Jupyter notebooks, which has been pretty easy so far, with one lingering problem: logging.

The problem

We (and Spark) use log4j to write log messages to a log file and to stderr. That stderr belongs to the Java process, not to the Python kernel, so if I run two commands from the Jupyter notebook:

print('foo')
info('bar')  # calls log4j logger.info in JVM

I see 'foo' written to the Jupyter cell, but 'bar' is written to the terminal running the Jupyter process.

My goal

Connect log4j to the Jupyter notebook so that log4j messages are written to Jupyter cells instead of the terminal.

What I've tried

The log4j ConsoleAppender is writing to the Java process's stderr. So we're going to need to route the Java stderr through Jupyter somehow, right? This may involve calling System.setErr(...) (or System.setOut(...), if the appender targets stdout) with a PrintStream object hooked up to the Jupyter process, but I'm not yet sure how to do that.
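
If the notebook's Python process is the one that launches the JVM, one option is to capture the child's stderr through a pipe and copy it into Python's sys.stderr, which Jupyter displays in cells. The following is a minimal sketch under that assumption, not what Spark or py4j do by default; launch_with_forwarded_stderr and forward_stream are hypothetical helper names, and the command passed in stands in for whatever starts the JVM.

```python
import subprocess
import sys
import threading


def forward_stream(pipe, sink):
    """Copy lines from a child's pipe into a Python-level stream."""
    # readline returns "" at EOF, which ends the loop when the child exits.
    for line in iter(pipe.readline, ""):
        sink.write(line)
        sink.flush()
    pipe.close()


def launch_with_forwarded_stderr(cmd, sink=None):
    """Start cmd and forward its stderr to sink (default: current sys.stderr)."""
    sink = sink if sink is not None else sys.stderr
    # stderr=PIPE captures the child's stderr instead of letting it
    # inherit the terminal's file descriptor.
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)
    thread = threading.Thread(
        target=forward_stream, args=(proc.stderr, sink), daemon=True
    )
    thread.start()
    return proc, thread
```

Which cell the forwarded text lands in depends on how the kernel handles output from background threads.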

1 answer
太酷不给撩
#2 · 2019-08-01 20:14

We solved this by using a separate socket to communicate between Java and Python. Here's the commit diff: https://github.com/hail-is/hail/commit/93d7e95a82ab39501eede7ecb301538bcd013ea8
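
A minimal sketch of what the Python half of such a socket bridge could look like, assuming the JVM side sends newline-terminated log lines as UTF-8 text over TCP (for example from a log4j WriterAppender wrapped around the socket's output stream). This is an illustration rather than the code in the linked commit; start_log_server and LogLineHandler are hypothetical names.

```python
import socketserver
import sys
import threading


class LogLineHandler(socketserver.StreamRequestHandler):
    """Relay each line received on the connection to the server's sink."""

    def handle(self):
        # self.rfile is a buffered binary file; iterating yields one line
        # at a time until the peer closes the connection.
        for raw in self.rfile:
            self.server.sink.write(raw.decode("utf-8", errors="replace"))
            self.server.sink.flush()


def start_log_server(sink=None, host="127.0.0.1"):
    """Listen on an OS-chosen port in a background thread; return (server, port)."""
    server = socketserver.ThreadingTCPServer((host, 0), LogLineHandler)
    server.daemon_threads = True
    # In a notebook, sys.stderr is the kernel's stream, so writes show up in cells.
    server.sink = sink if sink is not None else sys.stderr
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]
```

In a notebook, calling start_log_server() and handing the returned port to the JVM would let each received log line be written through the kernel's sys.stderr rather than the terminal.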
