Idiomatic Python logging: format string + args lis

2019-04-19 09:39发布

Is it advantageous to call logging functions with format string + args list vs. formatting inline?

I've seen (and written) logging code that uses inline string formatting:

logging.warn("%s %s %s" % (arg1, arg2, arg3))

and yet I assume it's better (performance-wise, and more idiomatic) to use:

logging.warn("%s %s %s", arg1, arg2, arg3)

because the second form avoids string formatting operations prior to invoking the logging function. If the current logging level would filter out the log message, no formatting is necessary, reducing computing time and memory allocations.

Am I on the right track here, or have I missed something?

3条回答
相关推荐>>
2楼-- · 2019-04-19 10:20

In case this is helpful, here is a quick timing test for just the two formatting options:

In [61]: arg1='hello'
In [62]: arg2='this'
In [63]: arg3='is a test'

In [70]: timeit -n 10000000 "%s %s %s" % (arg1, arg2, arg3)
10000000 loops, best of 3: 284 ns per loop

In [71]: timeit -n 10000000  "%s %s %s", arg1, arg2, arg3
10000000 loops, best of 3: 119 ns per loop

seems to give the second approach the edge.

查看更多
Bombasti
3楼-- · 2019-04-19 10:22

Avoiding inline string formatting does save some time if the current logging level filters the log message (as I expected) -- but not much:

In [1]: import logging

In [2]: logger = logging.getLogger('foo')

In [3]: logger.setLevel(logging.ERROR)

In [4]: %timeit -n 1000000 logger.warn('%s %s %s' % ('a', 'b', 'c'))
1000000 loops, best of 3: 1.09 us per loop

In [12]: %timeit -n 1000000 logger.warn('%s %s %s', 'a', 'b', 'c')
1000000 loops, best of 3: 946 ns per loop

So, as user1202136 pointed out, the overall performance difference depends on how long it takes to format the string (which could be significant depending on the cost of calling __str__ on arguments being passed to the logging function.)

查看更多
老娘就宠你
4楼-- · 2019-04-19 10:25

IMHO, for messages that are very likely to be displayed, such as those given to error or warn it does not make much of a difference.

For messages that are less likely displayed, I would definitely go for the second version, mainly for performance reasons. I often give large objects as a parameter to info, which implement a costly __str__ method. Clearly, sending this pre-formatted to info would be a performance waste.

UPDATE

I just checked the source code of the logging module and, indeed, formatting is done after checking the log level. For example:

class Logger(Filterer):
    # snip
    def debug(self, msg, *args, **kwargs):
        # snip
        if self.isenabledfor(debug):
            self._log(debug, msg, args, **kwargs)

One can observe that msg and args are untouched between calling log and checking the log level.

UPDATE 2

Spired by Levon, let me add some tests for objects that have a costly __str__ method:

$ python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" "logger.warn('%s', range(0,100))"
1000000 loops, best of 3: 1.52 usec per loop
$ python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" "logger.warn('%s' % range(0,100))"
1000000 loops, best of 3: 10.4 usec per loop

In practice, this could give a fairly high performance boost.

查看更多
登录 后发表回答