Logging in a Framework

Imagine there is a framework which provides a method called logutils.set_up() which sets up the logging according to some config.

Setting up the logging should be done as early as possible since warnings emitted during importing libraries should not be lost.

Since the old way (if __name__=='__main__':) looks ugly, we use console_scripts entry points to register the main() function.

# foo/daily_report.py
from framework import logutils
logutils.set_up()
def main():
    ...

My problem is that logutils.set_up() might be called twice:

Imagine there is a second console script which calls logutils.set_up() and imports daily_report.py.

I can change the framework code and set_up() to do nothing in the second call to logutils.set_up(), but this feels clumsy. I would like to avoid it.

How can I be sure that logutils.set_up() is executed only once?

Tags: python logging

7 Answers

淡お忘

#2 · 2019-06-20 19:31

FWIW, I think that having the logutils package defend itself against multiple calls to its setup is not clumsy, it's dealing with the problem in the right place. After you've done that, your logutils package is now more robust.

Any other solution, "outside" of logutils, is susceptible to bugs due to being overlooked in some case.
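A minimal sketch of what such a self-defending set_up() could look like, assuming the framework keeps a module-level flag. The flag name `_is_set_up`, the `level` parameter, and the handler format are illustrative assumptions, not the framework's real code:

```python
import logging

_is_set_up = False  # hypothetical module-level flag: has set_up() already run?


def set_up(level=logging.INFO):
    """Configure root logging once; later calls return without doing anything."""
    global _is_set_up
    if _is_set_up:
        return False  # already configured, nothing to do
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(level)
    _is_set_up = True
    return True
```

With a guard like this, a second console script that imports daily_report.py adds no duplicate handler, and no caller needs to know whether someone else already ran the setup.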


【Aperson】

#3 · 2019-06-20 19:35

There are a few ways to achieve the goal, each with its advantages and disadvantages.

(Some of these overlap with the other answers. I don't mean to plagiarize, only to provide a comprehensive answer.)

Approach 1: The function should do it

One way to guarantee a function only gets executed once is to make the function itself stateful, making it "remember" it has already been called. This is more or less what is described by @eestrada and @qarma.

As to implementing this, I agree with @qarma that using memoization is the simplest and most idiomatic way. There are a few simple memoization decorators for Python on the internet. The one included in the standard library is functools.lru_cache. You can simply use it like:

import functools

@functools.lru_cache(maxsize=None)  # the bare @functools.lru_cache form needs Python 3.8+
def set_up():  # this is your original set_up() function, now decorated
    <...same as before...>

The disadvantage here is that it is arguably not set_up's responsibility to maintain this state; it is merely a function. One can argue that it should execute twice if it is called twice, and that it is the caller's responsibility to call it only when needed (what if you really do want to run it twice?). The general argument is that a function, in order to be useful and reusable, should not make assumptions about the context in which it is called.

Is this argument valid in your case? It is up to you to decide.

Another disadvantage here is that this can be considered an abuse of the memoization tool. Memoization is a tool closely related to functional programming, and should be applied to pure functions. Memoizing a function implies "no need to run it again, because we already know the result", and not "no need to run it again, because there's some side effect we want to avoid".
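To make the behaviour of Approach 1 concrete, here is a small sketch. The `calls` list and the body of set_up() are stand-ins I made up for the real logging setup; only functools.lru_cache and its cache_clear() method come from the standard library:

```python
import functools

calls = []  # records every real execution of the body (demo only)


@functools.lru_cache(maxsize=None)
def set_up():
    calls.append("configured")  # stand-in for the real logging setup


set_up()
set_up()  # served from the cache; the body does not run again
runs_after_two_calls = len(calls)

set_up.cache_clear()  # escape hatch if you really do want to run it again
set_up()
runs_after_clear = len(calls)
```

The cache_clear() call partly answers the "what if you really do want to run it twice" objection, although it also shows that the "only once" guarantee is only as strong as the cache.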

Approach 2: the one you think is ugly (if __name__=='__main__')

The most common pythonic way, which you already mention in your question, is using the infamous if __name__=='__main__' construct.

This guarantees the function is only called once, because it is only called from the module named __main__, and the interpreter guarantees there is only one such module in your process.

This works. There are no complications or caveats. This is how main code (including setup code) is run in Python. It is considered pythonic simply because it is so darn common in Python (since there are no better ways).

The only disadvantage is that it is arguably ugly (aesthetics-wise, not code-quality-wise). I admit I also winced the first few times I saw it or wrote it, but it grows on you.

Approach 3: leverage python's module-importing mechanism

Python already has a caching mechanism that prevents modules from being imported twice. You can leverage this mechanism by running the setup code in a new module and then importing it. This is similar to @rll's answer. This is simple to do:

# logging_setup.py
from framework import logutils
logutils.set_up()

Now, each caller can run this by importing the new module:

# foo/daily_report.py
import logging_setup # side effect!
def main():
    ...

Since a module is only imported once, set_up is only called once.

The disadvantage here is that it violates the "explicit is better than implicit" principle, i.e. if you want to call a function, call it. It isn't good practice to run code with side effects at module import time.
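The import caching that Approach 3 relies on can be observed with a throwaway module. In this sketch the module name `demo_logging_setup` and its one-line body are hypothetical stand-ins for logging_setup.py; the body creates a fresh object each time it runs, so you can tell whether it was executed again:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical stand-in for logging_setup.py: its body is the "setup".
    Path(tmp_dir, "demo_logging_setup.py").write_text("TOKEN = object()\n")
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        import demo_logging_setup as first
        import demo_logging_setup as second  # found in sys.modules, body not re-run

        same_module = first is second
        token_before = first.TOKEN
        importlib.reload(first)  # explicitly forces the body to run again
        token_changed = first.TOKEN is not token_before
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop("demo_logging_setup", None)
```

Both imports yield the same module object, so the body ran once. importlib.reload() is the explicit way around that, which means the guarantee holds only as long as nobody reloads the module.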

Approach 4: monkey patching

This is by far the worst of the approaches in this answer. Don't use it. But it is still a way to get the job done.

The idea is that if you don't want the function to get called after the first call, monkey-patch it (read: vandalize it) after the first call.

from framework import logutils
logutils.set_up_only_once()

Where set_up_only_once can be implemented like:

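# framework/logutils.py
import sys

def set_up_only_once():
    # run the actual setup (or nothing if already vandalized):
    set_up()
    # vandalize it so it never walks again; the module is registered in
    # sys.modules under its full dotted name, which __name__ holds here:
    sys.modules[__name__].set_up = lambda: None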

Disadvantages: your colleagues will hate you.

tl;dr:

The simplest way is to memoize using functools.lru_cache, but it might not be the best solution code-quality-wise. It is up to you if this solution is good enough in your case.

The safest and most pythonic way, while not pleasing to the eye, is using if __name__=='__main__': ....


爷、活的狠高调

#4 · 2019-06-20 19:38

It isn't clumsy to change the set_up code; indeed, it is the only way to know for certain that the initialization is only done once. Here is some black magic for you, if you would rather not use an if statement to do the job:

# framework/logutils.py
def _set_up_internal():
    global set_up
    # NOTE: start setup
    return_val = None  # if there is a useful return value
    # NOTE: finish setup

    # clobber global reference with a dummy implementation
    set_up = lambda: return_val  # or return `None` if there is no useful return value
    return return_val  # the first call returns the same value as later calls

set_up = _set_up_internal

No explicit check is required and it is assured to only ever call the function once. This isn't thread safe, but I am assuming that that isn't a requirement (since it wasn't mentioned in the question).
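If thread safety did turn out to be a requirement, one possible sketch is a lock around a flag. Note that this goes back to an explicit check, unlike the trick above. The names `_lock`, `_done`, and `setup_runs` are my own, and the counter exists only to make the effect visible:

```python
import threading

_lock = threading.Lock()
_done = False
setup_runs = 0  # counts real executions, for demonstration only


def set_up():
    """Run the setup body at most once, even with concurrent callers."""
    global _done, setup_runs
    with _lock:
        if _done:
            return
        setup_runs += 1  # stand-in for the real logging configuration
        _done = True


# Call set_up() from several threads at once; only one call does the work.
threads = [threading.Thread(target=set_up) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Taking the lock on every call is cheap for something invoked a handful of times at startup, so I would not bother with double-checked locking here.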


小情绪 Triste *

#5 · 2019-06-20 19:39

For the sake of completeness, I will add one more solution. There are plenty of valid options here, but maybe this one fills a gap.

It is a little verbose, but I feel it is relatively simple and clean:

# foo/daily_report.py
from framework import logutils

if not hasattr(logutils.set_up, "_initiated"):
    logutils.set_up()
    logutils.set_up._initiated = True

def main():
    pass

This way you are not actively changing the function itself... or not so much: you are adding an attribute which you check. Instead of attaching that attribute to the function, you could put it somewhere else, or wrap the initialization entirely in a singleton class. But those solutions have already been proposed, if I'm not mistaken.

These three lines have to replace every plain call to set_up.

The problem would be if some code calls set_up without setting that attribute (because of a bug, an external dependency, whatever). But if that is the case, then you have no other option but to change that code or to change the behaviour of the function itself.

Note: I assume that the framework's function set_up is Pure Python. I assume that this won't work for C extension functions or built-ins, but I have not checked those.
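Since the same three lines would be repeated at every call site, they could be folded into a small helper. This is only a sketch: `call_once` is a hypothetical name, and the local set_up() is a stand-in for logutils.set_up:

```python
def call_once(func):
    """Call func unless an earlier call_once() already marked it as initiated."""
    if getattr(func, "_initiated", False):
        return False
    func()
    func._initiated = True  # same marker attribute as in the snippet above
    return True


runs = []


def set_up():  # stand-in for logutils.set_up
    runs.append(1)


call_once(set_up)
call_once(set_up)  # skipped: the marker attribute is already set
```

Regarding the note above: in CPython, built-in functions such as len reject new attributes with an AttributeError, so a marker like this only works on functions written in Python.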


一纸荒年 Trace。

#6 · 2019-06-20 19:40

You can use singletons. A Singleton class gets created only once, and subsequent calls will point to the same object instead of creating a new one. This answer explains different ways of creating a singleton class (all simple).

I personally prefer the metaclass approach. You first define the Singleton metaclass as below:

class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

Then use it as a metaclass like this:

class MyClass(metaclass=Singleton):
    # Python 3 syntax; in Python 2, set `__metaclass__ = Singleton` in the class body instead.
    "the rest of your class as normal"

Now the first time you call MyClass() it will create the object for you. Subsequent calls will refer to the same object (sort of like a global variable of classes!)
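To tie this back to the question, here is a sketch of how the setup could live in the constructor of such a class, written for Python 3. `LoggingSetup` and its `init_runs` counter are hypothetical names used only for illustration:

```python
class Singleton(type):
    """Metaclass that returns one shared instance per class."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class LoggingSetup(metaclass=Singleton):  # hypothetical wrapper class
    init_runs = 0  # counts constructor executions, for demonstration only

    def __init__(self):
        # Runs only when the single instance is first created; the real
        # logging configuration would go here.
        type(self).init_runs += 1


first = LoggingSetup()
second = LoggingSetup()  # returns the cached instance, __init__ is skipped
```

Because the metaclass intercepts the call before __init__ runs, the constructor body (and therefore the setup) executes once no matter how many modules instantiate the class.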


成全新的幸福

#7 · 2019-06-20 19:46

Just slap @memoize on your set_up and then it is only called once :)
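There is no built-in decorator called memoize, so this presumably refers to a helper you write or import yourself. A minimal hand-rolled version, assuming hashable positional arguments only, might look like this (the `executions` list is just a stand-in to show how often the body runs):

```python
import functools


def memoize(func):
    """Cache results by positional arguments (hashable arguments assumed)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


executions = []


@memoize
def set_up():
    executions.append(1)  # stand-in for the real setup work


set_up()
set_up()  # cache hit for the empty argument tuple; body is skipped
```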

