Imagine there is a framework that provides a function called logutils.set_up(), which sets up logging according to some config.
Setting up logging should be done as early as possible, so that warnings emitted while importing libraries are not lost.
Since the old way (if __name__=='__main__':) looks ugly, we use console_script entry points to register the main() function.
    # foo/daily_report.py
    from framework import logutils

    logutils.set_up()

    def main():
        ...
My problem is that logutils.set_up() might be called twice: imagine there is a second console script which calls logutils.set_up() and also imports daily_report.py.
I can change the framework code so that set_up() does nothing on the second call, but this feels clumsy. I would like to avoid it.
How can I be sure that logutils.set_up() gets executed only once?
FWIW, I think that having the logutils package defend itself against multiple calls to its setup is not clumsy; it is dealing with the problem in the right place. Once you have done that, your logutils package is more robust.
Any other solution, "outside" of logutils, is susceptible to bugs due to being overlooked in some case.
There are a few ways to achieve the goal, each with its advantages and disadvantages.
(Some of these overlap with the other answers. I don't mean to plagiarize, only to provide a comprehensive answer.)
Approach 1: The function should do it
One way to guarantee a function only gets executed once is to make the function itself stateful, making it "remember" that it has already been called. This is more or less what is described by @eestrada and @qarma.
As to implementing this, I agree with @qarma that using memoization is the simplest and most idiomatic way. There are a few simple memoization decorators for Python on the internet. The one included in the standard library is functools.lru_cache, which you can simply apply to set_up as a decorator.
The disadvantage here is that it is arguably not set_up's responsibility to maintain the state; it is merely a function. One can argue it should execute twice if it is called twice, and that it is the caller's responsibility to only call it when needed (what if you really do want to run it twice?). The general argument is that a function (in order to be useful and reusable) should not make assumptions about the context in which it is called.
Is this argument valid in your case? It is up to you to decide.
Another disadvantage here is that this can be considered an abuse of the memoization tool. Memoization is a tool closely related to functional programming, and should be applied to pure functions. Memoizing a function implies "no need to run it again, because we already know the result", and not "no need to run it again, because there's some side effect we want to avoid".
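As a minimal sketch of the memoization idea in Approach 1: the set_up below is only a stand-in for the framework's logutils.set_up(), and SETUP_CALLS is a hypothetical list used to make the side effect visible.

```python
import functools

SETUP_CALLS = []  # hypothetical record of how often the real work ran


@functools.lru_cache(maxsize=None)
def set_up():
    """Stand-in for logutils.set_up(); the body runs on the first call only."""
    SETUP_CALLS.append("configured")


set_up()  # executes the body
set_up()  # served from the cache, the body is skipped
```

Because set_up takes no arguments, there is a single cache entry, so every call after the first returns the cached result without running the body.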
Approach 2: the one you think is ugly (if __name__=='__main__')
The most common pythonic way, which you already mention in your question, is using the infamous if __name__=='__main__' construct, with the call to set_up() placed inside it.
This guarantees the function is only called once, because it is only called from the module named __main__, and the interpreter guarantees there is only one such module in your process.
This works. There are no complications nor caveats. This is the way running main-code (including setup code) is done in Python. It is considered pythonic simply because it is so darn common in Python (since there are no better ways).
The only disadvantage is that it is arguably ugly (aesthetics-wise, not code-quality-wise). I admit I also winced the first few times I saw it or wrote it, but it grows on you.
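A rough sketch of Approach 2 follows. set_up is again a stand-in for logutils.set_up(), and SETUP_CALLS and the return value of main() are hypothetical. The guard only fires when the file is run directly (for example with python daily_report.py); a console_script entry point imports the module instead, so the guarded block would not run in that setup.

```python
SETUP_CALLS = []  # hypothetical record, only here to make the effect visible


def set_up():
    """Stand-in for framework.logutils.set_up()."""
    SETUP_CALLS.append("configured")


def main():
    return "report done"


if __name__ == "__main__":
    # Runs only when this file is the entry script, never when it is imported.
    set_up()
    main()
```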
Approach 3: leverage python's module-importing mechanism
Python already has a caching mechanism preventing modules from being doubly-imported. You can leverage this mechanism by running the setup code in a new module, then importing it. This is similar to @rll's answer. It is simple to do: create a new module whose only job is to call set_up() at import time.
Now, each caller can run this by importing the new module.
Since a module is only imported once, set_up is only called once.
The disadvantage here is that it violates the "explicit is better than implicit" principle. I.e. if you want to call a function, call it. It isn't good practice to run code with side effects at module-import time.
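The import-caching behaviour in Approach 3 can be sketched in a single runnable file by writing the extra module to a temporary directory first. The module name logging_setup_once and its contents are hypothetical; in a real project this would simply be a small file that imports logutils and calls logutils.set_up().

```python
import importlib
import os
import sys
import tempfile

# Hypothetical one-purpose module, written to disk so it can be imported.
MODULE_SOURCE = """
CALLS = []  # stand-in for the real logging configuration


def set_up():
    CALLS.append("configured")


set_up()  # runs when the module body executes, i.e. on the first import
"""

module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "logging_setup_once.py"), "w") as handle:
    handle.write(MODULE_SOURCE)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

first = importlib.import_module("logging_setup_once")   # executes the module body
second = importlib.import_module("logging_setup_once")  # served from sys.modules
```

Both names refer to the same module object, and the module body (and with it set_up) ran a single time.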
Approach 4: monkey patching
This is by far the worst of the approaches in this answer. Don't use it. But it is still a way to get the job done.
The idea is that if you don't want the function to get called after the first call, monkey-patch it (read: vandalize it) after the first call.
Callers go through a helper, set_up_only_once, which does the patching once the first call has been made.
Disadvantages: your colleagues will hate you.
tl;dr:
The simplest way is to memoize using functools.lru_cache, but it might not be the best solution code-quality-wise. It is up to you whether this solution is good enough in your case.
The safest and most pythonic way, while not pleasing to the eye, is using if __name__=='__main__': ...

It isn't clumsy to change the set_up code; indeed, it is the only way to know for certain that the initialization is only done once. Here is some black magic for you, if you would rather not use an if statement to do the job.
No explicit check is required and it is assured to only ever call the function once. This isn't thread safe, but I am assuming that that isn't a requirement (since it wasn't mentioned in the question).
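One if-free way to get this behaviour, offered as a guess at the kind of trick meant here and not as the answer's own code: keep the real setup in a one-element iterator and let next() fall back to a no-op once it is exhausted. The names _real_set_up, _noop, _pending, and SETUP_CALLS are all hypothetical.

```python
SETUP_CALLS = []  # hypothetical record of how often the real work ran


def _real_set_up():
    SETUP_CALLS.append("configured")


def _noop():
    return None


# The iterator yields the real setup exactly once; afterwards it is exhausted.
_pending = iter([_real_set_up])


def set_up():
    # next() returns _real_set_up on the first call and the default _noop later.
    next(_pending, _noop)()


set_up()
set_up()
set_up()
```

Like the approach described above, this sketch makes no attempt at thread safety.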

For the sake of completeness, I will add a solution. There are plenty of valid options here, but maybe this one fills a gap.
It is a little verbose, but I feel it is relatively simple and clean: before calling set_up, check for a marker attribute on the function, and set that attribute once the call has been made.
This way you are not actively changing the function itself... or not so much: you are adding an attribute which you check. Instead of attaching that attribute to the function you could put it somewhere else, or wrap the initialization entirely in a singleton class. But those solutions have already been proposed, if I'm not mistaken.
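The attribute check described above might look roughly like this. The attribute name already_called is an assumption, set_up is a stand-in for the framework function, and call_site represents any place that needs logging to be configured.

```python
SETUP_CALLS = []  # hypothetical record of how often the real work ran


def set_up():
    """Stand-in for logutils.set_up()."""
    SETUP_CALLS.append("configured")


def call_site():
    # The check that every caller of set_up has to repeat.
    if not getattr(set_up, "already_called", False):
        set_up()
        set_up.already_called = True


call_site()
call_site()
```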
These are three lines that should be put at every call to set_up.
The problem would be if some code calls set_up without setting that attribute (because of a bug, an external dependency, whatever). But, if that is the case, then you have no other option but to change the code or to check the behaviour of the function itself.
Note: I assume that the framework's function set_up is pure Python. I assume that this won't work for C extension functions or built-ins, but I have not checked those.

You can use singletons. A Singleton class gets created only once, and subsequent calls will point to the same object instead of creating a new one. This answer explains different ways of creating a singleton class (all simple).
I personally prefer the metaclass-based approach. You first define a Singleton class, then use it as the metaclass of your own class.
Now the first time you call MyClass() it will create the object for you. Subsequent calls will refer to the same object (sort of like a global variable of classes!).
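A common way to write such a Singleton metaclass is sketched below. Putting the setup work in MyClass.__init__ is an assumption made here to tie it back to the question, and SETUP_CALLS is a hypothetical record.

```python
SETUP_CALLS = []  # hypothetical record of how often the setup ran


class Singleton(type):
    """Metaclass that creates at most one instance per class."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            # type.__call__ runs __new__ and __init__; this happens only once.
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class MyClass(metaclass=Singleton):
    def __init__(self):
        # Stand-in for calling logutils.set_up().
        SETUP_CALLS.append("configured")


a = MyClass()
b = MyClass()
```

Both names point at the same object, and __init__ (and so the setup) ran only for the first call.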

Just slap @memoize on your set_up and then it is only called once :)
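Python's standard library has no decorator named memoize, so the following is a hand-rolled sketch of what such a decorator could look like; it assumes positional, hashable arguments only. set_up is a stand-in and SETUP_CALLS is hypothetical.

```python
import functools

SETUP_CALLS = []  # hypothetical record of how often the real work ran


def memoize(func):
    """Cache results per positional-argument tuple (arguments must be hashable)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


@memoize
def set_up():
    SETUP_CALLS.append("configured")


set_up()
set_up()
```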