What is the most pythonic way to deal with a module in which methods must be called in a certain order?
As an example, I have an XML configuration that must be read before doing anything else because the configuration affects behavior. The parse_config()
must be called first with the config file provided. Calling other supporting methods like query_data()
won't work until parse_config()
has been called.
I first implemented this as a singleton to ensure that a filename for the config is passed at time of initialization, but noticing that modules are actually singletons, it's no longer a class, but just a regular module.
What's the best way to enforce the parse_config
being called first in a module?
Edit: Worth noting is that the function is actually parse_config(configfile)
A module doesn't do anything it isn't told to do so put your function calls at the bottom of the module so that when you import it, things get ran in the order you specify:
test.py
testmod.py
When you run test.py, you'll see fun1 is ran before fun2:
If the object isn't valid before it's called, then call that method in
__init__
(or use a factory function). You don't need any silly singletons, that's for sure.As Cat Plus Plus said in other words, wrap the behaviour/functions up in a class and put all the required setup in the
__init__
method. You might complain that the functions don't seem like they naturally belong together in an object and, hence, this is bad OO design. If that's the case, think of your class/object as a form of name-spacing. It's much cleaner and more flexible than trying to enforce function calling order somehow or using singletons.What it comes down to is how friendly do you want your error messages to be if a function is called before it is configured.
Least friendly is to do nothing extra, and let the functions fail noisily with
AttributeError
s,IndexError
s, etc.Most friendly would be having stub functions that raise an informative exception, such as a custom
ConfigError: configuration not initialized
. When theConfigParser()
function is called it can then replace the stub functions with real functions. Something like this:If you have very many functions you can add decorators to keep track of the function names, but that's an answer for a different question. ;)
Okay, I couldn't resist -- here is the decorator version, which makes my solution much easier to actually implement:
And a sample run:
and the results:
As you can see, the decorator works by remembering the functions it decorates, and instead of returning a decorated function it returns a simple stub that raises a
ConfigError
if it is ever called. When theparse_config()
routine is called, it needs to call thego_live()
method which will then replace all the error raising stubs with the actual remembered functions.The model I have been using is that subsequent functions are only available as methods on the return value of previous functions, like this:
Properly designed, this style (which I admit is not terribly Pythonic, yet) makes for fluent libraries to handle complex pipeline operations where later steps in the library require both the results of early calculations and fresh input from the calling function.
An interesting result is error handling. What I've found is the best way of handling well-understood errors in pipeline steps is having a blank Error class that supposedly can handle every function in the pipeline (except initial one) but those functions (except possibly terminal ones) return only
self
:In your example, you'd have the Parser class, which would have almost nothing but a
configure
function that returned an instance of a ConfiguredParser class, which would do all the thing that only a properly configured parser could do. This gives you access to such things as multiple configurations and handling failed attempts at configuration.The simple requirement that a module needs to be "configured" before it is used is best handled by a class which does the "configuration" in the
__init__
method, as in the currently-accepted answer. Other module functions become methods of the class. There is no benefit in trying to make a singleton ... the caller may well want to have two or more differently-configured gadgets operating simultaneously.Moving on from that to a more complicated requirement, such as a temporal ordering of the methods:
This can be handled in a quite general fashion by maintaining state in attributes of the object, as is usually done in any OOPable language. Each method that has prerequisites must check that those prequisites are satisfied.
Poking in replacement methods is an obfuscation on a par with the COBOL ALTER verb, and made worse by using decorators -- it just wouldn't/shouldn't get past code review.