To preface, I think I may have figured out how to get this code working (based on "Changing module variables after import"), but my question is really about why the following behavior occurs, so I can understand what not to do in the future.
I have three files. The first is mod1.py:
# mod1.py
import mod2

var1 = None

def func1A():
    global var1
    var1 = 'A'
    mod2.func2()

def func1B():
    global var1
    print(var1)

if __name__ == '__main__':
    func1A()
Next I have mod2.py:
# mod2.py
import mod1

def func2():
    mod1.func1B()
Finally I have driver.py:
# driver.py
import mod1

if __name__ == '__main__':
    mod1.func1A()
If I execute the command python mod1.py, then the output is None. Based on the link I referenced above, it seems that there is some distinction between mod1.py being imported as __main__ and mod1.py being imported from mod2.py. Therefore, I created driver.py. If I execute the command python driver.py, then I get the expected output: A. I sort of see the difference, but I don't really see the mechanism or the reason for it. How and why does this happen? It seems counterintuitive that the same module would exist twice. If I execute python mod1.py, would it be possible to access the variables in the __main__ version of mod1.py instead of the variables in the version imported by mod2.py?
Regarding a practical solution for using a module optionally as main script - supporting consistent cross-imports:
Solution 1:
See e.g. Python's pdb module, which can be run as a script by importing itself when executing as __main__ (at the end of the file). I would just recommend moving the __main__ startup to the beginning of the script instead. This way the module body is not executed twice, which is "costly", undesirable and sometimes critical.
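A minimal sketch of that reordering, applied to the mod1.py and mod2.py from the question rather than to pdb itself. The temp-directory harness and the run_mod1 helper are my own additions for illustration (Python 3.7+), and the early sys.exit(0) is an assumption about how the startup block should end, so that the script copy never reaches the rest of the file:

```python
import os
import subprocess
import sys
import tempfile

# Sketch of mod1.py with the __main__ startup moved to the top.
MOD1_SOURCE = """\
import sys

if __name__ == '__main__':
    # Run as a script: hand control to the importable copy, then stop,
    # so the rest of this file is executed only once (as 'mod1').
    import mod1
    mod1.func1A()
    sys.exit(0)

import mod2

var1 = None

def func1A():
    global var1
    var1 = 'A'
    mod2.func2()

def func1B():
    print(var1)
"""

# mod2.py as in the question.
MOD2_SOURCE = """\
import mod1

def func2():
    mod1.func1B()
"""

def run_mod1(mod1_source, mod2_source):
    """Write both modules to a temp dir and return the stdout of running mod1.py."""
    with tempfile.TemporaryDirectory() as workdir:
        for name, source in (('mod1.py', mod1_source), ('mod2.py', mod2_source)):
            with open(os.path.join(workdir, name), 'w') as handle:
                handle.write(source)
        result = subprocess.run([sys.executable, 'mod1.py'], cwd=workdir,
                                capture_output=True, text=True, check=True)
    return result.stdout.strip()

output = run_mod1(MOD1_SOURCE, MOD2_SOURCE)
print(output)  # A
```

With the guard at the top, func1A and func1B both live in the single imported copy, so the value set by one is seen by the other.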
Solution 2:
In rarer cases it is desirable to expose the actual script module __main__ directly under the actual module alias (mod1).
Known drawbacks:
- reload(_mod) fails
- ... (find_global ..)
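As a rough sketch of the aliasing idea, again using the question's mod1.py and mod2.py: the script registers itself in sys.modules under the name mod1. Note one assumption here: the alias is set before import mod2 runs, because otherwise mod2 would already have imported (and bound) a second copy of mod1. The harness and the run_mod1 helper are illustrative only (Python 3.7+):

```python
import os
import subprocess
import sys
import tempfile

# Sketch of mod1.py that exposes the running script under the name 'mod1'.
MOD1_SOURCE = """\
import sys

if __name__ == '__main__':
    # Assumption: register the alias before importing mod2, so that
    # mod2's import of mod1 finds this running script in sys.modules.
    sys.modules['mod1'] = sys.modules[__name__]

import mod2

var1 = None

def func1A():
    global var1
    var1 = 'A'
    mod2.func2()

def func1B():
    print(var1)

if __name__ == '__main__':
    func1A()
"""

# mod2.py as in the question.
MOD2_SOURCE = """\
import mod1

def func2():
    mod1.func1B()
"""

def run_mod1(mod1_source, mod2_source):
    """Write both modules to a temp dir and return the stdout of running mod1.py."""
    with tempfile.TemporaryDirectory() as workdir:
        for name, source in (('mod1.py', mod1_source), ('mod2.py', mod2_source)):
            with open(os.path.join(workdir, name), 'w') as handle:
                handle.write(source)
        result = subprocess.run([sys.executable, 'mod1.py'], cwd=workdir,
                                capture_output=True, text=True, check=True)
    return result.stdout.strip()

output = run_mod1(MOD1_SOURCE, MOD2_SOURCE)
print(output)  # A
```

Here mod2.mod1 and the running script are the same module object, so there is only one var1.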
The __name__ variable always contains the name of the module, except when the file has been loaded into the interpreter as a script instead. Then that variable is set to the string '__main__' instead.
After all, the script is then run as the main file of the whole program, and everything else is a module imported directly or indirectly by that main file. By testing the __name__ variable, you can thus detect whether a file has been imported as a module or was run directly.
Internally, modules are given a namespace dictionary, which is stored as part of the metadata for each module, in sys.modules. The main file, the executed script, is stored in that same structure as '__main__'.
But when you import a file as a module, Python first looks in sys.modules to see if that module has already been imported before. So, import mod1 means that we first look in sys.modules for the mod1 module. It'll create a new module structure with a namespace if mod1 isn't there yet.
So, if you both run mod1.py as the main file and later import it as a Python module, it'll get two namespace entries in sys.modules: one as '__main__', then later one as 'mod1'. These two namespaces are completely separate. The func1A that runs is the one in the main file, so your global var1 is stored in sys.modules['__main__'], but mod2 calls mod1.func1B, which is looking in sys.modules['mod1'] for var1, where it is None.
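The effect can be imitated in a single process. This is only an illustration under simplifying assumptions: SOURCE is a cut-down stand-in for mod1.py, and execute_as and set_var1 are made-up helper names, not anything the import system provides. The same source is executed into two module objects, and a global set through one is invisible in the other:

```python
import types

# Cut-down stand-in for the contents of mod1.py (illustration only).
SOURCE = """\
var1 = None

def set_var1(value):
    global var1
    var1 = value
"""

def execute_as(name):
    """Run SOURCE in a fresh module object called `name` and return it."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module

as_main = execute_as('__main__')  # what running the file as a script produces
as_mod1 = execute_as('mod1')      # what a later import of mod1 produces

as_main.set_var1('A')             # "global var1" refers to as_main's namespace

print(as_main.var1)  # A
print(as_mod1.var1)  # None
```

Each function keeps a reference to the namespace it was defined in, which is why the global statement cannot reach across to the other copy.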
But when you use python driver.py, driver.py becomes the '__main__' main file of the program, and mod1 will be imported just once into the sys.modules['mod1'] structure. This time round, func1A stores var1 in the sys.modules['mod1'] structure, and that's what func1B will find.
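To check both cases side by side, here is a small reproduction that writes the three files from the question to a temporary directory and runs each entry point in a fresh interpreter. The run_in_fresh_interpreter helper is hypothetical scaffolding (Python 3.7+), print var1 is written as print(var1) so it runs on Python 3, and var1 is initialised with var1 = None, as the described output implies:

```python
import os
import subprocess
import sys
import tempfile

# The three files from the question, as source strings.
FILES = {
    'mod1.py': (
        "import mod2\n"
        "var1 = None\n"
        "def func1A():\n"
        "    global var1\n"
        "    var1 = 'A'\n"
        "    mod2.func2()\n"
        "def func1B():\n"
        "    print(var1)\n"
        "if __name__ == '__main__':\n"
        "    func1A()\n"
    ),
    'mod2.py': (
        "import mod1\n"
        "def func2():\n"
        "    mod1.func1B()\n"
    ),
    'driver.py': (
        "import mod1\n"
        "if __name__ == '__main__':\n"
        "    mod1.func1A()\n"
    ),
}

def run_in_fresh_interpreter(script, files=FILES):
    """Write `files` to a temp dir, run `script` there, and return its stdout."""
    with tempfile.TemporaryDirectory() as workdir:
        for name, source in files.items():
            with open(os.path.join(workdir, name), 'w') as handle:
                handle.write(source)
        result = subprocess.run([sys.executable, script], cwd=workdir,
                                capture_output=True, text=True, check=True)
    return result.stdout.strip()

as_script = run_in_fresh_interpreter('mod1.py')     # mod1.py is '__main__' and 'mod1'
via_driver = run_in_fresh_interpreter('driver.py')  # mod1.py is only 'mod1'

print(as_script)   # None
print(via_driver)  # A
```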