I ran across this case of UnboundLocalError recently, which seems strange:
    import pprint

    def main():
        if 'pprint' in globals(): print 'pprint is in globals()'
        pprint.pprint('Spam')
        from pprint import pprint
        pprint('Eggs')

    if __name__ == '__main__': main()
Which produces:
    pprint is in globals()
    Traceback (most recent call last):
      File "weird.py", line 9, in <module>
        if __name__ == '__main__': main()
      File "weird.py", line 5, in main
        pprint.pprint('Spam')
    UnboundLocalError: local variable 'pprint' referenced before assignment
pprint is clearly bound in globals, and is going to be bound in locals in the following statement. Can someone offer an explanation of why it isn't happy resolving pprint to the binding in globals here?
Edit: Thanks to the good responses I can clarify my question with relevant terminology:
At compile time the identifier pprint is marked as local to the frame. Does the execution model have no distinction where within the frame the local identifier is bound? Can it say, "refer to the global binding up until this bytecode instruction, at which point it has been rebound to a local binding," or does the execution model not account for this?
This question got answered several weeks ago, but I think I can clarify the answers a little. First some facts.
1: In Python,
    import foo
is almost exactly the same as
    foo = __import__("foo")
2: When executing code in a function, if Python encounters a variable that hasn't been defined in the function yet, it looks in the global scope.
3: Python has an optimization it uses for functions called "locals". When Python compiles a function, it keeps track of all the variables you assign to. It gives each of these variables a number from a per-function counter that goes up by one for each new variable. When Python runs the function, it creates an array with as many slots as there are local variables, and it fills each slot with a special value that means "has not been assigned to yet"; that array is where the values for those variables are stored. If you reference a local that hasn't been assigned to yet, Python sees that special value and raises an UnboundLocalError exception.
The stage is now set. Your "from pprint import pprint" is really a form of assignment. So Python creates a local variable called "pprint" which occludes the global variable. Then, when you refer to "pprint.pprint" in the function, you hit the special value and Python throws the exception. If you didn't have that import statement in the function, Python would use the normal look-in-locals-first-then-look-in-globals resolution and find the pprint module in globals.
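The bookkeeping described above can be checked directly. This is only a minimal sketch, assuming CPython and written in Python 3 syntax (the question's code is Python 2). co_varnames is the code object's tuple of local variable names, and outcome is just an illustrative variable name.

```python
import pprint


def main():
    pprint.pprint('Spam')        # fails: pprint is a local slot that is still empty
    from pprint import pprint    # this binding is what makes pprint local to main()
    pprint('Eggs')


# The compiler has already recorded pprint as a local before main() ever runs.
print(main.__code__.co_varnames)  # ('pprint',)

try:
    main()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)  # UnboundLocalError
```

Note that the name is local for the whole function body, not just from the import statement onward, which is why the global module is never consulted.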
To disambiguate this you can use the "global" keyword. Of course by now you've already worked past your problem, and I don't know whether you really needed "global" or if some other approach was called for.
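For what it's worth, here is a rough sketch of what "global" does in this case, next to one possible "other approach" (importing under a different local name). The alias pp and both function names are my own illustrative choices, not something from the question, and the code is in Python 3 syntax.

```python
import pprint
import types


def main_with_alias():
    pprint.pprint('Spam')              # no local named pprint, so the global module is found
    from pprint import pprint as pp    # binds only the local name pp
    pp('Eggs')


def main_with_global():
    global pprint                      # every use of pprint below is the module-level name
    pprint.pprint('Spam')              # still the module at this point
    from pprint import pprint          # rebinds the *global* name to the function
    pprint('Eggs')


main_with_alias()
module_after_alias = isinstance(pprint, types.ModuleType)

main_with_global()
module_after_global = isinstance(pprint, types.ModuleType)

print(module_after_alias, module_after_global)  # True False
```

The side effect of the "global" version is that the module-level pprint is no longer the module afterwards, so a second call to main_with_global() would fail on pprint.pprint.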
Looks like Python sees the from pprint import pprint line and marks pprint as a name local to main() before executing any code. Since Python thinks pprint ought to be a local variable, referencing it with pprint.pprint() before "assigning" it with the from..import statement throws that error. That's as much sense as I can make of that.
The moral, of course, is to always put those import statements at the top of the scope.
Where's the surprise? Any variable global to a scope that you reassign within that scope is marked local to that scope by the compiler.
If imports were handled differently, that would be surprising, imho.
It may make a case for not naming modules after symbols used therein, or vice versa, though.
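To illustrate the point that imports are not special here, the same thing happens with a plain assignment. A small sketch in Python 3 syntax; counter and the two function names are made up for the example.

```python
counter = 10


def read_only():
    return counter + 1      # no assignment in this function, so counter is the global


def read_then_assign():
    value = counter + 1     # counter is assigned below, so it is local here and still unbound
    counter = value
    return counter


print(read_only())  # 11

try:
    read_then_assign()
    assignment_outcome = "no error"
except UnboundLocalError:
    assignment_outcome = "UnboundLocalError"

print(assignment_outcome)  # UnboundLocalError
```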
Well, that was interesting enough for me to experiment a bit, and I read through http://docs.python.org/reference/executionmodel.html
Then I did some tinkering with your code here and there; this is what I could find:
code:
output:
In the function two(), from pprint import pprint is executed, but it does not override the name pprint in globals, since the global keyword is not used in the scope of two().
In the function three(), since there is no declaration of the name pprint in the local scope, it defaults to the global name pprint, which is a module.
Whereas in main(), the keyword global is used first, so all references to pprint in the scope of main() refer to the global name pprint, which as we can see is a module at first and is overridden in the global namespace with a function when we do the from pprint import pprint.
Though this may not answer the question as such, it is nevertheless an interesting fact, I think.
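The code and output for this experiment did not survive in the post, so the following is only a hypothetical reconstruction that follows the description of two(), three() and main() above. The helper kind() and the results list are my own additions, and the syntax is Python 3.

```python
import pprint
import types


def kind():
    # Report what the module-level name pprint currently refers to.
    if isinstance(globals()["pprint"], types.ModuleType):
        return "module"
    return "function"


def two():
    from pprint import pprint    # binds a local name only
    pprint("two")
    return kind()


def three():
    pprint.pprint("three")       # no local binding, so the global module is used
    return kind()


def main():
    global pprint
    before = kind()
    from pprint import pprint    # rebinds the global name
    pprint("main")
    return before, kind()


results = [two(), three(), main()]
print(results)  # ['module', 'module', ('module', 'function')]
```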
=====================
Edit: Another interesting thing.
If you have a module, say:
mod1
and another module, say:
mod2
which at first sight is seemingly correct, since you have imported the module datetime in mod2. Now if you try to run mod2 as a script, it will throw an error, because the second import, from mod1 import *, has overridden the name datetime in the namespace, hence the first import datetime is not valid anymore.
Moral: the order of imports, the nature of imports (from x import *) and the awareness of imports within imported modules all matter.
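The module listings are missing from the post, so here is a self-contained sketch of the effect under assumed contents: mod1 is assumed to contain only "from datetime import datetime", and mod2 is assumed to do "import datetime" followed by a star import of mod1. Both are built in memory here rather than as files, purely for illustration.

```python
import sys
import types

# Hypothetical stand-in for mod1.py (assumed contents, not from the post).
mod1 = types.ModuleType("mod1")
exec("from datetime import datetime", mod1.__dict__)
sys.modules["mod1"] = mod1

# Hypothetical stand-in for mod2.py (assumed contents, not from the post).
mod2_source = """
import datetime
from mod1 import *
"""
mod2_namespace = {}
exec(mod2_source, mod2_namespace)

# After the star import, datetime in mod2's namespace is the class, not the module.
shadowed = mod2_namespace["datetime"]
shadowed_is_module = isinstance(shadowed, types.ModuleType)
print(shadowed_is_module)  # False

try:
    shadowed.date.today()      # would work on the module, fails on the class
    star_outcome = "no error"
except AttributeError:
    star_outcome = "AttributeError"

print(star_outcome)  # AttributeError
```

Swapping the two import lines in mod2_source makes import datetime the last binding, so the name refers to the module again.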