import urllib.parse fails when Python run from com

2019-02-22 09:39发布

问题:

I have observed the following behavior in python 3.4.2, and I am unableto explain it. Hopefully someone could shed some light on the matter:

In IPython:

In [129]: import urllib

In [130]: print(urllib.parse)
<module 'urllib.parse' from '/Users/ashwin/.pyenv/versions/3.4.2/lib/python3.4/urllib/parse.py'>

I've imported a module, and printed one of its attributes. Everything works as expected. So far, life is good.

Now, I do the same thing from the command line:

$ python -c 'import urllib; print(urllib.parse)'  
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute 'parse'

Say what?! that's not how that's supposed to work.
Ok, maybe this is a python-wide behavior; maybe modules are not immediately imported when using the -c flag. Let's try another module:

$ python -c 'import datetime; print(datetime.datetime)'
<class 'datetime.datetime'>

What?! How does it work for datetime and not for urllib? I'm using the same version of python in both places (3.4.2)

Does anyone have any thoughts on this?

EDIT:

Per one of the comments:

$ which -a ipython
/Users/ashwin/.pyenv/shims/ipython
/Library/Frameworks/Python.framework/Versions/2.7/bin/ipython
/usr/local/bin/ipython
/usr/local/bin/ipython

And

$ which -a python
/Users/ashwin/.pyenv/shims/python
/Library/Frameworks/Python.framework/Versions/2.7/bin/python
/usr/bin/python
/usr/bin/python

回答1:

When you run import urllib, it creates the module object of the urllib module (which is actually a package) without importing its submodules (parse, request etc.).

You need the parent module object (urllib) to be in your namespace if you want to access its submodule using attribute access. In addition to that, that submodule must already be loaded (imported). From the documentation:

if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule. [...] The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.

There is only one instance of each module, thus any changes made to the urllib module object (stored in sys.modules['urllib']) get reflected everywhere.

You don't import urllib.parse, but IPython does. To prove this I'm going to create a startup file:

import urllib
print('Running the startup file: ', end='')
try:
    # After importing  'urllib.parse' ANYWHERE,
    # 'urllib' will have the 'parse' attribute.
    # You could also do "import sys; sys.modules['urllib'].parse"
    urllib.parse
except AttributeError:
    print("urllib.parse hasn't been imported yet")
else:
    print('urllib.parse has already been imported')
print('Exiting the startup file.')

and launch ipython

vaultah@base:~$ ipython
Running urllib/parse.py
Running the startup file: urllib.parse has already been imported
Exiting the startup file.
Python 3.6.0a0 (default:089146b8ccc6, Sep 25 2015, 14:16:56) 
Type "copyright", "credits" or "license" for more information.

IPython 4.0.0 -- An enhanced Interactive Python.

It is the side effect of importing pydoc during the startup of IPython (which ipython is /usr/local/bin/ipython):

/usr/local/bin/ipython, line 7:
  from IPython import start_ipython
/usr/local/lib/python3.6/site-packages/IPython/__init__.py, line 47:
  from .core.application import Application
/usr/local/lib/python3.6/site-packages/IPython/core/application.py, line 24:
  from IPython.core import release, crashhandler
/usr/local/lib/python3.6/site-packages/IPython/core/crashhandler.py, line 28:
  from IPython.core import ultratb
/usr/local/lib/python3.6/site-packages/IPython/core/ultratb.py, line 90:
  import pydoc
/usr/local/lib/python3.6/pydoc.py, line 68:
  import urllib.parse

This explains why the below code fails - you only import urllib and nothing seems to import urllib.parse:

$ python -c 'import urllib; print(urllib.parse)'

On the other hand, the following command works because datetime.datetime is not a module. It's a class that gets imported during import datetime.

$ python -c 'import datetime; print(datetime.datetime)'


回答2:

urllib.parse is available from Python 3 onwards. I think you might need to import urllib.parse, not import urllib. Not sure if (when) submodule import is implicit.

I would guess IPython imports urllib.parse on startup and that is why it is available.

parse is a module not an attribute:

Python 3.4.2 (default, Oct 15 2014, 22:01:37)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> urllib.parse
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'parse'
>>> import urllib.parse
>>> urllib.parse
<module 'urllib.parse' from '/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/urllib/parse.py'>