How to check if a module/library/package is part of the python standard library?

Posted 2019-02-12 16:14

Question:

I have installed so many libraries/modules/packages with pip that I can no longer tell which ones are part of the Python standard library and which are not. This causes problems when my code works on my machine but doesn't work anywhere else.

How can I check if a module/library/package that I import in my code is from the python stdlib?

Assume that the checking is done on the machine with all the external libraries/modules/packages, otherwise I could simply do a try-except import on the other machine that doesn't have them.

For example, I am sure these imports work on my machine, but when it's on a machine with only a plain Python install, it breaks:

from bs4 import BeautifulSoup
import nltk
import PIL
import gensim

Answer 1:

You'd have to check all modules that have been imported to see if any of these are located outside of the standard library.

The following script is not bulletproof but should give you a starting point:

import sys
import os

external = set()
exempt = set()
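# sys.real_prefix only exists inside a legacy virtualenv and sys.base_prefix
# only on Python 3.3+, so collect whichever prefixes this interpreter has
prefixes = tuple(os.path.abspath(getattr(sys, attr))
                 for attr in ('prefix', 'base_prefix', 'real_prefix')
                 if hasattr(sys, attr))
paths = (os.path.abspath(p) for p in sys.path)
stdlib = {p for p in paths
          if p.startswith(prefixes)
          and 'site-packages' not in p}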
for name, module in sorted(sys.modules.items()):
    if not module or name in sys.builtin_module_names or not getattr(module, '__file__', None):
        # an import sentinel, built-in module or not a real module, really
        exempt.add(name)
        continue

    fname = module.__file__
    if fname.endswith(('__init__.py', '__init__.pyc', '__init__.pyo')):
        fname = os.path.dirname(fname)

    if os.path.dirname(fname) in stdlib:
        # stdlib path, skip
        exempt.add(name)
        continue

    parts = name.split('.')
    for i, part in enumerate(parts):
        partial = '.'.join(parts[:i] + [part])
        if partial in external or partial in exempt:
            # already listed or exempted
            break
        if partial in sys.modules and sys.modules[partial]:
            # just list the parent name and be done with it
            external.add(partial)
            break

for name in sorted(external):
    # formatted string so the line prints the same on Python 2 and 3
    print('%s %s' % (name, sys.modules[name].__file__))

Put this in a new module, import it after all the other imports in your script, and it will print every module that it thinks is not part of the standard library.



Answer 2:

The standard library is defined by the Python documentation, which lists every module it contains. You can simply look a module up there, or put the stdlib module names into a list and check against it programmatically.
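As a rough sketch of the "check against a list" idea: Python 3.10 and later expose such a list as sys.stdlib_module_names, so the check can be a set lookup. The helper name is_stdlib_name and the small FALLBACK_NAMES set are made up for this illustration; the fallback is deliberately incomplete and only stands in for a list you would copy from the documentation on older interpreters.

```python
import sys

# Hypothetical, incomplete hand-written list for interpreters older than 3.10.
FALLBACK_NAMES = frozenset({"os", "sys", "re", "json", "traceback", "collections"})


def is_stdlib_name(name):
    """Return True if the top-level package of `name` is a known stdlib name."""
    top_level = name.partition(".")[0]
    # sys.stdlib_module_names exists on Python 3.10+ only.
    known = getattr(sys, "stdlib_module_names", FALLBACK_NAMES)
    return top_level in known


for candidate in ("os.path", "re", "bs4", "nltk", "PIL", "gensim"):
    print(candidate, is_stdlib_name(candidate))
```

This only looks at names, so it says nothing about whether a module is actually installed, and it cannot be fooled by a third-party package that is already imported.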

Alternatively, Python 3.4 added an isolated mode (-I) that ignores the PYTHON* environment variables and the per-user site-packages directory. In earlier versions of Python you can use -s to skip the per-user site-packages directory and -E to ignore the PYTHON* environment variables.
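One way to use those flags from a script, sketched under some assumptions: start a child interpreter and see whether the import succeeds there. The function name importable_without_site is invented for this example, and the extra -S flag is my addition, not something mentioned above. It skips the site module so that the regular site-packages directory is not added either. This needs Python 3.5+ for subprocess.run.

```python
import subprocess
import sys


def importable_without_site(name):
    """Return True if `name` imports in a child interpreter run with -I -S."""
    result = subprocess.run(
        [
            sys.executable,
            "-I",  # isolated mode: implies -E and -s
            "-S",  # assumption: also skip site, so site-packages is not added
            "-c",
            "import importlib, sys; importlib.import_module(sys.argv[1])",
            name,
        ],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


for candidate in ("traceback", "re", "numpy"):
    print(candidate, importable_without_site(candidate))
```

Because the check runs in a fresh process, modules already imported in the current session do not affect the result. The cost is one interpreter start-up per module.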

In Python 2, a very simple way to check if a module is part of the standard library is to clear sys.path:

>>> import sys
>>> sys.path = []
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named numpy
>>> import traceback
>>> import os
>>> import re

However, this doesn't work in Python 3.3+:

>>> import sys
>>> sys.path = []
>>> import traceback
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'traceback'
[...]

This is because the import machinery changed in Python 3.3: importing the standard library now uses the same path-based mechanism as importing any other module (see the documentation).

In Python 3.3+, the way to make sure that only stdlib imports succeed is to put only the standard library path on sys.path, for example:

>>> import os, sys, traceback
>>> lib_path = os.path.dirname(traceback.__file__)
>>> sys.path = [lib_path]
>>> import traceback
>>> import re
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'numpy'

I used the traceback module to get the library path, since this should work on any system.
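A variation on the same idea that avoids touching sys.path, offered as a sketch rather than a definitive check: ask sysconfig where the standard library lives and compare that with where a module would be loaded from. The helper names _stdlib_dirs and lives_in_stdlib are invented here. The sketch assumes Python 3.4+, top-level module names, and a stdlib stored as ordinary files; extension modules kept outside those directories (for example in DLLs on Windows) would be reported as non-stdlib.

```python
import importlib.util
import os
import sys
import sysconfig


def _stdlib_dirs():
    """Directories that sysconfig reports for the standard library."""
    paths = sysconfig.get_paths()
    return {
        os.path.normcase(os.path.realpath(paths[key]))
        for key in ("stdlib", "platstdlib")
    }


def lives_in_stdlib(name):
    """Return True if top-level module `name` would load from the stdlib."""
    if name in sys.builtin_module_names:
        return True
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return False
    if spec is None or spec.origin is None:
        return False
    if spec.origin in ("built-in", "frozen"):
        return True
    origin = os.path.normcase(os.path.realpath(spec.origin))
    # site-packages can sit inside the stdlib directory, so rule it out first.
    if "site-packages" in origin or "dist-packages" in origin:
        return False
    return any(origin.startswith(d + os.sep) for d in _stdlib_dirs())


for candidate in ("traceback", "re", "numpy"):
    print(candidate, lives_in_stdlib(candidate))
```

For a top-level name, find_spec only locates the module and does not import it, so nothing is loaded as a side effect.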

For the built-in modules, which are a subset of the stdlib modules, you can check sys.builtin_module_names.
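A minimal sketch of that check, assuming Python 3 (the helper name is_builtin is made up for this example):

```python
import sys


def is_builtin(name):
    """Return True if `name` is compiled into the interpreter itself."""
    return name in sys.builtin_module_names


for candidate in ("sys", "builtins", "json"):
    print(candidate, is_builtin(candidate))
```

A False result here does not mean the module is third-party, only that it is not compiled into the interpreter. Most stdlib modules are ordinary files on disk.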



Answer 3:

@Bakuriu's answer was very useful to me. The only problem I ran into is checking whether a particular module is stdlib when it has already been imported. In that case sys.modules already has an entry for it, so even if sys.path is stripped, the import will succeed:

In [1]: import sys

In [2]: import virtualenv

In [3]: sys.path = []

In [4]: try:
    __import__('virtualenv')
except ImportError:
    print(False)
else:
    print(True)
   ...:
True

vs

In [1]: import sys

In [2]: sys.path = []

In [3]: try:
    __import__('virtualenv')
except ImportError:
    print(False)
else:
    print(True)
   ...:
False

I whipped up the following solution, which seems to work in both Python 2 and Python 3:

from __future__ import unicode_literals, print_function
import sys
from contextlib import contextmanager
from importlib import import_module


@contextmanager
def ignore_site_packages_paths():
    paths = sys.path
    # remove all third-party paths (and empty entries)
    # so that only stdlib imports will succeed
    sys.path = list(filter(
        None,
        filter(lambda i: 'site-packages' not in i, sys.path)
    ))
    try:
        yield
    finally:
        # restore the original path even if the body raises
        sys.path = paths


def is_std_lib(module):
    if module in sys.builtin_module_names:
        return True

    with ignore_site_packages_paths():
        imported_module = sys.modules.pop(module, None)
        try:
            import_module(module)
        except ImportError:
            return False
        else:
            return True
        finally:
            if imported_module:
                sys.modules[module] = imported_module
