Alternatives to imp.find_module?

2019-04-09 00:25发布

Background

I've grown tired of the issue with pylint not being able to import files when you use namespace packages and divide your code-base into separate folders. As such I started digging into the astNG source-code which has been identified as the source of the trouble (see bugreport 8796 on astng). At the heart of the issue seems to be the use of pythons own imp.find_module in the process of finding imports.

What happens is that the import's first (sub)package - a in import a.b.c - is fed to find_module with a None path. Whatever path comes back is then fed into find_module the next pass in the look up loop where you try to find b in the previous example.

Pseudo-code from logilab.common.modutils:

path = None
while import_as_list:
      try:
           _, found_path, etc = find_module(import_as_list[0], path)
      #exception handling and checking for a better version in the .egg files
      path = [found_path]
      import_as_list.pop(0)

The Problem

This is what's broken: you only get the first best hit from find_module, which may or may not have your subpackages in it. If you DON'T find the subpackages, you have no way to back out and try the next one.

I tried explicitly using sys.path instead of None, so that the result could be removed from the path list and a second attempt be made, but python's module finder is clever enough that there doesn't have to be an exact match in the paths, making this approach unusable - to the best of my knowledge anyway.

Teary-eyed Plea

Is there an alternative to find_modules which will return ALL possible matches or take an exclude list? I'm also open to completely different solutions. Preferably not patching python by hand, but it wouldn't be impossible - at least for a local solution.

(Caveat emptor: I'm running python 2.6 and for reasons of current company policy can't upgrade, suggestions for p3k etc won't get marked as accepted unless it's the only answer.)

3条回答
ら.Afraid
2楼-- · 2019-04-09 01:18

I've grown tired of this limitation in PyLint too.

I don't know a replacement for imp.find_modules(), but I think I found another way to deal with namespace packages in PyLint. See my comment on the bug report you linked to (http://www.logilab.org/ticket/8796).

The idea is to use pkg_resources to find namespace packages. Here's my addition to logilab.common.modutils._module_file(), just after while modpath:

  while modpath:
      if modpath[0] in pkg_resources._namespace_packages and len(modpath) > 1:
          module = sys.modules[modpath.pop(0)]
          path = module.__path__

This not very refined and only handles top-level namespace packages though.

查看更多
霸刀☆藐视天下
3楼-- · 2019-04-09 01:21

warning + disclaimer: not tested yet!

before:

for part in parts:
    modpath.append(part)
    curname = '.'.join(modpath)
    # ...
    if module is None:
        mp_file, mp_filename, mp_desc = imp.find_module(part, path)
        module = imp.load_module(curname, mp_file, mp_filename, mp_desc)

after: - thanks pjeby for mentioning pkgutil!

for part in parts:
    modpath.append(part)
    curname = '.'.join(modpath)
    # ...
    if module is None:
        # + https://stackoverflow.com/a/14820895/611007
        # # mp_file, mp_filename, mp_desc = imp.find_module(part, path)
        # # module = imp.load_module(curname, mp_file, mp_filename, mp_desc)
        import pkgutil
        mp_file = None
        for loadr,name,ispkg in pkgutil.iter_modules(path=path,prefix='.'.join(modpath[:-1])+'.'):
            if name.split('.')[-1] == part:
                if not hasattr(loadr,'path') and hasattr(loadr,'archive'):
                    # with zips `name` was like '.somemodule'
                    # it gives `RuntimeWarning: Parent module '' not found while handling absolute import`
                    # I expect the name I need to be 'somemodule'
                    # TODO: I don't know why python does this or what the correct usage is.
                    # https://stackoverflow.com/questions/2267984/
                    if name and name[0] == '.':
                        name = name[1:]
                    ldr= loadr.find_module(name,loadr.archive)
                    module = ldr.load_module(name)
                    break
                imploader= loadr.find_module(name,loadr.path)
                mp_file,mp_filename,mp_desc= imploader.file,imploader.filename,imploader.etc
                module = imploader.load_module(imploader.fullname)
                break
        if module is None:
            raise ImportError
查看更多
干净又极端
4楼-- · 2019-04-09 01:23

Since Python 2.5, the right way to do this is with pkgutil.iter_modules() (for a flat list) or pkgutil.walk_packages() (for a subpackage tree). Both are fully compatible with namespace packages.

For example, if I wanted to find just the subpackages/submodules of 'jmb', I would do:

import jmb, pkgutil
for (module_loader, name, ispkg) in pkgutil.iter_modules(jmb.__path__, 'jmb.'):
    # 'name' will be 'jmb.foo', 'jmb.bar', etc.
    # 'ispkg' will be true if 'jmb.foo' is a package, false if it's a module

You can also use iter_modules or walk_packages to walk all the modules on sys.path; see the docs linked above for details.

查看更多
登录 后发表回答