Spawn multiprocessing.Process under different pyth

Posted 2019-04-29 11:30

Question:

I have two versions of Python (these are actually two conda environments)

/path/to/bin-1/python
/path/to/bin-2/python

From one version of python I want to launch a function that runs in the other version using something like the multiprocessing.Process object. It turns out that this is doable using the set_executable method:

ctx = multiprocessing.get_context('spawn')
ctx.set_executable('/path/to/bin-2/python')

And indeed, the child process launches with that executable:

def f(q):
    import sys
    q.put(sys.executable)

if __name__ == '__main__':
    import multiprocessing
    ctx = multiprocessing.get_context('spawn')
    ctx.set_executable('/path/to/bin-2/python')
    q = ctx.Queue()
    proc = ctx.Process(target=f, args=(q,))
    proc.start()
    print(q.get())

$ python foo.py
/path/to/bin-2/python

However, sys.path Is Wrong

However, when I do the same thing with sys.path instead of sys.executable, the child prints the sys.path of the hosting Python process, not the sys.path I get from running /path/to/bin-2/python -c "import sys; print(sys.path)" directly.

I'm used to this sort of thing if I use fork. I would have expected 'spawn' to act the same as though I had entered the python interpreter from the shell.
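
One way to see the discrepancy is to ask the target interpreter for its own sys.path in a plain subprocess and compare it with the path of the current process. This is a minimal sketch, assuming Python 3. native_sys_path is a hypothetical helper name, and sys.executable stands in for /path/to/bin-2/python so that the snippet runs anywhere:

```python
import json
import subprocess
import sys


def native_sys_path(python):
    """Return the sys.path that `python` builds when started on its own."""
    code = "import json, sys; print(json.dumps(sys.path))"
    out = subprocess.check_output([python, "-c", code])
    return json.loads(out.decode())


if __name__ == "__main__":
    # Stand-in for '/path/to/bin-2/python'; substitute the real path.
    target = sys.executable
    native = native_sys_path(target)
    # Entries this process has that the target would not have natively.
    extra = [p for p in sys.path if p not in native]
    print("native:", native)
    print("only in this process:", extra)
```

Because the helper uses -c, the first entry of the returned list is the empty string (the current directory). Inside a spawned function, assigning sys.path[:] = native_sys_path(sys.executable) might be one way to restore the path the interpreter would have had from the shell, though that is an untested suggestion rather than part of the original approach.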

Question

Is it possible to use the multiprocessing library to run functions and use Queues from another Python executable, with the environment that executable would have if I had started it from the shell?

More broadly, how does sys.path get populated and what is different between using multiprocessing in this way and launching the interpreter directly?
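
As far as CPython's implementation goes, the 'spawn' start method appears to send the parent's sys.path to the child along with other preparation data, and the child installs that list before running the target. That would explain why the host's path shows up. The sketch below peeks at that data. Note that multiprocessing.spawn.get_preparation_data is an undocumented internal helper, so its use here is an assumption about current CPython 3 versions rather than a stable API, and the marker path is made up for illustration:

```python
import sys
from multiprocessing import spawn

# Hypothetical marker entry, only here to make the parent's path recognisable.
MARKER = "/hypothetical/parent-only/path"


def parent_path_sent_to_child():
    """Return the sys.path list that 'spawn' would hand to a child process."""
    sys.path.append(MARKER)
    try:
        # Internal CPython helper (undocumented): builds the dict that the
        # parent pickles and sends to a freshly spawned child.
        data = spawn.get_preparation_data("demo")
    finally:
        sys.path.remove(MARKER)
    return data["sys_path"]


if __name__ == "__main__":
    # Reports whether the parent-only entry is part of what the child receives.
    print(MARKER in parent_path_sent_to_child())
```

If the marker appears in the returned list, the child's sys.path is being copied from the parent rather than computed by the new interpreter at startup.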

Answer 1:

I ran into the same problem. My system-wide Python executable is at /path/to/bin-1/python, and I created a virtual environment with virtualenv that contains another Python executable at /path/to/bin-2/python. To set up the path and environment that the spawned process needs for /path/to/bin-2/python, I ended up copying the code from activate_this.py in the virtualenv folder into f(q).

def f(q):
    import sys, os

    def active_virtualenv(exec_path):
        """
        copy virtualenv's activate_this.py
        exec_path: the python.exe path from sys.executable
        """
        # set env. var. PATH
        old_os_path = os.environ.get('PATH', '')
        os.environ['PATH'] = os.path.dirname(os.path.abspath(exec_path)) + os.pathsep + old_os_path
        base = os.path.dirname(os.path.dirname(os.path.abspath(exec_path)))
        # site-packages path
        if sys.platform == 'win32':
            site_packages = os.path.join(base, 'Lib', 'site-packages')
        else:
            # Use version_info: sys.version[:3] yields '3.1' on Python 3.10+
            site_packages = os.path.join(base, 'lib', 'python%d.%d' % sys.version_info[:2], 'site-packages')
        # modify sys.path
        prev_sys_path = list(sys.path)
        import site
        site.addsitedir(site_packages)
        sys.real_prefix = sys.prefix
        sys.prefix = base
        # Move the added items to the front of the path:
        new_sys_path = []
        for item in list(sys.path):
            if item not in prev_sys_path:
                new_sys_path.append(item)
                sys.path.remove(item)
        sys.path[:0] = new_sys_path
        return None

    active_virtualenv(sys.executable)
    q.put(sys.executable)
    # check some unique package in this env.
    import special_package
    print("package version: {}".format(special_package.__version__))


if __name__ == '__main__':
    import multiprocessing
    multiprocessing.set_executable('/path/to/bin-2/python')
    q = multiprocessing.Queue()
    proc = multiprocessing.Process(target=f, args=(q,))
    proc.start()
    # Read from the queue before joining so the child can flush its data and exit.
    print(q.get())
    proc.join()

Output:

$ python foo.py
/path/to/bin-2/python
package version: unique_version_only_in_virtualenv

One thing I'm not sure about: sys and os are imported before active_virtualenv() is called, which means they come from the system-wide Python environment, while the other packages I need in f(q) come from the virtual environment. It may be worth re-importing them after switching environments.
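
A quick way to check that concern is to print where already-imported modules were loaded from inside the child. This is a minimal sketch, and module_origin is a hypothetical helper name. Built-in modules such as sys have no file on disk, so they always belong to whichever interpreter is running:

```python
import importlib
import sys


def module_origin(name):
    """Return the file a module was loaded from, or 'built-in' if it has none."""
    module = sys.modules.get(name) or importlib.import_module(name)
    return getattr(module, "__file__", None) or "built-in"


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    for name in ("sys", "os", "site"):
        print(name, "->", module_origin(name))
```

Calling module_origin('os') inside f(q), before and after active_virtualenv(), would show whether os came from the base installation or from the virtual environment's tree.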