How to let Pool.map take a lambda function

Posted 2019-01-08 22:15

Question:

I have the following function:

def copy_file(source_file, target_dir):
    pass

Now I would like to use multiprocessing to run this function on many files in parallel:

p = Pool(12)
p.map(lambda x: copy_file(x,target_dir), file_list)

The problem is that lambdas can't be pickled, so this fails. What is the neatest (most Pythonic) way to fix this?
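To see why the call fails, here is a minimal sketch. Pool.map has to pickle the callable to send it to the worker processes, and the standard pickle module serializes a function only as a reference to its name, which a lambda does not have. The helper can_pickle and the "backup" directory are hypothetical names used for illustration.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def copy_file(source_file, target_dir):
    pass  # placeholder body, as in the question


if __name__ == "__main__":
    # A named module-level function is pickled by reference to its name.
    print(can_pickle(copy_file))
    # A lambda has no importable name, so pickling it fails.
    print(can_pickle(lambda x: copy_file(x, "backup")))
```

Any fix therefore has to replace the lambda with something pickle can look up by name, or swap in a different serializer.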

Answer 1:

Use a function object:

class Copier(object):
    def __init__(self, tgtdir):
        self.target_dir = tgtdir
    def __call__(self, src):
        copy_file(src, self.target_dir)

To run your Pool.map:

p.map(Copier(target_dir), file_list)
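Put together, the function-object approach might look like the sketch below. It assumes Python 3, and copy_file here is a stand-in that only computes the destination path rather than copying anything, so the example is safe to run; the file names and the "backup" directory are made up.

```python
import os
from multiprocessing import Pool


def copy_file(source_file, target_dir):
    # Stand-in for the real copy_file: returns the destination path
    # instead of touching the disk.
    return os.path.join(target_dir, os.path.basename(source_file))


class Copier(object):
    """Picklable callable that remembers the target directory."""

    def __init__(self, target_dir):
        self.target_dir = target_dir

    def __call__(self, source_file):
        return copy_file(source_file, self.target_dir)


if __name__ == "__main__":
    file_list = ["a/one.txt", "b/two.txt"]  # hypothetical inputs
    with Pool(2) as pool:
        print(pool.map(Copier("backup"), file_list))
```

This works because Copier is defined at module level, so pickle can serialize an instance as a reference to the class plus its attribute dictionary.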


Answer 2:

For Python 2.7+ or Python 3, you could use functools.partial:

import functools
copier = functools.partial(copy_file, target_dir=target_dir)
p.map(copier, file_list)
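A self-contained version of the same idea, as a sketch assuming Python 3. The helper make_copier is a hypothetical name, and copy_file is again a stand-in that returns the destination path instead of copying.

```python
import functools
import os
from multiprocessing import Pool


def copy_file(source_file, target_dir):
    # Stand-in for the real copy_file: returns the destination path
    # instead of touching the disk.
    return os.path.join(target_dir, os.path.basename(source_file))


def make_copier(target_dir):
    # Bind target_dir by keyword so that the item supplied by Pool.map
    # fills the remaining positional parameter, source_file.
    return functools.partial(copy_file, target_dir=target_dir)


if __name__ == "__main__":
    file_list = ["a/one.txt", "b/two.txt"]  # hypothetical inputs
    with Pool(2) as pool:
        print(pool.map(make_copier("backup"), file_list))
```

Binding by keyword matters here: functools.partial(copy_file, target_dir) would bind the first positional parameter, source_file, because positional arguments given to partial are placed before those supplied at call time.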


Answer 3:

This question is a bit old, but if you are still using Python 2, my answer may be useful.

The trick is to use part of the pathos project: multiprocess, a fork of multiprocessing. It removes the annoying pickling limitation of the original multiprocessing module.

Installation: pip install multiprocess

Usage:

>>> from multiprocess import Pool
>>> p = Pool(4)
>>> print p.map(lambda x: (lambda y:y**2)(x) + x, xrange(10))
[0, 2, 6, 12, 20, 30, 42, 56, 72, 90]


Answer 4:

According to another answer, pathos lets you run your lambda p.map(lambda x: copy_file(x,target_dir), file_list) directly, avoiding all the workarounds / hacks.