I have the following function:
def copy_file(source_file, target_dir):
    pass
Now I would like to use multiprocessing to execute this function in parallel:
from multiprocessing import Pool

p = Pool(12)
p.map(lambda x: copy_file(x, target_dir), file_list)
The problem is that lambdas can't be pickled, so this fails. What is the neatest (most Pythonic) way to fix this?
Use a function object:
class Copier(object):
    def __init__(self, tgtdir):
        self.target_dir = tgtdir

    def __call__(self, src):
        copy_file(src, self.target_dir)
To run your Pool.map:
p.map(Copier(target_dir), file_list)
For Python 2.7+ or Python 3, you can use functools.partial (partial objects, unlike lambdas, are picklable):
import functools
copier = functools.partial(copy_file, target_dir=target_dir)
p.map(copier, file_list)
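A quick sanity check of why this works (the path and the stand-in body of copy_file are invented for the demo): the partial survives the pickle round trip that Pool.map performs, while the equivalent lambda fails at pickle time.

```python
import functools
import pickle

def copy_file(source_file, target_dir):
    # Stand-in body for the demo: just report what would be copied.
    return (source_file, target_dir)

copier = functools.partial(copy_file, target_dir='/tmp/dest')

# The partial survives the round trip Pool.map needs...
restored = pickle.loads(pickle.dumps(copier))
print(restored('a.txt'))  # ('a.txt', '/tmp/dest')

# ...while the equivalent lambda raises at pickle time.
try:
    pickle.dumps(lambda x: copy_file(x, '/tmp/dest'))
    lambda_picklable = True
except pickle.PicklingError:
    lambda_picklable = False
print(lambda_picklable)  # False
```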
The question is a bit old, but if you are still using Python 2, my answer may be useful.

The trick is to use part of the pathos project: multiprocess, a fork of multiprocessing. It serializes with dill instead of pickle, which gets rid of the annoying limitation of the original multiprocessing.
Installation: pip install multiprocess
Usage:
>>> from multiprocess import Pool
>>> p = Pool(4)
>>> print p.map(lambda x: (lambda y:y**2)(x) + x, xrange(10))
[0, 2, 6, 12, 20, 30, 42, 56, 72, 90]
From this answer: pathos lets you run your lambda

p.map(lambda x: copy_file(x, target_dir), file_list)

directly, sparing you all the workarounds/hacks.