I'm running parallel processing in Python on Windows. Here's my code:
from math import sqrt
from joblib import Parallel, delayed

def f(x):
    return sqrt(x)

if __name__ == '__main__':
    a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
Here's the error message:
Process PoolWorker-2:
Process PoolWorker-1:
Traceback (most recent call last):
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.5.4.3105.win-x86_64\lib\multiprocessing\pool.py", line 102, in worker
    task = get()
  File "C:\Users\yoyo__000.BIGBLACK\AppData\Local\Enthought\Canopy\User\lib\site-packages\joblib\pool.py", line 363, in get
    return recv()
AttributeError: 'module' object has no attribute 'f'
According to this site, the problem is Windows-specific:
Yes: under Linux we are forking, thus there is no need to pickle the
function, and it works fine. Under Windows, the function needs to be
pickleable, i.e. it needs to be imported from another file. This is
actually good practice: making modules pushes for reuse.
I've tried your code and it works flawlessly under Linux.
Under Windows it runs OK when it is run as a script, e.g. python script_with_your_code.py,
but it fails when run in an interactive Python session. It worked for me when I saved the f
function in a separate module and imported it into my interactive session.
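The reason a separate module helps is that pickle stores a function by reference (its module name plus its function name), not by copying its code. The worker process then has to import that module and look the name up again. Here is a minimal sketch of that behaviour using only the standard library; it uses json.dumps purely as a stand-in for "a function that lives in an importable module", and it is written for Python 3, whereas the sessions in this post are Python 2.7.

```python
import json
import pickle

# A function that lives in an importable module is pickled as a reference:
# the payload holds the module name and the function name, not the code.
payload = pickle.dumps(json.dumps)
print(b"json" in payload, b"dumps" in payload)

# Unpickling re-imports the module and looks the name up again, so the
# result is the very same function object.
restored = pickle.loads(payload)
print(restored is json.dumps)

# A function that cannot be found by name in its module cannot be pickled.
# A lambda is the simplest case: its name is "<lambda>", which no module exports.
try:
    pickle.dumps(lambda x: x)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
print(lambda_pickled)
```

A function typed into an interactive session on Windows hits the lookup problem from the other side: it pickles as a reference to f in __main__, but the worker's own __main__ has no f, which is what the AttributeError reports.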
NOT WORKING:
Interactive session:
>>> from math import sqrt
>>> from joblib import Parallel, delayed
>>> def f(x):
...     return sqrt(x)
...
>>> if __name__ == '__main__':
...     a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
Process PoolWorker-1:
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python27\lib\multiprocessing\pool.py", line 102, in worker
    task = get()
  File "C:\Python27\lib\site-packages\joblib\pool.py", line 359, in get
    return recv()
AttributeError: 'module' object has no attribute 'f'
WORKING:
fun.py
from math import sqrt

def f(x):
    return sqrt(x)
Interactive session:
>>> from joblib import Parallel, delayed
>>> from fun import f
>>> if __name__ == '__main__':
...     a = Parallel(n_jobs=2)(delayed(f)(i) for i in range(10))
...
>>> a
[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]
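If you want to check in advance whether a function will survive the trip to a worker, a pickle round trip is a reasonable first test. The sketch below is only an illustration under a few assumptions: it writes a throwaway module named fun_demo (a hypothetical name, standing in for fun.py above) into a temporary directory, imports it, and round-trips f through pickle. It uses the standard library only and does not call joblib.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Write a small module to a temporary directory. "fun_demo" is a
# hypothetical name standing in for a hand-written fun.py.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "fun_demo.py"), "w") as handle:
    handle.write("from math import sqrt\n\ndef f(x):\n    return sqrt(x)\n")

# Make the directory importable and import the module by name.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
fun_demo = importlib.import_module("fun_demo")

# f now belongs to an importable module, so pickle can store a reference
# to it and resolve that reference again on load.
payload = pickle.dumps(fun_demo.f)
restored = pickle.loads(payload)
print(fun_demo.f.__module__)
print(restored(9))
```

The round trip here happens inside a single process, so it only shows that the reference is resolvable by name. A real worker process must also be able to import the module, which is the case when the file sits next to your script or elsewhere on the import path.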