I'm starting to experiment with the IPython parallel tools and have an issue. I start up my python engines with:
ipcluster start -n 3
Then the following code runs fine:
from IPython.parallel import Client

def dop(x):
    rc = Client()
    dview = rc[:]
    dview.block = True
    dview.execute('a = 5')
    dview['b'] = 10
    ack = dview.apply(lambda x: a+b+x, x)
    return ack

ack = dop(27)
print ack
This prints [42, 42, 42], as it should. But if I split the code into separate files:
dop.py:
from IPython.parallel import Client

def dop(x):
    rc = Client()
    dview = rc[:]
    dview.block = True
    dview.execute('a = 5')
    dview['b'] = 10
    print dview['a']
    ack = dview.apply(lambda x: a+b+x, x)
    return ack
and try the following:
from dop import dop
ack = dop(27)
print ack
I get errors from each engine:
[0:apply]: NameError: global name 'a' is not defined
[1:apply]: NameError: global name 'a' is not defined
[2:apply]: NameError: global name 'a' is not defined
I don't get it...why can't I put the function in a different file and import it?
Quick answer: decorate your function with @interactive from IPython.parallel.util [1] if you want it to have access to the engine's global namespace:
from IPython.parallel.util import interactive
f = interactive(lambda x: a+b+x)
ack = dview.apply(f, x)
The actual explanation:
The IPython user namespace is essentially the module __main__. This is where code is run when you do execute('a = 5').
If you define a function interactively, its module is also __main__:
lam = lambda x: a+b+x
lam.__module__
'__main__'
When the Engine unserializes a function, it does so in the appropriate global namespace for the function's module, so functions defined in __main__ in your client are also defined in __main__ on the Engine, and thus have access to a.
Once you put it in a file and import it, the functions are no longer attached to __main__, but to the module dop:
from dop import dop
dop.__module__
'dop'
All functions conventionally defined in that module (lambdas included) will have this value, so when they are unpacked on the Engine their global namespace will be that of the dop module, not __main__, and your 'a' is not accessible.
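The lookup rule described above can be imitated without a cluster. This is only a rough sketch, not IPython's actual code: unpack, add_all and the namespaces dict are hypothetical names, and the two dicts merely stand in for the engine's __main__ and dop module globals.

```python
import types

def unpack(func, namespaces):
    """Rebuild func against the globals chosen by its __module__ label."""
    target_globals = namespaces[func.__module__]
    return types.FunctionType(func.__code__, target_globals, func.__name__)

def add_all(x):
    return a + b + x  # 'a' and 'b' are looked up as globals at call time

# Hypothetical engine-side namespaces, keyed by module name.
namespaces = {
    "__main__": {"a": 5, "b": 10},  # what execute('a = 5') and dview['b'] = 10 fill in
    "dop": {},                      # an imported module that never defined a or b
}

add_all.__module__ = "__main__"     # as if typed interactively
result_main = unpack(add_all, namespaces)(27)

add_all.__module__ = "dop"          # as if imported from dop.py
try:
    unpack(add_all, namespaces)(27)
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

print(result_main, error_name)      # 42 NameError
```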
For this reason, IPython provides a simple @interactive decorator that results in any function being unpacked as if it were defined in __main__, regardless of where the function is actually defined.
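To make the idea concrete, here is a minimal sketch of a decorator with that effect, assuming that relabelling the function's __module__ is all that is needed. mark_interactive is a hypothetical stand-in, and the real @interactive may be implemented differently.

```python
def mark_interactive(func):
    """Hypothetical stand-in: relabel func as if it were defined in __main__."""
    func.__module__ = "__main__"
    return func

def add_a(x):
    return a + x  # 'a' is resolved in whatever globals the function is unpacked into

add_a.__module__ = "dop"          # pretend add_a was imported from dop.py
before = add_a.__module__

wrapped = mark_interactive(add_a)
after = wrapped.__module__

print(before, after)              # dop __main__
```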
For an example of the difference, take this dop.py:
from IPython.parallel import Client
from IPython.parallel.util import interactive

a = 1

def dop(x):
    rc = Client()
    dview = rc[:]
    dview['a'] = 5
    f = lambda x: a+x
    return dview.apply_sync(f, x)

def idop(x):
    rc = Client()
    dview = rc[:]
    dview['a'] = 5
    f = interactive(lambda x: a+x)
    return dview.apply_sync(f, x)
Now, dop will use 'a' from the dop module, and idop will use 'a' from your engine namespaces. The only difference between the two is that the function passed to apply is wrapped in @interactive:
from dop import dop, idop
print dop(5) # 6
print idop(5) # 10
[1]: In IPython >= 0.13 (upcoming release), @interactive is also available as from IPython.parallel import interactive, where it always should have been.