I want to do multiprocessing inside a class. It seems like only pathos.multiprocessing is able to help me. However, when I implement it, it can't load the modules I import in the main script.
from pathos.multiprocessing import ProcessingPool
import time
import sys
import datetime

class tester:
    def __init__(self):
        self.pool = ProcessingPool(2)

    def func(self, msg):
        print (str(datetime.datetime.now()))
        for i in xrange(1):
            print msg
            sys.stdout.flush()
            time.sleep(2)

    #----------------------------------------------------------------------
    def worker(self):
        """"""
        pool = self.pool
        for i in xrange(10):
            msg = "hello %d" % (i)
            pool.map(self.func, [i])
        pool.close()
        pool.join()
        time.sleep(40)

if __name__ == "__main__":
    print datetime.datetime.now()
    t = tester()
    t.worker()
    time.sleep(60)
    print "Sub-process(es) done."
The error is: NameError: global name 'datetime' is not defined. But it works in the main function! My system is Windows 7.
I'm the author of pathos. If you execute your code on non-Windows systems, it works fine -- even from the interpreter. (It also works from a file, as-is.)
>>> from pathos.multiprocessing import ProcessingPool
>>> import time
>>> import sys
>>> import datetime
>>> class tester:
...     def __init__(self):
...         self.pool = ProcessingPool(2)
...     def func(self, msg):
...         print (str(datetime.datetime.now()))
...         for i in xrange(1):
...             print msg
...             sys.stdout.flush()
...             time.sleep(2)
...     def worker(self):
...         """"""
...         pool = self.pool
...         for i in xrange(10):
...             msg = "hello %d" % (i)
...             pool.map(self.func, [i])
...         pool.close()
...         pool.join()
...         time.sleep(40)
... 
>>> datetime.datetime.now()
datetime.datetime(2015, 10, 21, 19, 24, 16, 131225)
>>> t = tester()
>>> t.worker()
2015-10-21 19:24:25.927781
0
2015-10-21 19:24:27.933611
1
2015-10-21 19:24:29.938630
2
2015-10-21 19:24:31.942376
3
2015-10-21 19:24:33.946052
4
2015-10-21 19:24:35.949965
5
2015-10-21 19:24:37.953877
6
2015-10-21 19:24:39.957770
7
2015-10-21 19:24:41.961704
8
2015-10-21 19:24:43.965193
9
>>>
The issue is that multiprocessing is fundamentally different on Windows: Windows doesn't have a true fork, and thus isn't as flexible as systems with a fork. multiprocessing has a forking pickler that, under the covers, spawns a subprocess, while non-Windows systems can utilize shared memory across the processes. dill has a check method and a copy method, each of which does a sequential loads(dumps(object)) on some object, where copy uses shared memory, while check uses a subprocess (as is done on Windows in multiprocessing). Here's the check method on a Mac, so apparently that's not the issue.
>>> import dill
>>> dill.check(t.func)
<bound method tester.func of <__main__.tester instance at 0x1051c7998>>
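As an aside, a minimal sketch of what that loads(dumps(object)) round-trip looks like (assuming dill is installed; dill.copy is essentially this same-process round-trip, while dill.check performs it in a subprocess, mimicking what Windows spawn does):

```python
# Minimal sketch of dill's serialization round-trip, assuming dill is
# installed.  dill.copy(obj) is essentially dill.loads(dill.dumps(obj)),
# a same-process round-trip; dill.check(obj) does the analogous
# round-trip in a subprocess, like multiprocessing does on Windows.
import dill

def make_adder(n):
    # a closure -- something stdlib pickle can't serialize, but dill can
    return lambda x: x + n

add5 = make_adder(5)
payload = dill.dumps(add5)      # serialize, as a child process would receive it
restored = dill.loads(payload)  # deserialize, as the child would at startup
print(restored(10))             # the restored closure still works
```

If an object survives dill.check, serialization across a spawned subprocess isn't the problem, which is what the transcript above demonstrates for the bound method.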
The other thing you need to do on Windows is to use freeze_support at the beginning of __main__ (i.e. as the first line of __main__). It's unnecessary on non-Windows systems, but pretty much necessary on Windows. Here's the doc.
>>> import pathos
>>> print pathos.multiprocessing.freeze_support.__doc__
Check whether this is a fake forked process in a frozen executable.
If so then run code specified by commandline and exit.
>>>
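For illustration, here's a sketch of that pattern in Python 3 syntax, using the stdlib multiprocessing module so it stands alone (with pathos you'd swap Pool for pathos.multiprocessing.ProcessingPool). The __main__ guard plus the freeze_support call is what Windows needs; module-level imports like datetime are re-executed in each spawned child, which is why they must live at the top of the file:

```python
# Sketch of the Windows-safe layout described above, using stdlib
# multiprocessing for self-containedness.  With pathos, replace Pool
# with pathos.multiprocessing.ProcessingPool.
import datetime
from multiprocessing import Pool, freeze_support

class Tester:
    def func(self, msg):
        # datetime is visible here because the module-level import is
        # re-executed when Windows spawns (rather than forks) the child.
        return "%s: hello %d" % (datetime.datetime.now(), msg)

def main():
    t = Tester()
    with Pool(2) as pool:
        results = pool.map(t.func, range(3))
    for line in results:
        print(line)

if __name__ == "__main__":
    freeze_support()  # first line of __main__; a no-op outside frozen Windows executables
    main()
```

Note this is just the structural convention; on non-Windows systems the script behaves identically with or without the freeze_support call.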