Is there a simple way to use Multiprocessing to do the equivalent of this?
    for sim in sim_list:
        sim.run()
where the elements of sim_list are "simulation" objects and run() is a method of the simulation class which does modify the attributes of the objects. E.g.:
    import subprocess

    class simulation:
        def __init__(self):
            # state must exist before a key can be set on it
            self.state = {'done': False}
            self.cmd = "program"

        def run(self):
            subprocess.call(self.cmd)
            self.state['done'] = True
All the sims in sim_list are independent, so the strategy does not have to be thread safe.
I tried the following, which is obviously flawed because each child process works on its own copy of sim, so the object in the parent is never modified in place.
    from multiprocessing import Process

    for sim in sim_list:
        b = Process(target=simulation.run, args=[sim])
        b.start()
        b.join()
One way to do what you want is to have your computing class (simulation in your case) be a subclass of Process. When initialized properly, instances of this class will run in separate processes and you can set off a group of them from a list just like you wanted.
Here's an example, building on what you wrote above:
    import multiprocessing
    import os
    import random
    import sys

    class simulation(multiprocessing.Process):
        def __init__(self, name):
            # must call this before anything else
            multiprocessing.Process.__init__(self)

            # then any other initialization
            self.name = name
            self.number = 0.0
            sys.stdout.write('[%s] created: %f\n' % (self.name, self.number))

        def run(self):
            sys.stdout.write('[%s] running ... process id: %s\n'
                             % (self.name, os.getpid()))

            self.number = random.uniform(0.0, 10.0)
            sys.stdout.write('[%s] completed: %f\n' % (self.name, self.number))
Then just make a list of objects and start each one with a loop:
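    if __name__ == '__main__':
        # the guard is needed where child processes are spawned (e.g. Windows)
        sim_list = [simulation('foo'), simulation('bar')]
        for sim in sim_list:
            sim.start()
        for sim in sim_list:
            sim.join()  # wait for every simulation to finish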
When you run this you should see each object run in its own process. Don't forget to call Process.__init__(self) as the very first thing in your class initialization, before anything else.
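Obviously I've not included any interprocess communication in this example; you'll have to add that if your situation requires it (it wasn't clear from your question whether you needed it or not). Keep in mind that run() executes in the child process, so attributes it sets (such as self.number) are not updated on the object in the parent.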
This approach works well for me, and I'm not aware of any drawbacks. If anyone knows of hidden dangers which I've overlooked, please let me know.
I hope this helps.
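If you do need results back in the parent, one option is to hand each process a multiprocessing.Queue. The following is only a minimal sketch under that assumption: the `results` attribute and the `run_all` helper are names made up for illustration, not part of the example above. The child puts its result on the queue because anything it assigns to self stays in the child.

```python
import multiprocessing
import random


class Simulation(multiprocessing.Process):
    """Process subclass that reports its result through a shared queue."""

    def __init__(self, name, results):
        # Must be called before anything else.
        multiprocessing.Process.__init__(self)
        self.name = name
        self.results = results  # hypothetical attribute: queue shared with the parent

    def run(self):
        # Executes in the child process; send the result instead of storing it on self.
        number = random.uniform(0.0, 10.0)
        self.results.put((self.name, number))


def run_all(names):
    """Start one process per name and collect {name: number} in the parent."""
    results = multiprocessing.Queue()
    sims = [Simulation(name, results) for name in names]
    for sim in sims:
        sim.start()
    # Read one result per process before joining, so no child is left
    # waiting to flush its queue data.
    collected = dict(results.get() for _ in sims)
    for sim in sims:
        sim.join()
    return collected


if __name__ == '__main__':
    print(run_all(['foo', 'bar']))
```

The parent reads exactly one item per started process and only then joins, which avoids waiting on a child that still has data buffered for the queue.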
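For those who will be working with large data sets, a Pool with imap, which takes an iterable and yields results lazily, would be your solution here. The mapped function must be a top-level function that takes one sim, and each sim must be picklable (a plain class like the one in the question). The workers operate on copies, so the modified copies are returned as the results:

    import multiprocessing as mp

    def run_sim(sim):
        sim.run()
        return sim  # send the modified copy back to the parent

    with mp.Pool(mp.cpu_count()) as pool:
        sim_list = list(pool.imap(run_sim, sim_list))

On platforms that spawn worker processes (e.g. Windows), put the pool code under an if __name__ == '__main__': guard.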
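Here is a fuller, runnable sketch of the pool approach, in case it helps. It assumes a plain picklable class modeled on the one in the question; `run_sim`, `run_pool`, and the placeholder command (which just starts the current Python interpreter and exits) are illustrative names and choices, not anything from the original posts.

```python
import multiprocessing as mp
import subprocess
import sys


class Simulation:
    """Plain picklable stand-in for the simulation class in the question."""

    def __init__(self, cmd):
        self.state = {'done': False}
        self.cmd = cmd

    def run(self):
        subprocess.call(self.cmd)
        self.state['done'] = True


def run_sim(sim):
    """Run one simulation in a worker and return the modified copy."""
    sim.run()
    return sim


def run_pool(sims, workers=None):
    """Run all sims in a pool; imap yields the results in input order."""
    with mp.Pool(workers) as pool:
        return list(pool.imap(run_sim, sims))


if __name__ == '__main__':
    # Placeholder command: launch the current interpreter and exit at once.
    cmd = [sys.executable, '-c', 'pass']
    sim_list = run_pool([Simulation(cmd) for _ in range(4)])
    print(all(sim.state['done'] for sim in sim_list))
```

The objects originally passed in are left untouched in the parent; only the copies returned by the workers carry the updated state, which is why the result of `run_pool` replaces the old list.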