This is a newbie question:
A class is an object, so I can create a class called pippo and add functions and parameters inside it. What I don't understand is whether the functions inside pippo are executed from top to bottom when I assign x = pippo(), or whether I must call them as x.dosomething() from outside pippo.
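A minimal sketch of what happens on instantiation, assuming a hypothetical class Pippo with a single method dosomething (both names are only for illustration). Creating the object runs __init__ and nothing else; the other methods run only when they are called explicitly. The class body itself is executed top to bottom once, when the class is defined, but that only defines the functions; it does not call them.

```python
class Pippo:
    """Hypothetical class used only for illustration."""

    def __init__(self):
        # __init__ is the only method Python calls automatically on Pippo().
        self.calls = ["__init__"]

    def dosomething(self):
        # Runs only when called explicitly, e.g. x.dosomething().
        self.calls.append("dosomething")
        return "did something"


x = Pippo()
calls_after_creation = list(x.calls)   # ['__init__']
result = x.dosomething()
calls_after_method = list(x.calls)     # ['__init__', 'dosomething']
```

If some work should happen automatically when the object is created, one option is to call the method from inside __init__.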
Working with Python's multiprocessing package, is it better to define a big function and create the object using the target argument in the call to Process(), or to create your own process class by inheriting from the Process class?
I have often wondered why the examples on Python's doc page on multiprocessing only show the "functional" approach (using the target parameter). Probably because terse, succinct code snippets are best for illustration purposes. For small tasks that fit in one function, I can see how that is the preferred way, as in:
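from multiprocessing import Process

def f():
    print('hello')

if __name__ == '__main__':
    # The guard stops child processes from re-running this block
    # when the module is re-imported (spawn start method).
    p = Process(target=f)
    p.start()
    p.join()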
But when you need greater code organization (for complex tasks), making your own class is the way to go:
from multiprocessing import Process

class P(Process):
    def __init__(self):
        super(P, self).__init__()

    def run(self):
        # Invoked in the child process after start() is called.
        print('hello')

if __name__ == '__main__':
    p = P()
    p.start()
    p.join()
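Bear in mind that each spawned process is initialized with a copy of the memory of the master process. (Strictly, this describes the fork start method; with spawn, the child starts a fresh interpreter and receives a pickled copy of the process object instead.) Also, the constructor code (i.e. everything inside __init__()) is executed in the master process; only the code inside run() executes in the separate process.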
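One way to check where each method runs is to record os.getpid() in both places. This is only a sketch: the class name PidReporter and its attributes are hypothetical, and a multiprocessing.Queue is used simply to send the child's PID back to the parent.

```python
import os
from multiprocessing import Process, Queue


class PidReporter(Process):
    """Hypothetical subclass that records where each method runs."""

    def __init__(self, results):
        super().__init__()
        self.results = results
        self.init_pid = os.getpid()    # evaluated in the parent process

    def run(self):
        # Evaluated in the child process after start().
        self.results.put(os.getpid())


def main():
    results = Queue()
    p = PidReporter(results)
    p.start()
    run_pid = results.get(timeout=10)  # PID reported from inside run()
    p.join()
    return os.getpid(), p.init_pid, run_pid


if __name__ == "__main__":
    parent_pid, init_pid, run_pid = main()
    print(init_pid == parent_pid)  # True: __init__ ran in the parent
    print(run_pid != parent_pid)   # True: run() ran in another process
```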
Therefore, if a process (master or spawned) changes one of its member variables, the change will not be reflected in other processes. This, of course, is only true for built-in types, like bool, str, list, etc. You can, however, import "special" data structures from the multiprocessing module which are then transparently shared between processes (see "Sharing state between processes" in the documentation). Or you can create your own channels of IPC (inter-process communication) with multiprocessing.Pipe and multiprocessing.Queue.
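A small sketch of the difference, assuming a hypothetical Counter class: a plain attribute changed in run() stays at its old value in the parent, while a multiprocessing.Value is shared and a multiprocessing.Queue carries a result back explicitly.

```python
from multiprocessing import Process, Queue, Value


class Counter(Process):
    """Hypothetical subclass comparing plain and shared state."""

    def __init__(self, shared, queue):
        super().__init__()
        self.plain = 0        # ordinary attribute: each process has its own copy
        self.shared = shared  # multiprocessing.Value: backed by shared memory
        self.queue = queue

    def run(self):
        self.plain += 1               # changes the child's copy only
        with self.shared.get_lock():
            self.shared.value += 1    # visible to the parent
        self.queue.put(self.plain)    # send the child's value back explicitly


def main():
    shared = Value("i", 0)            # "i" is the typecode for a C int
    queue = Queue()
    p = Counter(shared, queue)
    p.start()
    child_plain = queue.get(timeout=10)
    p.join()
    return p.plain, shared.value, child_plain


if __name__ == "__main__":
    print(main())  # (0, 1, 1)
```

The parent still sees plain == 0 because the increment happened in the child's copy; the shared Value and the Queue are the two ways the change gets back.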