I am new to multiprocessing concepts in Python and I have a problem accessing variables when I try to include multiprocessing in my code. Sorry if I am sounding naive, but I just can't figure it out. Below is a simplified version of my scenario.
class Data:
    def __init__(self):
        self.data = "data"
    def datameth(self):
        print self.data
        print mainvar
class First:
    def __init__(self):
        self.first = "first"
    def firstmeth(self):
        d = Data()
        d.datameth()
        print self.first
def mymethod():
    f = First()
    f.firstmeth()
if __name__ == '__main__':
    mainvar = "mainvar"
    mymethod()
When I run this, it runs fine and gives the output:
data
mainvar
first
But when I try to run mymethod() as a process:
from multiprocessing import Process
class Data:
    def __init__(self):
        self.data = "data"
    def datameth(self):
        print self.data
        print mainvar
class First:
    def __init__(self):
        self.first = "first"
    def firstmeth(self):
        d = Data()
        #print mainvar
        d.datameth()
        print self.first
def mymethod():
    f = First()
    f.firstmeth()
if __name__ == '__main__':
    mainvar = "mainvar"
    #mymethod()
    p = Process(target = mymethod)
    p.start()
I get an error like this:
NameError: global name 'mainvar' is not defined
The point is, I am not able to access mainvar from inside the First class or the Data class.
What am I missing here?
Edit:
Actually, in my real scenario, mainvar is not just a declared literal; it is the return value of a method after some processing.
if __name__ == '__main__':
    ***some other stuff***
    mainvar = ***return value of some method**
    p = Process(target = mymethod)
    p.start()
Edit 2:
As @dciriello mentioned in the comments, it works fine on Linux but not on Windows :(
This is a limitation of Windows, because it doesn't support fork. When a child process is forked on Linux, it gets a copy-on-write replica of the parent process's state, so the mainvar you defined inside if __name__ == "__main__": will be there. However, on Windows, the child process's state is created by re-importing the __main__ module of the program. This means that mainvar doesn't exist in the children, because it's only created inside the if __name__ == "__main__": guard. So, if you need to access mainvar inside a child process, your only option is to explicitly pass it to the child as an argument to mymethod in the Process constructor (mymethod must then accept it as a parameter):
mainvar = "whatever"
p = Process(target=mymethod, args=(mainvar,))
This best practice is mentioned in the multiprocessing docs:
Explicitly pass resources to child processes
On Unix a child process can make use of a shared resource created in a
parent process using a global resource. However, it is better to pass
the object as an argument to the constructor for the child process.
Apart from making the code (potentially) compatible with Windows this
also ensures that as long as the child process is still alive the
object will not be garbage collected in the parent process.
Notice the remark about Windows compatibility. Though it's not quite spelled out, the reason explicit passing helps with Windows compatibility is that it avoids the exact issue you're seeing.
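As a minimal sketch of what explicit passing could look like for the code in the question: this is written for Python 3 (so print is a function), and threading mainvar through the constructors of First and Data is only one assumed way to restructure it. The methods return their lines instead of printing them so the logic can be checked without starting a process.

```python
from multiprocessing import Process


class Data:
    def __init__(self, mainvar):
        self.data = "data"
        # Keep the value on the instance instead of reading a global.
        self.mainvar = mainvar

    def datameth(self):
        return [self.data, self.mainvar]


class First:
    def __init__(self, mainvar):
        self.first = "first"
        self.mainvar = mainvar

    def firstmeth(self):
        d = Data(self.mainvar)
        return d.datameth() + [self.first]


def mymethod(mainvar):
    # mainvar arrives as an argument, so it exists in the child
    # regardless of how the child process was started.
    for line in First(mainvar).firstmeth():
        print(line)


if __name__ == "__main__":
    mainvar = "mainvar"  # could equally be the result of some computation
    p = Process(target=mymethod, args=(mainvar,))
    p.start()
    p.join()
```

Whatever is passed in args has to be picklable when the child is not forked, which is true for strings, numbers, and most plain containers.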
This is also covered in the section of the docs that talks specifically about Windows limitations caused by the lack of fork:
Global variables
Bear in mind that if code run in a child process tries to access a
global variable, then the value it sees (if any) may not be the same
as the value in the parent process at the time that Process.start was called.
However, global variables which are just module level constants cause
no problems.
Note the "if any". Because your global variable is declared inside the if __name__ == "__main__": guard, it doesn't even show up in the child.
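The difference between forking and re-importing can be reproduced on one machine by choosing the start method explicitly. The sketch below assumes Python 3 and that it is saved and run as a script file; the function name report is made up for illustration. With the fork method (where available) the child should report that mainvar is defined, and with spawn, which is what Windows uses, it should report that it is not.

```python
import multiprocessing as mp


def report(method):
    # Runs in the child: check whether the name exists in this module's globals.
    print(method, "-> mainvar defined in child:", "mainvar" in globals())


if __name__ == "__main__":
    mainvar = "mainvar"  # only ever assigned in the parent, inside the guard

    # "fork" is not available on Windows, so only use the methods that exist here.
    available = mp.get_all_start_methods()
    for method in ("fork", "spawn"):
        if method not in available:
            continue
        ctx = mp.get_context(method)
        p = ctx.Process(target=report, args=(method,))
        p.start()
        p.join()
```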
Operating systems don't allow processes to share variables easily. If they did, each process could steal data from any other process, which you never want (think of entering your credit card details in a web browser).
So when you use the multiprocessing module, you have to use special facilities such as Value and Array to share variables (a.k.a. "state") between the individual processes. See the documentation for details.
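A small sketch of how Value and Array might be used, assuming Python 3; the names worker, counter, and samples are invented for this example. The shared objects are created in the parent and handed to the child as arguments, and the child's changes are visible in the parent afterwards.

```python
from multiprocessing import Process, Value, Array


def worker(counter, samples):
    # Value and Array live in shared memory, so these writes are
    # seen by the parent as well.
    with counter.get_lock():
        counter.value += 1
    for i in range(len(samples)):
        samples[i] = samples[i] * 2


if __name__ == "__main__":
    counter = Value("i", 0)                 # shared C int
    samples = Array("d", [1.0, 2.0, 3.0])   # shared array of C doubles
    p = Process(target=worker, args=(counter, samples))
    p.start()
    p.join()
    print(counter.value)
    print(samples[:])
```

These types only hold simple C-style data (the type codes are those of the array module), so a plain read-only string like mainvar is usually easier to pass as an ordinary argument.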
You are defining mainvar in the wrong place. Define it at module level instead, as below:
from multiprocessing import Process
mainvar = "mainvar"
class Data:
    def __init__(self):
        self.data = "data"
    def datameth(self):
        print self.data
        print mainvar
class First:
    def __init__(self):
        self.first = "first"
    def firstmeth(self):
        d = Data()
        #print mainvar
        d.datameth()
        print self.first
def mymethod():
    f = First()
    f.firstmeth()
if __name__ == '__main__':
    #mainvar = "mainvar"
    #mymethod()
    p = Process(target = mymethod)
    p.start()