I would like to be able to pickle a function or class from within __main__, with the obvious problem (mentioned in other posts) that the pickled function/class is in the __main__ namespace and unpickling in another script/module will fail.
I have the following solution, which works. Is there a reason this should not be done?
The following is in myscript.py:
    import myscript
    import pickle

    if __name__ == "__main__":
        # Pickle via the imported module so the class is recorded as myscript.myclass
        print(pickle.dumps(myscript.myclass()))
    else:
        class myclass:
            pass
Edit: The unpickling would be done in a script/module that has access to myscript.py and can do an import myscript. The aim is to use a solution like parallel python to call functions remotely, and be able to write a short, standalone script that contains the functions/classes that can be accessed remotely.
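To make the underlying problem concrete, here is a minimal sketch (not part of the original script) that shows what pickle actually records for an instance: the name of the defining module and the class name, not the class body. The class name Example is a hypothetical stand-in.

```python
import pickle


class Example:
    """Hypothetical stand-in for a class defined in a script."""
    pass


payload = pickle.dumps(Example())

# The pickle stores a reference (module name + class name), not the class code.
module_name = Example.__module__.encode()
stores_module_name = module_name in payload
stores_class_name = b"Example" in payload

print(module_name, stores_module_name, stores_class_name)
```

When the class lives in a script run directly, the recorded module name is __main__, which is why another program cannot resolve it on load.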
You can get a better handle on global objects by importing __main__ and using the methods available in that module. This is what dill does in order to serialize almost anything in Python. Basically, when dill serializes an interactively defined function, it uses some name mangling on __main__ on both the serialization and deserialization side that makes __main__ a valid module.
Actually, dill registers its types into the pickle registry, so if you have some black-box code that uses pickle and you can't really edit it, then just importing dill can magically make it work without monkeypatching the third-party code.
Or, if you want the whole interpreter session sent over as a "python image", dill can do that too.
You can easily send the image across ssh to another computer, and start where you left off there as long as there's version compatibility of pickle and the usual caveats about python changing and things being installed.
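The idea of registering extra types so that unmodified code calling pickle can handle them can be illustrated with the standard library alone. The sketch below is not dill's implementation; it uses copyreg to register a reducer for code objects (which plain pickle rejects) and rebuilds them with marshal. The names _reduce_code, black_box_roundtrip, and add_one are hypothetical, and marshal data is only valid between matching Python versions.

```python
import copyreg
import marshal
import pickle
import types


def _reduce_code(code):
    # Tell pickle to rebuild a code object by calling marshal.loads(<bytes>).
    # marshal output depends on the Python version, so both sides must match.
    return marshal.loads, (marshal.dumps(code),)


# Register the reducer globally; later pickle.dumps calls will consult it.
copyreg.pickle(types.CodeType, _reduce_code)


def black_box_roundtrip(obj):
    # Stand-in for third-party code that only knows about plain pickle.
    return pickle.loads(pickle.dumps(obj))


def add_one(x):
    return x + 1


# Ship only the code object, then rebuild a function around it on "the other side".
restored_code = black_box_roundtrip(add_one.__code__)
rebuilt = types.FunctionType(restored_code, {}, "add_one")
result = rebuilt(41)
print(result)
```

Serializing the code object by value, rather than by module and name, is what lets the function survive without an importable defining module; a real library also has to carry globals, closures, and defaults, which this sketch ignores.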
I actually use dill to serialize objects and send them across parallel resources with parallel python, multiprocessing, and mpi4py. I roll these up conveniently into the pathos package (and pyina for MPI), which provides a uniform map interface for different parallel batch processing backends.
There are also non-blocking and iterative maps as well as non-parallel pipe connections. I also have a pathos module for pp; however, it is somewhat unstable for functions defined in __main__. I'm working on improving that. If you like, fork the code on github and help make pp better for functions defined in __main__.
The reason pp doesn't pickle well is that pp does its serialization tricks through using temporary file objects and reading the interpreter session's history... so it doesn't serialize objects in the same way that multiprocessing or mpi4py do. I have a dill module, dill.source,
that seamlessly does the same type of pickling that pp uses, but it's rather new.
If you are trying to pickle something so that you can use it somewhere else, separate from test_script, that's not going to work, because pickle (apparently) just tries to load the function from the module. Here's an example:
test_script.py
picklescript.py
If you run python picklescript.py, then change the filename of test_script, when you try to load the function, it will fail. For example, running this:
Will give you the following traceback:
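The same failure can be reproduced in a single file. The following is a stand-in sketch rather than the listings from this answer: it builds a throwaway module in memory (the name test_script_demo and the function greet are hypothetical), pickles a function from it, then removes the module to mimic renaming the file.

```python
import pickle
import sys
import types

# Build an in-memory module that plays the role of test_script.py.
module = types.ModuleType("test_script_demo")
exec("def greet():\n    return 'hello'\n", module.__dict__)
sys.modules["test_script_demo"] = module

# pickle records only "test_script_demo" + "greet", not the function body.
payload = pickle.dumps(module.greet)

# While the module is importable, loading works.
loaded_ok = pickle.loads(payload)() == "hello"

# Mimic renaming/removing the file: the module can no longer be imported.
del sys.modules["test_script_demo"]

error = None
try:
    pickle.loads(payload)
except ImportError as exc:
    # pickle tries to import the recorded module name and fails.
    error = exc

print(loaded_ok, type(error).__name__)
```

Because the payload only names the module, the loading side must be able to import a module with exactly that name that exposes the same attribute.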
Pickle seems to look at the main scope for definitions of classes and functions. From inside the module you're unpickling from, try this: