I have to pickle an array of objects like this:
import cPickle as pickle
from numpy import sin, cos, array
tmp = lambda x: sin(x)+cos(x)
test = array([[tmp,tmp],[tmp,tmp]],dtype=object)
pickle.dump( test, open('test.lambda','w') )
and it gives the following error:
TypeError: can't pickle function objects
Is there a way around that?
The built-in pickle module is unable to serialize several kinds of python objects (including lambda functions, nested functions, and functions defined at the command line).
The picloud package includes a more robust pickler, that can pickle lambda functions.
from pickle import dumps
f = lambda x: x * 5
dumps(f) # error
from cloud.serialization.cloudpickle import dumps
dumps(f) # works
PiCloud-serialized objects can be de-serialized using the normal pickle/cPickle load
and loads
functions.
Dill also provides similar functionality
>>> import dill
>>> f = lambda x: x * 5
>>> dill.dumps(f)
'\x80\x02cdill.dill\n_create_function\nq\x00(cdill.dill\n_unmarshal\nq\x01Uec\x01\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00C\x00\x00\x00s\x08\x00\x00\x00|\x00\x00d\x01\x00\x14S(\x02\x00\x00\x00Ni\x05\x00\x00\x00(\x00\x00\x00\x00(\x01\x00\x00\x00t\x01\x00\x00\x00x(\x00\x00\x00\x00(\x00\x00\x00\x00s\x07\x00\x00\x00<stdin>t\x08\x00\x00\x00<lambda>\x01\x00\x00\x00s\x00\x00\x00\x00q\x02\x85q\x03Rq\x04c__builtin__\n__main__\nU\x08<lambda>q\x05NN}q\x06tq\x07Rq\x08.'
You'll have to use an actual function instead, one that is importable (not nested inside another function):
import cPickle as pickle
from numpy import sin, cos, array
def tmp(x):
return sin(x)+cos(x)
test = array([[tmp,tmp],[tmp,tmp]],dtype=object)
pickle.dump( test, open('test.lambda','w') )
The function object could still be produced by a lambda
expression, but only if you subsequently give the resulting function object the same name:
tmp = lambda x: sin(x)+cos(x)
tmp.__name__ = 'tmp'
test = array([[tmp, tmp], [tmp, tmp]], dtype=object)
because pickle
stores only the module and name for a function object; in the above example, tmp.__module__
and tmp.__name__
now point right back at the location where the same object can be found again when unpickling.
There is another solution: define you functions as strings, pickle/un-pickle then use eval, ex:
import cPickle as pickle
from numpy import sin, cos, array
tmp = "lambda x: sin(x)+cos(x)"
test = array([[tmp,tmp],[tmp,tmp]],dtype=object)
pickle.dump( test, open('test.lambda','w') )
mytmp = array([[eval(x) for x in l] for l in pickle.load(open('test.lambda','r'))])
print mytmp
# yields : [[<function <lambda> at 0x00000000033D4DD8>
# <function <lambda> at 0x00000000033D4E48>]
# [<function <lambda> at 0x00000000033D4EB8>
# <function <lambda> at 0x00000000033D4F28>]]
This could be more convenient for other solutions because the pickled representation would be completely self contained without having to use external dependencies.