I've made a pickle file using the following code:
from PIL import Image
import pickle
import os
import numpy
import time
trainpixels = numpy.empty([80000,6400])
trainlabels = numpy.empty(80000)
validpixels = numpy.empty([10000,6400])
validlabels = numpy.empty(10000)
testpixels = numpy.empty([10408,6400])
testlabels = numpy.empty(10408)
i=0
tr=0
va=0
te=0
for (root, dirs, filenames) in os.walk(indir1):
    print 'hello'
    for f in filenames:
        try:
            im = Image.open(os.path.join(root,f))
            Imv=im.load()
            x,y=im.size
            pixelv = numpy.empty(6400)
            ind=0
            for ii in range(x):
                for j in range(y):
                    temp=float(Imv[j,ii])
                    temp=float(temp/255.0)
                    pixelv[ind]=temp
                    ind+=1
            if i<40000:
                trainpixels[tr]=pixelv
                tr+=1
            elif i<45000:
                validpixels[va]=pixelv
                va+=1
            else:
                testpixels[te]=pixelv
                te+=1
            print str(i)+'\t'+str(f)
            i+=1
        except IOError:
            continue
trainimage=(trainpixels,trainlabels)
validimage=(validpixels,validlabels)
testimage=(testpixels,testlabels)
output=open('data.pkl','wb')
pickle.dump(trainimage,output)
pickle.dump(validimage,output)
pickle.dump(testimage,output)
Now I'm unpickling it with the load_data() function from http://www.deeplearning.net/tutorial/code/logistic_sgd.py, which is called when running http://www.deeplearning.net/tutorial/code/rbm.py,
but it raises the following error:
cPickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.
It seems the data structure doesn't match what load_data() expects, but I can't figure out what it should be.
For reference, the pickle file is over 16 GB, and its gzipped version is over 1 GB.
Pickling and unpickling have to mirror each other. Here you don't unpickle the same way you pickle, so it cannot work. In your code you pickle three objects one after the other into the same file. If you want to read them back, you have to read them sequentially: open the file for unpickling, then call pickle.load once for each object, in the order they were dumped.
You might also want to try simpler code where train_set, valid_set, test_set are simple picklable objects (doing the pickling and unpickling with gzip), just to be sure.
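Here is a minimal sketch of both approaches, using tiny placeholder arrays instead of your real data. The file names, the temporary directory and the array shapes are made up for illustration. The second variant assumes that load_data() opens the file with gzip and does a single load that it unpacks into three values, which is how I remember the tutorial code; check it against your copy of logistic_sgd.py.

```python
import gzip
import os
import pickle
import tempfile

import numpy

# Tiny stand-ins for the real (pixels, labels) pairs; shapes are illustrative only.
train_set = (numpy.zeros((4, 6)), numpy.zeros(4))
valid_set = (numpy.zeros((2, 6)), numpy.zeros(2))
test_set = (numpy.zeros((3, 6)), numpy.zeros(3))

tmpdir = tempfile.mkdtemp()  # hypothetical location, just for this example

# Variant 1: three dumps into one file, read back with three loads
# in the same order they were written.
seq_path = os.path.join(tmpdir, 'data_seq.pkl')
with open(seq_path, 'wb') as f:
    for obj in (train_set, valid_set, test_set):
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(seq_path, 'rb') as f:
    train_a = pickle.load(f)
    valid_a = pickle.load(f)
    test_a = pickle.load(f)

# Variant 2: one dump of a single 3-tuple into a gzip file, read back
# with one load (the layout a single-load reader would expect).
gz_path = os.path.join(tmpdir, 'data.pkl.gz')
with gzip.open(gz_path, 'wb') as f:
    pickle.dump((train_set, valid_set, test_set), f,
                protocol=pickle.HIGHEST_PROTOCOL)

with gzip.open(gz_path, 'rb') as f:
    train_b, valid_b, test_b = pickle.load(f)
```

Using a binary protocol (HIGHEST_PROTOCOL here) and closing the file, which the with blocks do for you, should also help keep a file of this size manageable. Both are my suggestions, not something the tutorial requires.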