After trying the Keras example for CIFAR10, I decided to go for something bigger: a VGG-like net on the Tiny ImageNet dataset. This is a subset of ImageNet with 200 classes (instead of 1000) and 100K images, downscaled to 64x64.
I took the VGG-like model from the file vgg_like_convnet.py here. Unfortunately, things go very much like here, except that this time changing the learning rate or swapping TH for TF doesn't help. Neither does changing the optimizer (see the code below).
Accuracy is basically stuck at 0.005 which, as someone pointed out, is exactly what you'd expect from completely random answers with 200 classes (1/200 = 0.005). Worse, if by a fluke of the weight init it starts at, say, 0.007, it quickly converges to 0.005 and firmly stays there for every subsequent epoch.
The Keras code (TH version) is shown below:
from __future__ import print_function
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.regularizers import l2, activity_l2, l1, activity_l1
from keras.optimizers import SGD, Adam, Adagrad, Adadelta
from keras.utils import np_utils
import numpy as np
import cPickle as pickle
# seed = 7
# np.random.seed(seed)
batch_size = 64
nb_classes = 200
nb_epoch = 30
# input image dimensions
img_rows, img_cols = 64, 64
# the tiny image net images are RGB
img_channels = 3
# Load the train dataset for TH
print('Load training data')
X_train=pickle.load(open('xtrain_shu_th.p','rb')) # np.zeros((100000,3,64,64)).astype('uint8')
y_train=pickle.load(open('ytrain_shu_th.p','rb')) # np.zeros((100000,1)).astype('uint8')
# Load the test dataset for TH
print('Load validation data')
X_test=pickle.load(open('xtest_th.p','rb')) # np.zeros((10000,3,64,64)).astype('uint8')
y_test=pickle.load(open('ytest_th.p','rb')) # np.zeros((10000,1)).astype('uint8')
# the data, shuffled and split between train and test sets
# (X_train, y_train), (X_test, y_test) = cifar10.load_data()
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
# Block 1: two 64-filter 3x3 convs, then 2x2 max-pool (64x64 -> 32x32)
model.add(ZeroPadding2D((1,1),input_shape=(3,64,64)))
model.add(Convolution2D(64, 3, 3, activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(64, 3, 3, activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 2: two 128-filter convs (32x32 -> 16x16)
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_6'].values()))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_8'].values()))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 3: three 256-filter convs (16x16 -> 8x8)
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_11'].values()))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_13'].values()))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_15'].values()))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 4: three 512-filter convs (8x8 -> 4x4)
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_18'].values()))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_20'].values()))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))#,weights=pretrained_weights['layer_22'].values()))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 5: three more 512-filter convs (4x4 -> 2x2)
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Classifier: two 4096-unit FC layers with dropout, then 200-way softmax
model.add(Flatten())
model.add(Dense(4096))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(4096))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(200, activation='softmax'))
# let's train the model using SGD + momentum (how original).
opt = SGD(lr=0.0001, decay=1e-6, momentum=0.7, nesterov=True)
# opt= Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
# opt = Adadelta(lr=1.0, rho=0.95, epsilon=1e-08, decay=0.0)
# opt = Adagrad(lr=0.01, epsilon=1e-08, decay=0.0)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('Optimization....')
model.fit(X_train, Y_train,
          batch_size=batch_size,
          nb_epoch=nb_epoch,
          validation_data=(X_test, Y_test),
          shuffle=True)
# Save the resulting model
model.save('model.h5')
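For reference, calling model.summary() right after compiling prints each layer's output shape and parameter count; five 2x2 poolings bring the 64x64 input down to a (512, 2, 2) volume, i.e. 2048 features at the Flatten layer:

model.summary()  # the Flatten layer should report output shape (None, 2048)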
The Tiny ImageNet dataset consists of JPEG images that I converted to PPM with djpeg. I then created a single large binary file containing, for each image, the class label (1 byte) followed by the raw pixel data (64x64x3 bytes).
Reading this file from Keras was desperately slow. So (I'm quite new to Python, this may sound stupid to you) I decided to initialize a 4D numpy array of shape (100000,3,64,64) (for TH; (100000,64,64,3) for TF) with the dataset and pickle it. Loading the dataset into the array now takes only ~40s when I run the code above.
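For completeness, the conversion from the packed binary file to the pickled 4D array can be done in one pass with numpy. This is a rough sketch of that step; the file name train.bin is just a placeholder, and the transpose assumes the pixel bytes are stored PPM-style (RGB-interleaved, row by row):

import numpy as np
import cPickle as pickle

n = 100000
rec = 1 + 64*64*3                 # 1 label byte + 64x64 RGB pixel bytes per image
raw = np.fromfile('train.bin', dtype=np.uint8).reshape(n, rec)
y_train = raw[:, :1].copy()       # (100000, 1) labels
# RGB-interleaved rows -> Theano's (channel, row, col) layout
X_train = raw[:, 1:].reshape(n, 64, 64, 3).transpose(0, 3, 1, 2).copy()
pickle.dump(X_train, open('xtrain_th.p', 'wb'))
pickle.dump(y_train, open('ytrain_th.p', 'wb'))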
I even checked that the pickled array contains the data in the right order, with the code below:
import numpy as np
import cPickle as pickle
print("Reading data")
pix=pickle.load(open('xtrain_th.p','rb'))
print("Done")
img=67857
f=open('img'+str(img)+'.ppm','wb')
f.write('P6\n64 64\n255\n')
for y in range(0,64):
    for x in range(0,64):
        f.write(chr(pix[img][0][y][x]))
        f.write(chr(pix[img][1][y][x]))
        f.write(chr(pix[img][2][y][x]))
f.close()
This extracts a PPM image back out of the dataset.
Finally, I noticed that the training dataset was sorted by class (i.e. the first 500 images all belong to class 0, the next 500 to class 1, and so on).
So I shuffled it with the code below:
# Dataset preparation for Theano backend
import cPickle as pickle
import numpy as np
import random as rnd
n=100000
print('Load training data')
X_train=pickle.load(open('xtrain_th.p','rb')) # np.zeros((100000,3,64,64)).astype('uint8')
y_train=pickle.load(open('ytrain_th.p','rb')) # np.zeros((100000,1)).astype('uint8')
# Shuffle the data by swapping random pairs of (image, label) entries
print('Shuffling training data')
for _ in range(0,n):
    i=rnd.randrange(n)
    j=rnd.randrange(n)
    # .copy() is essential: a bare X_train[i] is only a view, and the
    # swap below would otherwise overwrite image i with a duplicate of j
    tmpa=X_train[i].copy()
    X_train[i]=X_train[j]
    X_train[j]=tmpa
    tmp=y_train[i][0]
    y_train[i][0]=y_train[j][0]
    y_train[j][0]=tmp
print('Pickle dump')
pickle.dump(X_train,open('xtrain_shu_th.p','wb'))
pickle.dump(y_train,open('ytrain_shu_th.p','wb'))
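(In hindsight, a single random permutation applied to both arrays does the same job in three lines; a minimal equivalent, assuming X_train and y_train are loaded as above:)

perm = np.random.permutation(n)   # one random ordering shared by images and labels
X_train = X_train[perm]
y_train = y_train[perm]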
None of this helped. I wasn't expecting 99% accuracy on the first attempt, but at least some movement, and then a plateau.
I wanted to try TFLearn, but it had a pending bug when I checked a few days ago.
Any ideas? Thanks in advance.