I have written Python code that programmatically generates the training and validation .prototxt files for a convolutional neural network (CNN) in Caffe. Below is my function:
import caffe
from caffe import layers as L, params as P

def custom_net(lmdb, batch_size):
    # define your own net!
    n = caffe.NetSpec()
    # keep this data layer for all networks
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             ntop=2, transform_param=dict(scale=1. / 255))
    n.conv1 = L.Convolution(n.data, kernel_size=6,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv3 = L.Convolution(n.pool2, kernel_size=4,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool3 = L.Pooling(n.conv3, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv4 = L.Convolution(n.pool3, kernel_size=2,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool4 = L.Pooling(n.conv4, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.fc1 = L.InnerProduct(n.pool4, num_output=50,
                           weight_filler=dict(type='xavier'))
    n.drop1 = L.Dropout(n.fc1, dropout_param=dict(dropout_ratio=0.5))
    n.score = L.InnerProduct(n.drop1, num_output=2,
                             weight_filler=dict(type='xavier'))
    # keep this loss layer for all networks
    n.loss = L.SoftmaxWithLoss(n.score, n.label)
    return n.to_proto()

with open('net_train.prototxt', 'w') as f:
    f.write(str(custom_net(train_lmdb_path, train_batch_size)))
with open('net_test.prototxt', 'w') as f:
    f.write(str(custom_net(test_lmdb_path, test_batch_size)))
Is there a way to similarly generate a deploy.prototxt for testing on unseen data that is not in an lmdb file? If so, I would really appreciate it if someone could point me to a reference.
Quite simply:
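A minimal sketch of what such a function could look like, using the layers from the question. Several things here are assumptions rather than anything confirmed above: the `deploy_shape` parameter and its default of 1 x 3 x 224 x 224 are hypothetical and need to match your data, the prediction layer is taken to be a plain `Softmax` named `prob`, and the installed Caffe is assumed to be recent enough to provide the `Input` layer.

```python
def custom_net(lmdb, batch_size, deploy_shape=(1, 3, 224, 224)):
    # pycaffe is imported inside the function so that this module can be
    # imported on a machine without Caffe installed.
    import caffe
    from caffe import layers as L, params as P

    n = caffe.NetSpec()
    deploy = lmdb is None  # no lmdb given -> build the "deploy" flavor

    if deploy:
        # Declare the input blob only; there is no label at deploy time.
        # deploy_shape is an assumed N x C x H x W shape, adjust to your data.
        # Unlike the Data layer below, no 1/255 scaling is applied here, so
        # the caller has to scale the inputs before feeding them.
        n.data = L.Input(input_param=dict(shape=[dict(dim=list(deploy_shape))]))
    else:
        n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB,
                                 source=lmdb, ntop=2,
                                 transform_param=dict(scale=1. / 255))

    # Layers shared by the train, test and deploy flavors.
    n.conv1 = L.Convolution(n.data, kernel_size=6,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv3 = L.Convolution(n.pool2, kernel_size=4,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool3 = L.Pooling(n.conv3, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv4 = L.Convolution(n.pool3, kernel_size=2,
                            num_output=48, weight_filler=dict(type='xavier'))
    n.pool4 = L.Pooling(n.conv4, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.fc1 = L.InnerProduct(n.pool4, num_output=50,
                           weight_filler=dict(type='xavier'))
    n.drop1 = L.Dropout(n.fc1, dropout_param=dict(dropout_ratio=0.5))
    n.score = L.InnerProduct(n.drop1, num_output=2,
                             weight_filler=dict(type='xavier'))

    if deploy:
        # Prediction layer: class probabilities instead of a loss.
        n.prob = L.Softmax(n.score)
    else:
        # Loss layer for the train and test flavors.
        n.loss = L.SoftmaxWithLoss(n.score, n.label)

    return n.to_proto()
```

Keeping a single function for all three flavors means the shared layers are written once, so the deploy definition cannot drift out of sync with the one used for training.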
Now call the function:
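One way to write this call, as a sketch: the helper name `write_deploy_prototxt` and the file name `net_deploy.prototxt` are hypothetical, and the helper assumes the function it is given follows the `custom_net(lmdb, batch_size)` signature and returns the deploy definition when `lmdb` is `None`.

```python
def write_deploy_prototxt(net_fn, path='net_deploy.prototxt'):
    """Write the deploy flavor of a net definition to `path`.

    net_fn is assumed to accept (lmdb, batch_size) and to build the
    deploy flavor when lmdb is None; batch_size is unused in that case.
    """
    with open(path, 'w') as f:
        f.write(str(net_fn(None, None)))
    return path
```

With a `custom_net` that handles `lmdb=None`, the call would be `write_deploy_prototxt(custom_net)`, mirroring how the train and test files are written in the question.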
As you can see, there are two modifications to the prototxt (conditioned on lmdb being None):
The first: instead of a "Data" layer, you have the declarative "Input" layer, which declares only "data" and no "label".
The second change is the output layer: instead of a loss layer, you have a prediction layer (see, e.g., this answer).
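To illustrate what a prediction layer produces in place of a loss, here is a small pure-Python sketch. It assumes the prediction layer is a softmax over the two-class "score" output, which is a common choice but is not stated explicitly above; the example scores are made up.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of the "score" layer for one input.
probs = softmax([2.0, 0.5])
predicted_class = probs.index(max(probs))  # index of the most probable class
```

A loss layer would instead compare these probabilities against a label, which is exactly what is unavailable for unseen data.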