According to Andrej's blog, a Convolutional Layer with parameter sharing introduces F x F x D weights per filter, for a total of (F x F x D) x K weights and K biases.
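To make the formula concrete, here is a tiny helper that computes the parameter count of one conv layer. The function name is mine, not from the blog or from TensorFlow:

def conv_param_count(F, D, K):
    # (F * F * D) weights per filter, K filters, plus K biases
    return (F * F * D) * K + K

print(conv_param_count(F=3, D=1, K=32))  # 288 weights + 32 biases = 320 parameters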
In my TensorFlow code, I have an architecture like this (where D = 1):
conv1 : F = 3, K = 32, S = 1, P = 1.
pool1 :
conv2 and so on...
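For reference, here is a minimal sketch of what I mean by conv1 and pool1 in TensorFlow. The variable names, the 2x2 max-pool, and x_image (the reshaped input, defined elsewhere) are my assumptions; the [3, 3, 1, 32] filter shape corresponds to F = 3, D = 1, K = 32, and padding='SAME' gives P = 1 for a 3x3 filter:

import tensorflow as tf

# conv1: 3x3 filters, 1 input channel, 32 output channels (F=3, D=1, K=32), stride 1
W_conv1 = tf.Variable(tf.truncated_normal([3, 3, 1, 32], stddev=0.1))
b_conv1 = tf.Variable(tf.constant(0.1, shape=[32]))
h_conv1 = tf.nn.relu(tf.nn.conv2d(x_image, W_conv1,
                                  strides=[1, 1, 1, 1], padding='SAME') + b_conv1)

# pool1: assumed 2x2 max-pooling with stride 2
h_pool1 = tf.nn.max_pool(h_conv1, ksize=[1, 2, 2, 1],
                         strides=[1, 2, 2, 1], padding='SAME')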
According to the formula:
A model generated with F = 3 for conv1 should have 9 x K weights in that layer, i.e. a smaller model, and
a model generated with F = 5 should have 25 x K weights, i.e. a bigger model.
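Plugging the conv1 numbers from above into the formula (the byte figure assumes 4-byte float32 variables):

print((3 * 3 * 1) * 32 + 32)  # F = 3: 288 weights + 32 biases = 320 parameters
print((5 * 5 * 1) * 32 + 32)  # F = 5: 800 weights + 32 biases = 832 parameters
# The difference is 512 weights, i.e. about 2 KB at 4 bytes per float32.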
Question
In my code, when I write out the model files for both these cases, I see that the .ckpt file is about 380 MB for F = 3 and 340 MB for F = 5. Am I missing something?
Code: Here's the reference code for saving the variables to a checkpoint and printing the model size.
''' Run the session and save the model '''
import numpy as np
import tensorflow as tf

# mnist, x, y_, keep_prob, accuracy, train_step and sess are defined
# earlier in the script, where the graph is built.

# Add a saver here
saver = tf.train.Saver()

# Run session
sess.run(tf.initialize_all_variables())
for i in range(201):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})

# Save model
save_path = saver.save(sess, "/Users/voladoddi/Desktop/dropmodel.ckpt")
print("Model saved in file: %s" % save_path)

# Test
print("test accuracy %g" % accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

# Print model size (parameter count * 4 bytes per float32 variable).
num_params = 0
for v in tf.all_variables():
    num_params += np.prod(v.get_shape().as_list())
print(num_params * 4 / (1024 ** 2), "MB")
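To see where the bytes in the checkpoint actually come from, a per-variable breakdown might help; this is just a sketch that reuses np, tf and the variables from the snippet above:

# Per-variable breakdown: name, shape and size in MB (assuming 4-byte float32).
for v in tf.all_variables():
    shape = v.get_shape().as_list()
    size_mb = np.prod(shape) * 4 / (1024 ** 2)
    print("%-40s %-20s %.2f MB" % (v.name, shape, size_mb))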