I would like to run a given model both on the train set (is_training=True
) and on the validation set (is_training=False
), specifically with how dropout
is applied. Right now the prebuilt models expose a parameter is_training
that is passed it the dropout
layer when building the network. The issue is that If I call the method twice with different values of is_training
, I will get two different networks that do no share weights (I think?). How do I go about getting the two networks to share the same weights such that I can run the network that I have trained on the validation set?
问题:
回答1:
I wrote a solution with your comment to use Overfeat in train and test mode. (I couldn't test it so you can check if it works?)
First some imports and parameters:
import tensorflow as tf
slim = tf.contrib.slim
overfeat = tf.contrib.slim.nets.overfeat
batch_size = 32
inputs = tf.placeholder(tf.float32, [batch_size, 231, 231, 3])
dropout_keep_prob = 0.5
num_classes = 1000
In train mode, we pass a normal scope to the function overfeat
:
scope = 'overfeat'
is_training = True
output = overfeat.overfeat(inputs, num_classes, is_training,
dropout_keep_prob, scope=scope)
Then in test mode, we create the same scope but with reuse=True
.
scope = tf.VariableScope(reuse=True, name='overfeat')
is_training = False
output = overfeat.overfeat(inputs, num_classes, is_training,
dropout_keep_prob, scope=scope)
回答2:
you can just use a placeholder for is_training:
isTraining = tf.placeholder(tf.bool)
# create nn
net = ...
net = slim.dropout(net,
keep_prob=0.5,
is_training=isTraining)
net = ...
# training
sess.run([net], feed_dict={isTraining: True})
# testing
sess.run([net], feed_dict={isTraining: False})
回答3:
It depends on the case, the solutions are different.
My first option would be to use a different process to do the evaluation. You only need to check that there is a new checkpoint and load that weights into the evaluation network (with is_training=False
):
checkpoint = tf.train.latest_checkpoint(self.checkpoints_path)
# wait until a new check point is available
while self.lastest_checkpoint == checkpoint:
time.sleep(30) # sleep 30 seconds waiting for a new checkpoint
checkpoint = tf.train.latest_checkpoint(self.checkpoints_path)
logging.info('Restoring model from {}'.format(checkpoint))
self.saver.restore(session, checkpoint)
self.lastest_checkpoint = checkpoint
The second option is after every epoch you unload the graph and create a new evaluation graph. This solution waste a lot of time loading and unloading graphs.
The third option is to share the weights. But feeding these networks with queues or dataset can lead to issues, so you have to be very careful. I only use this for Siamese networks.
with tf.variable_scope('the_scope') as scope:
your_model(is_training=True)
scope.reuse_variables()
your_model(is_training=False)