TensorFlow (tf-slim) Model with is_training True and False

Posted 2019-07-11 03:32

I would like to run a given model both on the train set (is_training=True) and on the validation set (is_training=False), specifically because of how dropout is applied. Right now the prebuilt models expose a parameter is_training that is passed to the dropout layer when building the network. The issue is that if I call the method twice with different values of is_training, I will get two different networks that do not share weights (I think?). How do I go about getting the two networks to share the same weights, so that I can run the network that I have trained on the validation set?

3 Answers
狗以群分
#2 · 2019-07-11 03:44

Based on your comment, I wrote a solution that uses Overfeat in train and test mode. (I couldn't test it myself, so could you check whether it works?)

First some imports and parameters:

import tensorflow as tf
slim = tf.contrib.slim
overfeat = tf.contrib.slim.nets.overfeat

batch_size = 32
inputs = tf.placeholder(tf.float32, [batch_size, 231, 231, 3])
dropout_keep_prob = 0.5
num_classes = 1000

In train mode, we pass a normal scope to the function overfeat:

scope = 'overfeat'
is_training = True

output = overfeat.overfeat(inputs, num_classes, is_training,         
                           dropout_keep_prob, scope=scope)

Then in test mode, we create the same scope but with reuse=True.

scope = tf.VariableScope(reuse=True, name='overfeat')
is_training = False

output = overfeat.overfeat(inputs, num_classes, is_training,         
                           dropout_keep_prob, scope=scope)
看我几分像从前
#3 · 2019-07-11 04:00

You can just use a placeholder for is_training:

isTraining = tf.placeholder(tf.bool)

# create nn
net = ...
net = slim.dropout(net,
                   keep_prob=0.5,
                   is_training=isTraining)
net = ...

# training
sess.run([net], feed_dict={isTraining: True})

# testing
sess.run([net], feed_dict={isTraining: False})
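To see why toggling is_training matters, here is a toy pure-Python sketch of inverted dropout (my own illustration, not TensorFlow's implementation): at train time each value is dropped with probability 1 - keep_prob and the survivors are scaled by 1/keep_prob; at test time the input passes through unchanged, which is exactly what flipping the placeholder achieves.

```python
import random

def toy_dropout(values, keep_prob, is_training):
    """Inverted dropout: drop each value with prob 1 - keep_prob at train
    time and scale survivors by 1/keep_prob; identity at test time."""
    if not is_training:
        return list(values)
    return [v / keep_prob if random.random() < keep_prob else 0.0
            for v in values]

random.seed(0)
x = [1.0, 2.0, 3.0, 4.0]
train_out = toy_dropout(x, keep_prob=0.5, is_training=True)   # some zeros, rest doubled
test_out = toy_dropout(x, keep_prob=0.5, is_training=False)   # unchanged
```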
我想做一个坏孩纸
#4 · 2019-07-11 04:01

The right solution depends on the case.

My first option would be to use a separate process for evaluation. You only need to check whether there is a new checkpoint and load those weights into the evaluation network (built with is_training=False):

checkpoint = tf.train.latest_checkpoint(self.checkpoints_path)
# wait until a new checkpoint is available
while self.latest_checkpoint == checkpoint:
    time.sleep(30)  # sleep 30 seconds waiting for a new checkpoint
    checkpoint = tf.train.latest_checkpoint(self.checkpoints_path)
logging.info('Restoring model from {}'.format(checkpoint))
self.saver.restore(session, checkpoint)
self.latest_checkpoint = checkpoint
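The waiting step itself is plain polling, independent of TensorFlow. As a standard-library sketch, a hypothetical wait_for_new_checkpoint helper (using os.listdir in place of tf.train.latest_checkpoint, with a poll cap added so it can terminate) might look like:

```python
import os
import tempfile
import time

def wait_for_new_checkpoint(checkpoints_path, last_seen,
                            poll_secs=30, max_polls=None):
    """Poll a directory until a file newer than `last_seen` shows up.
    Returns the new name, or None if max_polls is exhausted."""
    polls = 0
    while True:
        files = sorted(os.listdir(checkpoints_path))
        latest = files[-1] if files else None
        if latest is not None and latest != last_seen:
            return latest
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return None
        time.sleep(poll_secs)

# demo in a throwaway directory
ckpt_dir = tempfile.mkdtemp()
open(os.path.join(ckpt_dir, 'model.ckpt-100'), 'w').close()
found = wait_for_new_checkpoint(ckpt_dir, last_seen=None,
                                poll_secs=0, max_polls=1)
```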

The second option is to unload the graph after every epoch and build a new evaluation graph. This solution wastes a lot of time loading and unloading graphs.

The third option is to share the weights. But feeding these networks with queues or datasets can lead to issues, so you have to be very careful. I only use this for Siamese networks.

with tf.variable_scope('the_scope') as scope:
    your_model(is_training=True)
    scope.reuse_variables()
    your_model(is_training=False)
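If it helps to see why the reuse trick shares weights, here is a toy pure-Python analogy of variable_scope reuse (my own illustration, not TensorFlow code): a store hands back the same variable object on every call with the same name, so the train and eval "networks" literally operate on one set of weights.

```python
class VariableStore:
    """Toy stand-in for tf.variable_scope with reuse: the first
    get_variable call creates the variable, later calls with the same
    name return the identical object."""
    def __init__(self):
        self._vars = {}

    def get_variable(self, name, initializer):
        if name not in self._vars:
            self._vars[name] = initializer()
        return self._vars[name]

store = VariableStore()

def your_model(store, is_training):
    # is_training would only change dropout behavior, never the weights
    w = store.get_variable('the_scope/w', lambda: [0.1, 0.2])
    return w

train_w = your_model(store, is_training=True)
eval_w = your_model(store, is_training=False)
# train_w and eval_w are the very same object: updating one updates both
```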