The Problem
I have a Python script that uses TensorFlow to create a multilayer perceptron net (with dropout) in order to do binary classification. Even though I've been careful to set both the Python and TensorFlow seeds, I get non-repeatable results. If I run once and then run again, I get different results. I can even run once, quit Python, restart Python, run again and get different results.
What I've Tried
I know other people have posted questions about getting non-repeatable results in TensorFlow (e.g., "How to get stable results...", "set_random_seed not working...", "How to get reproducible result in TensorFlow"), and the cause usually turns out to be an incorrect use or understanding of tf.set_random_seed(). I've made sure to implement the solutions given, but that has not solved my problem.
A common mistake is not realizing that tf.set_random_seed() is only a graph-level seed and that running the script multiple times will alter the graph, explaining the non-repeatable results. I used the following statement to print out the entire graph and verified (via diff) that the graph is the same even when the results are different.
print [n.name for n in tf.get_default_graph().as_graph_def().node]
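One way to do that comparison without leaving Python is sketched below. It is only an illustration: the node names are hypothetical stand-ins for the printed output of two runs, and diff_node_names is a helper name made up for this example. It uses the standard-library difflib module and Python 3 print syntax.

```python
import difflib


def diff_node_names(run_a, run_b):
    """Return the unified-diff lines between two lists of graph node names."""
    return list(difflib.unified_diff(run_a, run_b,
                                     fromfile="run_a", tofile="run_b",
                                     lineterm=""))


# Hypothetical node names standing in for the output captured from two runs.
first_run = ["Placeholder", "Placeholder_1", "random_normal/shape", "Variable"]
second_run = ["Placeholder", "Placeholder_1", "random_normal/shape", "Variable"]
changed_run = first_run + ["Variable_1"]

# Identical graphs produce no diff lines at all.
print(diff_node_names(first_run, second_run))

# A graph that gained a node shows up as a "+" line in the diff.
for line in diff_node_names(first_run, changed_run):
    print(line)
```

An empty result only says that the node names match. It says nothing about the seeds stored on the graph or on individual operations.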
I've also used function calls like tf.reset_default_graph() and tf.get_default_graph().finalize() to avoid any changes to the graph, even though this is probably overkill.
The (Relevant) Code
My script is ~360 lines long, so here are the relevant lines (with snipped code indicated). Any items that are in ALL_CAPS are constants that are defined in my Parameters block below.
import random # Used below for random.seed() and random.sample()
import numpy as np
import tensorflow as tf
from copy import deepcopy
from tqdm import tqdm # Progress bar
# --------------------------------- Parameters ---------------------------------
(snip)
# --------------------------------- Functions ---------------------------------
(snip)
# ------------------------------ Obtain Train Data -----------------------------
(snip)
# ------------------------------ Obtain Test Data -----------------------------
(snip)
random.seed(12345)
tf.set_random_seed(12345)
(snip)
# ------------------------- Build the TensorFlow Graph -------------------------
tf.reset_default_graph()
with tf.Graph().as_default():
    x = tf.placeholder("float", shape=[None, N_INPUT])
    y_ = tf.placeholder("float", shape=[None, N_CLASSES])
    # Store layers weight & bias
    weights = {
        'h1': tf.Variable(tf.random_normal([N_INPUT, N_HIDDEN_1])),
        'h2': tf.Variable(tf.random_normal([N_HIDDEN_1, N_HIDDEN_2])),
        'h3': tf.Variable(tf.random_normal([N_HIDDEN_2, N_HIDDEN_3])),
        'out': tf.Variable(tf.random_normal([N_HIDDEN_3, N_CLASSES]))
    }
    biases = {
        'b1': tf.Variable(tf.random_normal([N_HIDDEN_1])),
        'b2': tf.Variable(tf.random_normal([N_HIDDEN_2])),
        'b3': tf.Variable(tf.random_normal([N_HIDDEN_3])),
        'out': tf.Variable(tf.random_normal([N_CLASSES]))
    }
    # Construct model
    pred = multilayer_perceptron(x, weights, biases, USE_DROP_LAYERS, DROP_KEEP_PROB)
    mean1 = tf.reduce_mean(weights['h1'])
    mean2 = tf.reduce_mean(weights['h2'])
    mean3 = tf.reduce_mean(weights['h3'])
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y_))
    regularizers = (tf.nn.l2_loss(weights['h1']) + tf.nn.l2_loss(biases['b1']) +
                    tf.nn.l2_loss(weights['h2']) + tf.nn.l2_loss(biases['b2']) +
                    tf.nn.l2_loss(weights['h3']) + tf.nn.l2_loss(biases['b3']))
    cost += COEFF_REGULAR * regularizers
    optimizer = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cost)
    out_labels = tf.nn.softmax(pred)
    sess = tf.InteractiveSession()
    sess.run(tf.initialize_all_variables())
    tf.get_default_graph().finalize() # Lock the graph as read-only
    #Print the default graph in text form
    print [n.name for n in tf.get_default_graph().as_graph_def().node]
    # --------------------------------- Training ----------------------------------
    print "Start Training"
    pbar = tqdm(total = TRAINING_EPOCHS)
    for epoch in range(TRAINING_EPOCHS):
        avg_cost = 0.0
        batch_iter = 0
        train_outfile.write(str(epoch))
        while batch_iter < BATCH_SIZE:
            train_features = []
            train_labels = []
            batch_segments = random.sample(train_segments, 20)
            for segment in batch_segments:
                train_features.append(segment[0])
                train_labels.append(segment[1])
            sess.run(optimizer, feed_dict={x: train_features, y_: train_labels})
            line_out = "," + str(batch_iter) + "\n"
            train_outfile.write(line_out)
            line_out = ",," + str(sess.run(mean1, feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(mean2, feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(mean3, feed_dict={x: train_features, y_: train_labels})) + "\n"
            train_outfile.write(line_out)
            avg_cost += sess.run(cost, feed_dict={x: train_features, y_: train_labels})/BATCH_SIZE
            batch_iter += 1
        line_out = ",,,,," + str(avg_cost) + "\n"
        train_outfile.write(line_out)
        pbar.update(1) # Increment the progress bar by one
    train_outfile.close()
    print "Completed training"
    # ------------------------------ Testing & Output ------------------------------
    keep_prob = 1.0 # Do not use dropout when testing
    print "now reducing mean"
    print(sess.run(mean1, feed_dict={x: test_features, y_: test_labels}))
    print "TRUE LABELS"
    print(test_labels)
    print "PREDICTED LABELS"
    pred_labels = sess.run(out_labels, feed_dict={x: test_features})
    print(pred_labels)
    output_accuracy_results(pred_labels, test_labels)
    sess.close()
What's not repeatable
As you can see, I'm outputting results during each epoch to a file and also printing out accuracy numbers at the end. None of these match from run to run, even though I believe I've set the seed(s) correctly. I've used both random.seed(12345) and tf.set_random_seed(12345).
Please let me know if I need to provide more information. And thanks in advance for any help.
-DG
Set-up details
TensorFlow version 0.8.0 (CPU only)
Enthought Canopy version 1.7.2 (Python 2.7, not 3.+)
Mac OS X version 10.11.3
You need to set an operation-level seed in addition to the graph-level seed.
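A minimal sketch of what setting both seeds can look like follows. It is not the original example from this answer, and it rests on some assumptions: a TensorFlow version that provides the tf.compat.v1 namespace (on 0.8, the version in the question, the equivalent calls would be tf.set_random_seed and tf.nn.dropout with its seed argument), and dropout_once is a helper name invented here. The import is guarded only so the file still loads on a machine without TensorFlow.

```python
try:
    import tensorflow as tf
except ImportError:  # TensorFlow not installed; dropout_once is then unusable
    tf = None


def dropout_once(graph_seed, op_seed):
    """Build a fresh graph, apply seeded dropout to five ones, return the result."""
    graph = tf.Graph()
    with graph.as_default():
        # Graph-level seed, set on the graph that will hold the ops.
        tf.compat.v1.set_random_seed(graph_seed)
        a = tf.constant([1.0, 1.0, 1.0, 1.0, 1.0], dtype=tf.float32)
        # Operation-level seed, passed directly to the random op.
        b = tf.compat.v1.nn.dropout(a, keep_prob=0.5, seed=op_seed)
        with tf.compat.v1.Session(graph=graph) as sess:
            return sess.run(b).tolist()


if tf is not None:
    first = dropout_once(graph_seed=1, op_seed=1)
    second = dropout_once(graph_seed=1, op_seed=1)
    # With both seeds fixed, the two dropout masks are expected to match.
    print(first == second)
```

Each call builds its own graph and session, so the comparison mimics two separate runs of a script rather than two evaluations inside one session.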
What I did to get reproducible results when training and testing a huge deep network using TensorFlow:
Set the seed argument of every function that accepts one (SEED is what is used). Here are a few of those functions: tf.nn.dropout, tf.contrib.layers.xavier_initializer, etc.
Note: This step might seem unreasonable because we are already using tf.set_random_seed to set a seed for TensorFlow, but trust me, you need this! See Yaroslav's answer.

See this tensorflow github issue. Some operations on the GPU are not fully deterministic (speed vs precision).
I also observed that for the seed to have any effect, tf.set_random_seed(...) must be called before the Session is created. You should also either completely restart the Python interpreter every time you run your code, or call tf.reset_default_graph() at the start.

In TensorFlow 2.0, tf.set_random_seed(42) has changed to tf.random.set_seed(42).
https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/random/set_seed
That should be the only seed necessary if just using TensorFlow.

Just to add to Yaroslav's answer, you should also set the NumPy seed in addition to the operation-level and graph-level seeds, as some backend operations depend on NumPy. Calling np.random.seed() with a fixed integer did the trick for me with TensorFlow v1.1.0.
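
Pulling the seeding advice from these answers together, a helper along the following lines is one possible shape. It is a sketch, not a tested recipe for any particular TensorFlow release: seed_everything is a made-up name, the NumPy and TensorFlow imports are treated as optional, and the version check simply looks for tf.random.set_seed before falling back to the older tf.set_random_seed.

```python
import random


def seed_everything(seed):
    """Seed Python's RNG, plus NumPy and TensorFlow when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)  # a fixed integer; no argument means a fresh random seed
    except ImportError:
        pass
    try:
        import tensorflow as tf
    except ImportError:
        return
    if hasattr(tf, "random") and hasattr(tf.random, "set_seed"):
        tf.random.set_seed(seed)   # TensorFlow 2.x name
    else:
        tf.set_random_seed(seed)   # TensorFlow 1.x and earlier name


# Batch selection through random.sample repeats once the seed is reset.
seed_everything(12345)
first = random.sample(range(100), 5)
seed_everything(12345)
second = random.sample(range(100), 5)
print(first == second)  # True
```

In TensorFlow 1.x the graph-level seed is stored on whichever graph is the default at the moment of the call, so with an explicit tf.Graph() the call belongs inside the with graph.as_default(): block that builds the ops. Operation-level seeds still have to be passed to each random op separately.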