I am experimenting with some code in Jupyter and keep getting stuck here. Things actually work fine if I remove the line starting with "optimizer = ..." and all references to it. But if I keep this line in the code, it raises an error.
I am not pasting all the other functions here, to keep the code at a readable size. I hope someone more experienced can see at once what the problem is.
Note that there are 5, 4, 3, and 2 units in the input layer, the two hidden layers, and the output layer, respectively.
CODE:
tf.reset_default_graph()
num_units_in_layers = [5,4,3,2]
X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)
init = tf.global_variables_initializer()
my_sess = tf.Session()
my_sess.run(init)
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters, my_sess)
#my_sess.run(parameters) # Do I need to run this? Or is it obsolete?
cost = compute_cost(ZL, Y, my_sess, parameters, batch_size=3, lambd=0.05)
optimizer = tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
_, minibatch_cost = my_sess.run([optimizer, cost],
                                feed_dict={X: minibatch_X,
                                           Y: minibatch_Y})
print(minibatch_cost)
my_sess.close()
ERROR:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-321-135b9fc18268> in <module>()
16 cost = compute_cost(ZL, Y, my_sess, parameters, 3, 0.05)
17
---> 18 optimizer = tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
19 _ , minibatch_cost = my_sess.run([optimizer, cost],
20 feed_dict={X: minibatch_X,
~/.local/lib/python3.5/site-packages/tensorflow/python/training/optimizer.py in minimize(self, loss, global_step, var_list, gate_gradients, aggregation_method, colocate_gradients_with_ops, name, grad_loss)
362 "No gradients provided for any variable, check your graph for ops"
363 " that do not support gradients, between variables %s and loss %s." %
--> 364 ([str(v) for _, v in grads_and_vars], loss))
365
366 return self.apply_gradients(grads_and_vars, global_step=global_step,
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'weights/W1:0' shape=(4, 5) dtype=float32_ref>", "<tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>", "<tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>", "<tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>", "<tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>", "<tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>"] and loss Tensor("Add_3:0", shape=(), dtype=float32).
Note that if I run
print(tf.trainable_variables())
just before the "optimizer = ..." line, I actually see my trainable variables there.
[<tf.Variable 'weights/W1:0' shape=(4, 5) dtype=float32_ref>, <tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>, <tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>, <tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>, <tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>, <tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>]
Does anyone have an idea what the problem could be?
EDIT, adding some more info: in case you would like to see how I create and initialize my parameters, here is the code. Maybe there is something wrong with this part, but I don't see what.
def get_nn_parameter(variable_scope, variable_name, dim1, dim2):
    with tf.variable_scope(variable_scope, reuse=tf.AUTO_REUSE):
        v = tf.get_variable(variable_name,
                            [dim1, dim2],
                            trainable=True,
                            initializer=tf.contrib.layers.xavier_initializer())
    return v

def initialize_layer_parameters(num_units_in_layers):
    parameters = {}
    L = len(num_units_in_layers)
    for i in range(1, L):
        temp_weight = get_nn_parameter("weights",
                                       "W" + str(i),
                                       num_units_in_layers[i],
                                       num_units_in_layers[i-1])
        parameters.update({"W" + str(i): temp_weight})
        temp_bias = get_nn_parameter("biases",
                                     "b" + str(i),
                                     num_units_in_layers[i],
                                     1)
        parameters.update({"b" + str(i): temp_bias})
    return parameters
#
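As a quick sanity check on the initialization code above, the expected shapes can be worked out without TensorFlow. This is only a sketch: the helper name layer_parameter_shapes is hypothetical and not part of the original code. It mirrors the indexing in initialize_layer_parameters, so its output can be compared against the shapes listed in the error message.

```python
def layer_parameter_shapes(num_units_in_layers):
    """Return the expected shape of every W and b, keyed like the parameters dict.

    Hypothetical helper for illustration only; it creates no TensorFlow variables.
    """
    shapes = {}
    for i in range(1, len(num_units_in_layers)):
        # W_i maps the activations of layer i-1 (column vectors) to layer i.
        shapes["W" + str(i)] = (num_units_in_layers[i], num_units_in_layers[i - 1])
        # b_i is a column vector that broadcasts across the minibatch.
        shapes["b" + str(i)] = (num_units_in_layers[i], 1)
    return shapes


shapes = layer_parameter_shapes([5, 4, 3, 2])
for name in sorted(shapes):
    print(name, shapes[name])
```

For [5, 4, 3, 2] these shapes agree with the ones in the ValueError, which suggests the variables themselves are created as intended and the problem lies elsewhere in the graph.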
ADDENDUM
I got it working. Instead of writing a separate answer, I am adding the corrected version of my code here.
(David's answer below helped a lot.)
I simply removed my_sess as a parameter of my compute_cost function. (I could not make it work previously, but apparently the session is not needed there at all.) I also reordered the statements in my main function so that things are called in the right order: the whole graph, including the optimizer, is built first, and only then is the initializer created and run.
Here is the working version of my cost function and how I call it:
def compute_cost(ZL, Y, parameters, mb_size, lambd):
    logits = tf.transpose(ZL)
    labels = tf.transpose(Y)
    cost_unregularized = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels))
    # Since the parameters dict includes both W and b, divide its length by 2 to get L
    L = len(parameters) // 2
    list_sum_weights = []
    for i in range(0, L):
        list_sum_weights.append(tf.nn.l2_loss(parameters.get("W" + str(i + 1))))
    regularization_effect = tf.multiply((lambd / mb_size), tf.add_n(list_sum_weights))
    cost = tf.add(cost_unregularized, regularization_effect)
    return cost
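To make the cost formula above easier to check by hand, here is a NumPy reference sketch of the same computation. The function name compute_cost_numpy and the toy inputs are assumptions made for illustration; the only TensorFlow fact relied on is that tf.nn.l2_loss(t) computes sum(t ** 2) / 2.

```python
import numpy as np


def compute_cost_numpy(ZL, Y, weights, mb_size, lambd):
    """NumPy reference: mean softmax cross-entropy plus a scaled L2 penalty.

    ZL and Y have shape (num_classes, mb_size), as in the TensorFlow version.
    weights is a plain list of weight matrices (hypothetical stand-in for the dict).
    """
    logits = ZL.T  # (mb_size, num_classes)
    labels = Y.T
    # Numerically stable log-softmax along the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    cross_entropy = -(labels * log_softmax).sum(axis=1)
    cost_unregularized = cross_entropy.mean()
    # Mirror tf.nn.l2_loss: half the sum of squares of each weight matrix.
    l2_terms = [np.sum(W ** 2) / 2.0 for W in weights]
    return cost_unregularized + (lambd / mb_size) * sum(l2_terms)


# Toy inputs: zero logits and one-hot labels give log(2) per example.
ZL = np.zeros((2, 3))
Y = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
weights = [np.ones((2, 3))]
cost = compute_cost_numpy(ZL, Y, weights, mb_size=3, lambd=0.05)
print(cost)  # log(2) + 0.05
```

With zero logits every example contributes log(2), and the single all-ones 2x3 matrix contributes (0.05 / 3) * 3 = 0.05, so the result is easy to verify.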
And here is the main function where I call the compute_cost(..) function:
tf.reset_default_graph()
num_units_in_layers = [5,4,3,2]
X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)
my_sess = tf.Session()
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters)
cost = compute_cost(ZL, Y, parameters, 3, 0.05)
optimizer = tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
init = tf.global_variables_initializer()
my_sess.run(init)
_, minibatch_cost = my_sess.run([optimizer, cost],
                                feed_dict={X: [[-1.,4.,-7.],[2.,6.,2.],[3.,3.,9.],[8.,4.,4.],[5.,3.,5.]],
                                           Y: [[0.6, 0., 0.3], [0.4, 0., 0.7]]})
print(minibatch_cost)
my_sess.close()
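The original forward_propagation_with_relu and compute_cost bodies are not shown, so the exact cause of the "No gradients provided" error cannot be confirmed from this post. One common way to end up with it, which may or may not be what happened here, is to evaluate variables with a session while building the graph and then continue with the returned NumPy arrays. The sketch below illustrates that pattern and the ordering of the initializer relative to minimize(). It assumes TensorFlow 1.14+ or 2.x with the tensorflow.compat.v1 module; the variable name W_demo is made up for the example.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: the v1 compatibility API is available

tf.disable_v2_behavior()
tf.reset_default_graph()

X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
W = tf.get_variable("W_demo", [4, 5], initializer=tf.ones_initializer())

sess = tf.Session()
sess.run(W.initializer)

# Broken pattern: sess.run returns a NumPy array, so the loss is built
# from a constant and has no path back to the variable W.
W_value = sess.run(W)
loss_broken = tf.reduce_sum(tf.square(tf.matmul(W_value, X)))
grads_broken = tf.gradients(loss_broken, [W])

# Working pattern: use the variable itself while building the graph.
loss_ok = tf.reduce_sum(tf.square(tf.matmul(W, X)))
grads_ok = tf.gradients(loss_ok, [W])

print(grads_broken)  # the entry for W is None
print(grads_ok)      # the entry for W is a gradient tensor

# Adam creates extra slot variables inside minimize(), so the initializer
# is created only after the optimizer op exists.
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss_ok)
sess.run(tf.global_variables_initializer())
_, loss_value = sess.run([train_op, loss_ok],
                         feed_dict={X: np.ones((5, 3), dtype=np.float32)})
print(loss_value)
sess.close()
```

Calling tf.gradients(cost, tf.trainable_variables()) right before the optimizer line is a cheap way to see which variables, if any, are disconnected from the loss.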