I have a regular Keras model called e, and I would like to compare its output for both y_pred and y_true in my custom loss function.
from keras import backend as K
def custom_loss(y_true, y_pred):
    return K.mean(K.square(e.predict(y_pred)-e.predict(y_true)), axis=-1)
I am getting the error: AttributeError: 'Tensor' object has no attribute 'ndim'
This is because y_true and y_pred are both tensor objects, while keras.model.predict expects to be passed a numpy.array.
Any idea how I may succeed in using my keras.model in my custom loss function? I am open to getting the output of a specified layer if need be, or to converting my keras.model to a tf.estimator object (or anything else).
First, let's try to understand the error message you're getting:
AttributeError: 'Tensor' object has no attribute 'ndim'
Let's take a look at the Keras documentation and find the predict method of the Keras model. There we can see the description of its input parameter:
x: the input data, as a Numpy array.
So, the model is trying to read the ndim attribute of a NumPy array, because it expects an array as input. On the other hand, a custom loss function in Keras receives tensors as inputs. So don't write Python code inside it that needs concrete values: the function body is not executed for each batch during evaluation. It is just called to construct the computational graph.
Okay, now that we have found out the meaning behind that error message, how can we use a Keras model inside a custom loss function? Simple! We just need to get the evaluation graph of the model. Try something like this:
from keras import backend as K

def custom_loss(y_true, y_pred):
    # Your model e exists in global scope
    # Get the layers of your model. Applying them in order assumes that e
    # is a plain stack of layers, such as a Sequential model.
    layers = e.layers
    # Construct a graph to evaluate your other model on y_pred
    eval_pred = y_pred
    for layer in layers:
        eval_pred = layer(eval_pred)
    # Construct a graph to evaluate your other model on y_true
    eval_true = y_true
    for layer in layers:
        eval_true = layer(eval_true)
    # Now do what you wanted to do with outputs.
    # Note that we are not returning the values, but a tensor.
    return K.mean(K.square(eval_pred - eval_true), axis=-1)
Please note that the code above is not tested. However, the general idea stays the same regardless of the implementation: you need to construct a graph through which y_true and y_pred flow to the final operations.
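A shorter variant of the same idea may also work: a Keras model is itself callable on tensors, so e(y_pred) returns a tensor where e.predict(y_pred) would expect a NumPy array. The sketch below assumes tf.keras (TensorFlow 2) rather than the standalone keras import used in the question. The two small Dense models and the random data are hypothetical stand-ins, there only to make the example self-contained.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the existing model `e` (4 inputs -> 3 outputs).
e = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3),
])

def custom_loss(y_true, y_pred):
    # Calling the model (instead of .predict) maps tensors to tensors,
    # so this stays inside the computational graph.
    diff = e(y_pred) - e(y_true)
    return tf.reduce_mean(tf.square(diff), axis=-1)

# Hypothetical model being trained; its output shape must match e's input shape.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])
model.compile(optimizer="adam", loss=custom_loss)

# Random placeholder data, just to show that training runs end to end.
x = np.random.rand(16, 8).astype("float32")
y = np.random.rand(16, 4).astype("float32")
history = model.fit(x, y, epochs=1, batch_size=8, verbose=0)
```

Unlike the layer-by-layer loop, calling e directly does not depend on the model being a simple stack of layers.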