Accessing gradient values of keras model outputs with respect to inputs

Posted 2020-02-28 19:23

Question:

I made a pretty simple NN model to do some non-linear regression for me in Keras, as an introduction exercise. I uploaded my jupyter notebook as a gist here (it renders properly on github), and it is pretty short and to the point.

It just fits the 1D function y = (x - 5)^2 / 25.

I know that Theano and Tensorflow are, at their core, graph-based derivative (gradient) passing frameworks, and that using the gradients of a loss function with respect to the weights for gradient-step-based optimization is their main purpose.

But what I'm trying to get a sense of is whether I have access to something that, given a trained model, can approximate derivatives of the output layer with respect to the inputs for me (not of the loss with respect to the weights). So for this case, I would want y' = 2(x - 5)/25.0 estimated via the network's derivative graph, at an indicated value of the input x, in the network's currently trained state.

Do I have any options in either the Keras or Theano/TF backend APIs to do this, or do I need to do my own chain ruling somehow with the weights (or maybe by adding my own non-trainable "identity" layers or something)? In my notebook, you can see me trying a few approaches based on what I was able to find so far, but without a ton of success.

To make it concrete, I have a working keras model with the structure:

from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
# 1d input
model.add(Dense(64, input_dim=1, activation='relu'))
model.add(Activation("linear"))
model.add(Dense(32, activation='relu'))
model.add(Activation("linear"))
model.add(Dense(32, activation='relu'))
# 1d output
model.add(Dense(1))

model.compile(loss='mse', optimizer='adam', metrics=["accuracy"])
model.fit(x, y,                      # x, y, x_test, y_test are defined earlier in the notebook
          batch_size=10,
          epochs=25,
          verbose=0,
          validation_data=(x_test, y_test))

I would like to estimate the derivative of output y with respect to input x at, say, x = 0.5.
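As a baseline for whatever I manage to extract from the graph, here is a crude central-difference estimate of that same quantity using only model.predict; this is the numerical approximation I'd like to replace with the network's own gradient graph (a minimal sketch, assuming the trained model above):

import numpy as np

# Finite-difference sanity check of dy/dx at x = 0.5, using only model.predict
eps = 1e-3
x0 = np.array([[0.5]])
dy_dx_approx = (model.predict(x0 + eps) - model.predict(x0 - eps)) / (2 * eps)
print(dy_dx_approx)   # should approach 2*(0.5 - 5)/25 = -0.36 as the fit improves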

All of my attempts to extract gradient values, based on past answers I could find, have ended in syntax errors. From a high-level point of view, is this a supported feature of Keras, or will any solution be backend-specific?

Answer 1:

As you mention, Theano and TF are symbolic, so doing a derivative should be quite easy:

import theano.tensor as T
import keras.backend as K

# Symbolic gradient of the (scalar) output for the first sample w.r.t. the input tensor
J = T.grad(model.output[0, 0], model.input)
# Compile a callable: feed an input batch (plus the learning phase flag) to get dy/dx
jacobian = K.function([model.input, K.learning_phase()], [J])

First you compute the symbolic gradient (T.grad) of the output with respect to the input, then you build a function that you can call to do the actual computation. Note that sometimes this is not trivial due to shape issues, since you get one derivative for each element of the input.
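To actually evaluate this at a point like x = 0.5, and as a variant that sticks to the Keras backend API (so the same code should run with either the Theano or the TensorFlow 1.x graph-mode backend), a minimal sketch, assuming the trained model from the question, would be:

import numpy as np
import keras.backend as K

# Evaluate the Theano-based function above at x = 0.5 (the 0 selects test phase)
print(jacobian([np.array([[0.5]]), 0]))

# Backend-agnostic variant: K.gradients returns a list of gradient tensors
grads = K.gradients(model.output, model.input)
get_dy_dx = K.function([model.input, K.learning_phase()], grads)
print(get_dy_dx([np.array([[0.5]]), 0])[0])   # should approach 2*(0.5 - 5)/25 = -0.36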