TensorFlow custom activation function

Posted 2019-08-17 23:46

I implemented a network with TensorFlow and created the model with the following code:

def multilayer_perceptron(x, weights, biases):
    layer_1 = tf.add(tf.matmul(x, weights["h1"]), biases["b1"])
    layer_1 = tf.nn.relu(layer_1)
    out_layer = tf.add(tf.matmul(layer_1, weights["out"]), biases["out"])
    return out_layer

I initialize the weights and biases as follows:

weights = {
    "h": tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    "out": tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
    }

biases = {
    "b": tf.Variable(tf.random_normal([n_hidden_1])),
    "out": tf.Variable(tf.random_normal([n_classes]))
    }

Now I want to use a custom activation function. Therefore I replaced tf.nn.relu(layer_1) with a custom activation function custom_sigmoid(layer_1) which is defined as:

def custom_sigmoid(x):
    # one beta per unit of the layer the activation is applied to
    beta = tf.Variable(tf.random.normal([int(x.get_shape()[1])]))
    return tf.sigmoid(beta * x)

Here beta is meant to be a trainable parameter. I realized that this cannot work as written, since I don't know how to implement the derivative so that TensorFlow can use it.

Question: How can I use a custom activation function in TensorFlow? I would really appreciate any help.

2 Answers
时光不老,我们不散
#2 · 2019-08-18 00:11

I'll try to answer my own question. Here is what I did, and it seems to work:

First I define a custom activation function:

def custom_sigmoid(x, beta_weights):
    return tf.sigmoid(beta_weights * x)

Then I create weights for the activation function:

beta_weights = {
    "beta1": tf.Variable(tf.random_normal([n_hidden_1]))
    }

Finally I add beta_weights to my model function and replace the activation function in multilayer_perceptron():

def multilayer_perceptron(x, weights, biases, beta_weights):
    layer_1 = tf.add(tf.matmul(x, weights["h1"]), biases["b1"])
    #layer_1 = tf.nn.relu(layer_1) # Old
    layer_1 = custom_sigmoid(layer_1, beta_weights["beta1"]) # New
    out_layer = tf.add(tf.matmul(layer_1, weights["out"]), biases["out"])
    return out_layer
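
To make the whole model trainable end to end, here is a minimal sketch of how it can be wired up (TF 1.x style). It assumes the weights, biases and beta_weights dictionaries defined above; the placeholder shapes, the softmax cross-entropy loss and the gradient-descent optimizer are only examples, not part of the original model:

import tensorflow as tf

# Placeholders for inputs and one-hot labels (shapes assumed)
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

logits = multilayer_perceptron(x, weights, biases, beta_weights)

# Any differentiable loss works; softmax cross-entropy is only an example
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))

# beta_weights["beta1"] is an ordinary trainable tf.Variable, so the
# optimizer updates it together with all the other weights and biases
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)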
一纸荒年 Trace。
#3 · 2019-08-18 00:17

That's the beauty of automatic differentiation! You don't need to know how to compute the derivative of your function, as long as you build it entirely from TensorFlow constructs that are themselves differentiable (only a few TensorFlow ops are non-differentiable).

For everything else, TensorFlow computes the derivative for you: any combination of inherently differentiable operations can be used, and you never need to think about the gradient yourself. You can validate this with tf.gradients in a test case, to confirm that TensorFlow is computing the gradient with respect to your cost function.
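
For example, here is a small self-contained check (a sketch of my own, with made-up shapes and a dummy cost) that tf.gradients produces a gradient for beta through the custom sigmoid:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 3])
beta = tf.Variable(tf.ones([3]))
activation = tf.sigmoid(beta * x)             # the custom activation
loss = tf.reduce_mean(tf.square(activation))  # a dummy cost function

# tf.gradients returns a real tensor (not None) because multiply, sigmoid,
# square and reduce_mean are all differentiable ops
grad_beta = tf.gradients(loss, beta)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {x: np.random.rand(4, 3).astype(np.float32)}
    print(sess.run(grad_beta, feed_dict=feed))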

Here's a really nice explanation of automatic differentiation for the curious:

https://alexey.radul.name/ideas/2013/introduction-to-automatic-differentiation/

You can make sure that beta is a trainable parameter by checking that it exists in the collection tf.GraphKeys.TRAINABLE_VARIABLES; that means the optimizer will compute its derivative w.r.t. the cost and update it (if it's not in that collection, you should investigate).
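
A minimal sketch of that check, assuming the beta_weights dictionary from the answer above:

# The trainable collection holds every variable the optimizer will update
trainable = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
print(beta_weights["beta1"] in trainable)  # should print True

# tf.trainable_variables() returns the same collection
for v in tf.trainable_variables():
    print(v.name, v.shape)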
