In TensorFlow, do I need to add a new op for "sinc"?

Posted 2019-05-31 16:11

I am playing with TensorFlow and want to train the weights so that a neuron "fires" only when it produces a value within a certain range, and outputs 0 (or close to 0) when the value falls outside that range.

I'm thinking of doing that by using "sinc" (here) or "Gaussian" (here) as the activation function. Unfortunately, TensorFlow does not provide either as a built-in activation.

Do I need to add a new op for that? TensorFlow already supports all the operations needed to implement sinc or Gaussian, so the gradients should also be available for training.

I have tried using this, but somehow all the weights and biases of the network go to 0.

3 Answers
[account suspended]
#2 · 2019-05-31 16:32

Adding to Laine Mikael's answer: I found that particular sinc implementation produces NaN during the backward pass. Here's an alternative based on how it is implemented in NumPy:

import tensorflow as tf

def sinc(x):
    # Replace values whose magnitude is below 1e-20 with 1e-20 before
    # dividing, so neither the forward nor the backward pass produces NaN.
    x = tf.where(tf.abs(x) < 1e-20, 1e-20 * tf.ones_like(x), x)
    return tf.sin(x) / x
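
As a quick sanity check (a minimal sketch, assuming TF 2.x eager execution and the sinc defined above), the gradient stays finite at x == 0:

x = tf.Variable([-2.0, 0.0, 2.0])
with tf.GradientTape() as tape:
    y = sinc(x)                # the sinc defined above
print(tape.gradient(y, x))     # finite everywhere; 0 at x == 0, matching sinc'(0)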
Bombasti
#3 · 2019-05-31 16:42

A sinc function that can, e.g., be passed as an activation function. A bit messy, but it works.

import tensorflow as tf

def sinc(x):
    # Use 1 at x == 0 (the limit of sin(x)/x) and sin(x)/x everywhere else.
    atzero = tf.ones_like(x)
    atother = tf.divide(tf.sin(x), x)
    return tf.where(tf.equal(x, 0), atzero, atother)

Gaussian:

def gaussian(x):
    # exp(-x^2): close to 1 near x == 0, decaying quickly towards 0 elsewhere.
    sq = tf.square(x)
    neg = tf.negative(sq)
    return tf.exp(neg)
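
For what it's worth, either of these can be passed straight to a Keras layer as its activation. A minimal sketch (assuming tf.keras, the sinc/gaussian defined above, and placeholder layer sizes):

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation=gaussian),  # or activation=sinc
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")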
Explosion°爆炸
#4 · 2019-05-31 16:48

You can implement both of these functions using basic TF ops. From a mathematical perspective I do not recommend periodic (or "quasi-periodic", in general any function whose derivative keeps changing sign) activation functions in neural networks, because they create an enormous number of shallow local optima, so I would advise against sinc.

For Gaussians you may have to take good care of initialization. The tricky thing about this kind of "local" function is that it goes to 0 very quickly, so you have to make sure that your neuron activations initially land in the "active" part when presented with the training data. This is much easier with dot-product based activations (sigmoid, relu, etc.), where all you have to do is deal with the scale; for Gaussians you actually have to make sure that the activations are "in place". A sketch of that idea follows below.
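
To illustrate the initialization point, here is a minimal sketch (the stddev value is only an illustrative assumption for roughly unit-scale inputs): smaller initial weights keep the pre-activations near 0, which is where exp(-z^2) is still "active".

import tensorflow as tf

# Small initial weights and zero biases keep w.x + b near 0 at the start,
# where exp(-(w.x + b)^2) is close to 1 and still has a usable gradient.
small_init = tf.keras.initializers.RandomNormal(stddev=0.05)

layer = tf.keras.layers.Dense(
    32,
    activation=gaussian,           # gaussian from the answer above
    kernel_initializer=small_init,
    bias_initializer="zeros",
)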
