Is there any way to use a Rectified Linear Unit (ReLU) as the activation function of the hidden layer, instead of tanh() or sigmoid(), in Theano? The hidden layer is implemented as follows; as far as I can tell from searching online, ReLU is not implemented in Theano.
    import theano.tensor as T

    class HiddenLayer(object):
        def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
            pass
UPDATE: The latest version of Theano has native support for ReLU: T.nnet.relu, which should be preferred over custom solutions.
I decided to compare the speed of the solutions, since speed is very important for NNs. I timed both the function itself and its gradient. For the function itself, switch is preferred, while the gradient is faster for x * (x > 0). All the computed gradients are correct.
Finally, let's compare with how the gradient should be computed (the fastest way).
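As a rough illustration of what "how the gradient should be computed" can mean, here is a minimal sketch that uses NumPy as a stand-in for Theano (an assumption; this is not the original timing code). The derivative of ReLU is simply 1 where x > 0 and 0 elsewhere, so it can be written directly as a comparison. The helper names relu_grad_direct and relu_grad_numeric are hypothetical, and the numeric version exists only to check the direct one.

```python
import numpy as np

def relu_grad_direct(x):
    # Derivative of ReLU written directly: 1 where x > 0, else 0
    # (this sketch takes the derivative at exactly x == 0 to be 0).
    return (x > 0).astype(x.dtype)

def relu_grad_numeric(x, eps=1e-6):
    # Central finite difference of max(x, 0), used only as a sanity check.
    relu = lambda v: np.maximum(v, 0)
    return (relu(x + eps) - relu(x - eps)) / (2 * eps)

# Sample points away from 0, where ReLU is differentiable.
x = np.array([-2.0, -0.5, 0.5, 2.0])
grad_direct = relu_grad_direct(x)
grad_numeric = relu_grad_numeric(x)
```

In Theano the analogous direct expression would be a comparison such as x > 0 cast to the input dtype, which is what an ideal compiled gradient would reduce to.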
So Theano generates suboptimal code for the gradient. IMHO, the switch version should be preferred today.
The function is very simple in Python:
I think it is more precise to write it this way:
ReLU is easy to do in Theano:
To use it in your case, write a Python function that implements ReLU and pass it as activation:
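A minimal sketch of that idea, using NumPy as a stand-in for Theano so it runs on its own (an assumption). HiddenLayerSketch, its weight initialisation and the relu helper are hypothetical illustrations, not the question's actual HiddenLayer. With Theano tensors the same pattern would pass a function built from T.maximum(x, 0) or T.switch(x < 0, 0, x).

```python
import numpy as np

def relu(x):
    # Elementwise max(x, 0); the Theano analogue would be T.maximum(x, 0).
    return np.maximum(x, 0)

class HiddenLayerSketch(object):
    """Hypothetical NumPy stand-in for the question's HiddenLayer."""

    def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=np.tanh):
        if W is None:
            # Illustrative uniform initialisation; the real layer may differ.
            bound = np.sqrt(6.0 / (n_in + n_out))
            W = rng.uniform(-bound, bound, size=(n_in, n_out))
        if b is None:
            b = np.zeros(n_out)
        self.W, self.b = W, b
        lin_output = np.dot(input, W) + b
        # The activation is just a callable applied to the linear output.
        self.output = lin_output if activation is None else activation(lin_output)

rng = np.random.RandomState(0)
x = rng.randn(4, 3)
layer = HiddenLayerSketch(rng, x, n_in=3, n_out=5, activation=relu)
```

Because the layer only calls activation(lin_output), any function that maps a tensor to a tensor of the same shape can be passed in without changing the class.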
Some people use this implementation:
x * (x > 0)
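To see why this expression works, here is a small NumPy sketch (NumPy stands in for Theano here, which is an assumption). The comparison yields a 0/1 mask, so multiplying by it zeroes out the negative entries. The maximum and where forms are shown for comparison and mirror T.maximum and T.switch.

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

relu_mul = x * (x > 0)               # mask-and-multiply form quoted above
relu_max = np.maximum(x, 0)          # analogous to T.maximum(x, 0)
relu_switch = np.where(x > 0, x, 0)  # analogous to T.switch(x > 0, x, 0)
```

All three give the same values on this input. The multiply form can produce -0.0 for negative entries, which still compares equal to 0.0.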
UPDATE: Newer Theano versions have theano.tensor.nnet.relu(x) available.
I wrote it like this:
or: