Theano HiddenLayer Activation Function

Published 2019-04-20 04:47

Is there any way to use the Rectified Linear Unit (ReLU) as the activation function of the hidden layer, instead of tanh() or sigmoid(), in Theano? The implementation of the hidden layer is shown below, and as far as I can tell from searching the internet, ReLU is not implemented inside Theano.

import theano.tensor as T

class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
        # ... parameter initialization and output computation omitted ...
        pass
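
For context, this class appears to come from the Theano deep learning tutorial, where the activation argument is simply a callable applied to the affine output, so any element-wise symbolic function can be plugged in. A minimal sketch of that pattern (the initialization here is illustrative, not the tutorial's exact code):

import numpy
import theano
import theano.tensor as T

def hidden_layer_output(rng, input, n_in, n_out, activation=T.tanh):
    # Illustrative sketch: W drawn from a small uniform range, b initialized
    # to zeros; the activation is just called on the affine transform.
    W_values = numpy.asarray(rng.uniform(low=-0.1, high=0.1, size=(n_in, n_out)),
                             dtype=theano.config.floatX)
    W = theano.shared(W_values, name='W')
    b = theano.shared(numpy.zeros(n_out, dtype=theano.config.floatX), name='b')
    lin_output = T.dot(input, W) + b
    return lin_output if activation is None else activation(lin_output)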

5 answers
三岁会撩人
#2 · 2019-04-20 05:21

UPDATE: The latest version of Theano has native support for ReLU: T.nnet.relu, which should be preferred over custom solutions.
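
For example, a quick check that the built-in version behaves as expected and can be passed straight into the question's layer (a minimal sketch):

import numpy
import theano
import theano.tensor as T

x = T.matrix('x')
f = theano.function([x], T.nnet.relu(x))

z = numpy.array([[-1.0, 0.5], [2.0, -3.0]], dtype=theano.config.floatX)
print(f(z))                                  # negative entries become 0
# and for the layer in the question: HiddenLayer(..., activation=T.nnet.relu)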

I decided to compare the speed of the solutions, since performance is very important for NNs. I compared the speed of the function itself and of its gradient: for the forward pass, the switch version is preferred, while the gradient is faster for x * (x > 0). All the computed gradients are correct.

import numpy
import theano
import theano.tensor as T

def relu1(x):
    return T.switch(x < 0, 0, x)

def relu2(x):
    return T.maximum(x, 0)

def relu3(x):
    return x * (x > 0)


z = numpy.random.normal(size=[1000, 1000])
for f in [relu1, relu2, relu3]:
    x = theano.tensor.matrix()
    fun = theano.function([x], f(x))
    %timeit fun(z)
    assert numpy.all(fun(z) == numpy.where(z > 0, z, 0))

Output: (time to compute ReLU function)
>100 loops, best of 3: 3.09 ms per loop
>100 loops, best of 3: 8.47 ms per loop
>100 loops, best of 3: 7.87 ms per loop

for f in [relu1, relu2, relu3]:
    x = theano.tensor.matrix()
    fun = theano.function([x], theano.grad(T.sum(f(x)), x))
    %timeit fun(z)
    assert numpy.all(fun(z) == (z > 0))

Output: (time to compute the gradient)
>100 loops, best of 3: 8.3 ms per loop
>100 loops, best of 3: 7.46 ms per loop
>100 loops, best of 3: 5.74 ms per loop

Finally, let's compare with how the gradient should ideally be computed (the fastest possible way):

x = theano.tensor.matrix()
fun = theano.function([x], x > 0)
%timeit fun(z)
Output:
>100 loops, best of 3: 2.77 ms per loop

So Theano generates suboptimal code for the gradient. IMHO, the switch version should be preferred today.
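
If you want to see what Theano actually compiles for each gradient, the optimized graph can be inspected with theano.printing.debugprint (a quick sketch reusing the relu1/relu2/relu3 definitions above; the exact output depends on the Theano version and optimization flags):

x = theano.tensor.matrix()
for f in [relu1, relu2, relu3]:
    fun = theano.function([x], theano.grad(T.sum(f(x)), x))
    theano.printing.debugprint(fun)   # prints the ops in the optimized gradient graph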

【Aperson】
#3 · 2019-04-20 05:24

The function is very simple to write:

import theano.tensor as T

def relu(input):
    # element-wise maximum with 0 (Python's built-in max() does not work on tensors)
    output = T.maximum(input, 0)
    return output

ゆ 、 Hurt°
#4 · 2019-04-20 05:25

I think it is more precise to write it this way:

x * (x > 0.) + 0. * (x < 0.)
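
Wrapped as an activation function and checked against the maximum-based variant, it gives the same result (a small sketch for illustration, not part of the original answer):

import numpy
import theano
import theano.tensor as T

def relu_precise(x):
    # keep x where x > 0, explicitly contribute 0 where x < 0
    return x * (x > 0.) + 0. * (x < 0.)

x = T.matrix('x')
f = theano.function([x], [relu_precise(x), T.maximum(x, 0)])
z = numpy.random.randn(4, 4).astype(theano.config.floatX)
a, b = f(z)
assert numpy.allclose(a, b)
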
▲ chillily
#5 · 2019-04-20 05:35

ReLU is easy to do in Theano:

switch(x<0, 0, x)

To use it in your case, make a Python function that implements ReLU and pass it as the activation:

def relu(x):
    return theano.tensor.switch(x < 0, 0, x)

HiddenLayer(..., activation=relu)

Some people use this implementation: x * (x > 0)

UPDATE: Newer Theano versions have theano.tensor.nnet.relu(x) available.

再贱就再见
#6 · 2019-04-20 05:40

I wrote it like this:

lambda x: T.maximum(0,x)

or:

lambda x: x * (x > 0)
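
Either lambda can then be handed to the HiddenLayer from the question as its activation; a quick standalone check (illustrative):

import numpy
import theano
import theano.tensor as T

relu = lambda x: T.maximum(0, x)

x = T.matrix('x')
f = theano.function([x], relu(x))
print(f(numpy.array([[-2.0, 3.0]], dtype=theano.config.floatX)))   # -> [[0. 3.]]
# layer = HiddenLayer(rng, x, n_in, n_out, activation=relu)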