I am new to MXNet (I am using it in Python3).
Their tutorial series encourages you to define your own gluon blocks.
So let's say this is your block (a common convolution structure):
import mxnet as mx

class CNN1D(mx.gluon.Block):
    def __init__(self, **kwargs):
        super(CNN1D, self).__init__(**kwargs)
        with self.name_scope():
            self.cnn = mx.gluon.nn.Conv1D(10, 1)
            self.bn = mx.gluon.nn.BatchNorm()
            self.ramp = mx.gluon.nn.Activation(activation='relu')

    def forward(self, x):
        x = mx.nd.relu(self.cnn(x))
        x = mx.nd.relu(self.bn(x))
        x = mx.nd.relu(self.ramp(x))
        return x
This mirrors the structure of their example.
What is the difference between mx.nd.relu and mx.gluon.nn.Activation?
Should it be x = self.ramp(x) instead of x = mx.nd.relu(self.ramp(x))?
mx.gluon.nn.Activation wraps around mx.ndarray.Activation; see the Gluon source code.
However, when using Gluon to build a neural net, it is recommended that you use the Gluon API and not branch off into the lower-level MXNet API arbitrarily - that may cause issues as Gluon evolves and potentially changes (e.g. it could stop using mx.nd under the hood).
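As a quick sanity check (a minimal sketch, assuming a recent MXNet 1.x install), both paths give the same result on a toy array; the Activation block holds no parameters, so it can be called directly:

import mxnet as mx

x = mx.nd.array([-2.0, -0.5, 0.0, 1.5])

ramp = mx.gluon.nn.Activation(activation='relu')  # parameter-free block
print(ramp(x))        # [0.  0.  0.  1.5]
print(mx.nd.relu(x))  # [0.  0.  0.  1.5]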
It appears that mx.gluon.nn.Activation is a wrapper for calling a host of the underlying activations from the NDArray module. Thus - in principle - it does not matter whether the forward definition uses self.ramp(x), mx.nd.relu(x), or even mx.nd.relu(self.ramp(x)): relu simply takes the maximum of 0 and the passed value, so applying it multiple times yields the same result as a single call, aside from a slight increase in runtime.
So in this case it does not really matter. Of course, with other activation functions stacking multiple calls might have an impact.
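A small sketch of that point (toy values, printed numbers are approximate): relu is idempotent, while an activation such as sigmoid is not, so stacking it would change the output.

import mxnet as mx

x = mx.nd.array([-3.0, 0.0, 2.0])

# relu: a second application changes nothing
print(mx.nd.relu(x))                # [0. 0. 2.]
print(mx.nd.relu(mx.nd.relu(x)))    # [0. 0. 2.]

# sigmoid: stacking it shifts the values
print(mx.nd.sigmoid(x))                  # ~[0.047 0.5   0.881]
print(mx.nd.sigmoid(mx.nd.sigmoid(x)))   # ~[0.512 0.622 0.707]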
In MXNet's documentation they use nd.relu in the forward definition when defining gluon.Blocks. This might carry slightly less overhead than using mx.gluon.nn.Activation(activation='relu').
Flavor-wise, the gluon module is meant to be the high-level abstraction. Therefore I am of the opinion that when defining a block one should use ramp = mx.gluon.nn.Activation(activation=<act>) instead of nd.<act>(x), and then call self.ramp(x) in the forward definition. However, given that at this point all custom Block tutorials / documentation stick to the relu activation, whether or not this will have lasting consequences remains to be seen.

All together, the use of mx.gluon.nn.Activation seems to be a way to call activation functions from the NDArray module from within the Gluon module.
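Following that opinion, a minimal rewrite of the block from the question could look like the sketch below (assuming the intended structure is conv -> batchnorm -> relu, with the activation applied exactly once via the stored Gluon block):

import mxnet as mx

class CNN1D(mx.gluon.Block):
    def __init__(self, **kwargs):
        super(CNN1D, self).__init__(**kwargs)
        with self.name_scope():
            self.cnn = mx.gluon.nn.Conv1D(10, 1)
            self.bn = mx.gluon.nn.BatchNorm()
            self.ramp = mx.gluon.nn.Activation(activation='relu')

    def forward(self, x):
        x = self.cnn(x)
        x = self.bn(x)
        x = self.ramp(x)  # single relu via the Gluon Activation block
        return x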