Can we use different activation functions for the hidden layer and the output layer of a neural network? Is there any explicit advantage to using such a scheme?
For the last layer of the network, the choice of activation also depends on the task (e.g. sigmoid for binary classification, softmax for multi-class classification, linear for regression).
For the intermediate layers, ReLU is what most people use nowadays because it is faster to compute and its gradient does not vanish as early during back-propagation.
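A minimal sketch of this combination, assuming the Keras Sequential API (the question itself is framework-agnostic) and made-up layer sizes with a hypothetical 20-dimensional input:

```python
# Sketch only: Keras is an assumption, and the layer sizes / input
# dimension are made up for illustration.
from tensorflow import keras
from tensorflow.keras import layers

# Binary classification: ReLU in the hidden layers, sigmoid at the output.
clf = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # output squashed into (0, 1)
])
clf.compile(optimizer="adam", loss="binary_crossentropy")
```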
In short: yes, you can. It is a common approach to use a sigmoid (or another nonlinear) function as the hidden-layer activation so the network learns nonlinear features, while the output activation is chosen for the particular task (depending on what you are trying to model and which cost function you use).
If you are implementing a regression (prediction) task instead of classification, you may use a linear output layer, since the sigmoid function restricts the output range to (0, 1), which is mainly useful for threshold-based classification problems.
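A sketch of that regression case under the same assumptions (Keras, hypothetical layer sizes): a nonlinear hidden layer paired with a linear output layer, so predictions are not squashed into (0, 1):

```python
# Sketch only: Keras is an assumption, architecture is hypothetical.
from tensorflow import keras
from tensorflow.keras import layers

# Regression: nonlinear hidden layer, linear (identity) output layer.
reg = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="sigmoid"),  # nonlinear hidden activation
    layers.Dense(1, activation="linear"),    # unbounded output for regression
])
reg.compile(optimizer="adam", loss="mse")
```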