Can we use different activation functions for the hidden layer and the output layer of a neural network? Is there any explicit advantage to using such a scheme?
For the last layer of the network, the choice of activation also depends on the task (e.g. sigmoid for binary classification, softmax for multi-class classification, linear for regression).
For the intermediate layers, ReLU is what most people use nowadays because it is faster to compute and its gradient does not vanish as early during back-propagation.
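A minimal sketch of this combination, assuming the Keras Sequential API (the question itself is framework-agnostic) and made-up layer sizes with a hypothetical 20-dimensional input:

```python
# Sketch only: Keras is an assumption, and the layer sizes / input
# dimension are made up for illustration.
from tensorflow import keras
from tensorflow.keras import layers

# Binary classification: ReLU in the hidden layers, sigmoid at the output.
clf = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # output squashed into (0, 1)
])
clf.compile(optimizer="adam", loss="binary_crossentropy")
```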
In short: yes, you can. It is a common approach to use a sigmoid (or another nonlinear) function as the hidden-layer activation so the network learns nonlinear features, while the output activation is chosen for the particular task (depending on what you are trying to model and which cost function you use).
If you are implementing a regression (prediction) task instead of classification, you may use a linear output layer, since the sigmoid function restricts the output range to (0, 1), which is mainly useful for threshold-based classification problems.
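A sketch of that regression case under the same assumptions (Keras, hypothetical layer sizes): a nonlinear hidden layer paired with a linear output layer, so predictions are not squashed into (0, 1):

```python
# Sketch only: Keras is an assumption, architecture is hypothetical.
from tensorflow import keras
from tensorflow.keras import layers

# Regression: nonlinear hidden layer, linear (identity) output layer.
reg = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="sigmoid"),  # nonlinear hidden activation
    layers.Dense(1, activation="linear"),    # unbounded output for regression
])
reg.compile(optimizer="adam", loss="mse")
```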