I am trying to implement a simple binary classifier using an LSTM RNN, and I still cannot figure out the correct loss function for the network. The issue is that when I use binary cross-entropy as the loss function, the training and test loss values are relatively high compared to using mean squared error (MSE).
From my research, the usual justification is that binary cross-entropy should be used for classification problems and MSE for regression problems. However, in my case, I am getting better accuracy and lower loss values with MSE for binary classification.
I am not sure how to justify these results. I am completely new to AI and ML techniques.
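To illustrate the kind of comparison I am making, here is a minimal sketch that computes both losses on the same predictions. The labels and probabilities are made up for illustration (they are not my model's outputs), and the helper names are my own; it only shows that the two loss values live on different scales even when the predictions are identical.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    # Clip to avoid log(0); eps is an arbitrary small constant chosen here.
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def mean_squared_error(y_true, p_pred):
    # Squared difference between the label and the predicted probability.
    return float(np.mean((y_true - p_pred) ** 2))

# Hypothetical labels and predicted probabilities of class 1.
y_true = np.array([1.0, 0.0, 1.0, 0.0])
p_pred = np.array([0.8, 0.3, 0.6, 0.1])

bce = binary_cross_entropy(y_true, p_pred)
mse = mean_squared_error(y_true, p_pred)
print(f"binary cross-entropy: {bce:.4f}")
print(f"mean squared error:   {mse:.4f}")
```

On these made-up numbers every prediction is on the correct side of 0.5, so the accuracy is the same under either loss, yet the binary cross-entropy value is larger than the MSE value.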
I would like to share my understanding of MSE and binary cross-entropy.
In classification, we take the argmax() of the predicted probabilities for each training instance.
Now consider a binary classifier where the model predicts the probabilities (.49, .51). In this case the model returns "1" as its prediction.
Assume the actual label is also "1".
In that case, MSE computed on the predicted label ("1") against the actual label ("1") gives a loss of 0, whereas binary cross-entropy, which is computed on the probability .51, gives a nonzero value.
And if the trained model predicts similar probabilities for every data sample, binary cross-entropy accumulates into a large total loss, whereas MSE stays at 0.
According to MSE, it is a perfect model, but in reality it is not that good a model, and that is why we should not use MSE for classification.
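The example above can be checked numerically with a small sketch. The variable names are my own, and the key assumption in my reasoning is that MSE sees the thresholded label rather than the raw probability, so the sketch computes MSE both ways to make that assumption visible.

```python
import math

p_class1 = 0.51  # predicted probability of class 1, from (.49, .51)
y_true = 1       # actual label

# Binary cross-entropy for a positive example: -log(p).
bce = -math.log(p_class1)

# MSE computed directly on the predicted probability.
mse_on_prob = (y_true - p_class1) ** 2

# MSE computed on the argmax / thresholded label (the assumption in my reasoning).
y_pred_label = 1 if p_class1 >= 0.5 else 0
mse_on_label = (y_true - y_pred_label) ** 2

print(f"binary cross-entropy: {bce:.4f}")
print(f"MSE on probability:   {mse_on_prob:.4f}")
print(f"MSE on argmax label:  {mse_on_label}")
```

MSE is exactly 0 only when it is computed on the argmax label. Computed on the probability .51 it is small but not zero, and still lower than the binary cross-entropy value for the same prediction.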