I am trying to build a 11 class image classifier with 13000 training images and 3000 validation images. I am using deep neural network which is being trained using mxnet. Training accuracy is increasing and reached above 80% but validation accuracy is coming in range of 54-57% and its not increasing. What can be the issue here? Should I increase the no of images?
相关问题
- How to use Reshape keras layer with two None dimen
- Trying to understand Pytorch's implementation
- Convolutional Neural Network seems to be randomly
- How to convert Onnx model (.onnx) to Tensorflow (.
- no module named cntk
相关文章
- How to downgrade to cuda 10.0 in arch linux?
- How to measure overfitting when train and validati
- TensorFlow Eager Mode: How to restore a model from
- loss, val_loss, acc and val_acc do not update at a
- In Tensorflow, how to assign values in Tensor acco
- Why does different batch-sizes give different accu
- Augmenting only the training set in K-folds cross
- How to convert deep learning gradient descent equa
This clearly looks like a case where the model is overfitting the Training set, as the validation accuracy was improving step by step till it got fixed at a particular value. If the learning rate was a bit more high, you would have ended up seeing validation accuracy decreasing, with increasing accuracy for training set.
Increasing the number of training set is the best solution to this problem. You could also try applying different transformations (flipping, cropping random portions from a slightly bigger image)to the existing image set and see if the model is learning better.
The issue here is that your network stop learning useful general features at some point and start adapting to peculiarities of your training set (overfitting it in result). You want to 'force' your network to keep learning useful features and you have few options here:
Unfortunately the process of training network that generalizes well involves a lot of experimentation and almost brute force exploration of parameter space with a bit of human supervision (you'll see many research works employing this approach). It's good to try 3-5 values for each parameter and see if it leads you somewhere.
When you experiment plot accuracy / cost / f1 as a function of number of iterations and see how it behaves. Often you'll notice a peak in accuracy for your test set, and after that a continuous drop. So apart from good architecture, regularization, corruption etc. you're also looking for a good number of iterations that yields best results.
One more hint: make sure each training epochs randomize the order of images.