I have several questions about the Keras example `pretrained_word_embeddings` to deepen my understanding of how it works.
1. Is it reasonable to use a `Dropout` layer in such a model?
2. Does the last `MaxPooling1D` layer have to cover the full output length every time? In the original model, the last conv layer's output length is 35, and the max pool is set to the same value, 35 (see the sketch below).
3. Am I right in saying that increasing the value 128 (the number of kernels) will increase accuracy?
4. Does it make sense to add more conv layers to increase accuracy, even if it slows down the training phase?
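
For reference, here is how I understand the relevant architecture (a rough paraphrase, not the exact example code; the GloVe embedding setup is omitted for brevity):

```python
from tensorflow.keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Flatten, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(1000,), dtype='int32')
x = Embedding(20000, 100)(inputs)          # pretrained GloVe weights in the real example
x = Conv1D(128, 5, activation='relu')(x)   # -> length 996
x = MaxPooling1D(5)(x)                     # -> length 199
x = Conv1D(128, 5, activation='relu')(x)   # -> length 195
x = MaxPooling1D(5)(x)                     # -> length 39
x = Conv1D(128, 5, activation='relu')(x)   # -> length 35
x = MaxPooling1D(35)(x)                    # -> length 1: covers the whole remaining output
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
outputs = Dense(20, activation='softmax')(x)
model = Model(inputs, outputs)
```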
Thank you!
So basically there is one simple answer to all of your questions: you need to test it.
- Adding `Dropout` is usually a good thing. It introduces a reasonable amount of randomization and regularization. The downside is that you need to set the right value for its rate parameter, which can sometimes take a while. See the first sketch after this list.
- In my opinion, the pool size of the last `MaxPooling1D` layer was set this way in order to reduce the dimensionality of the next layer's input. One could check whether something like half the previous layer's output length (in the presented case, e.g. 17, which leaves an output of length 2 and thus only doubles the size of the input to the next layer) introduces any improvement; see the second sketch below.
- It's hard to say: if you have, e.g., a small amount of data with a really rigid structure, too many parameters might seriously harm your training. The best way is to test different parameter values in either a grid search or random search paradigm; it's believed that random search does a better job :) See the third sketch below.
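
A minimal sketch of where one might insert `Dropout` in the classifier head of the model sketched in the question (the 0.5 rate is a common default, not a value from the example, and purely a starting point for tuning):

```python
from tensorflow.keras.layers import Input, Flatten, Dropout, Dense

# Toy stand-in for the last pooled conv output of the model above (length 1, 128 channels)
pooled = Input(shape=(1, 128))
x = Flatten()(pooled)
x = Dropout(0.5)(x)                        # hypothetical rate; tune on held-out data
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
outputs = Dense(20, activation='softmax')(x)
```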
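To illustrate the pooling point: with the last conv output of length 35, a pool size of 17 leaves a length-2 output, so the flattened input to the dense layer doubles from 128 to 256 features (shapes taken from the model in the question):

```python
from tensorflow.keras.layers import Input, Conv1D, MaxPooling1D, Flatten

# Toy stand-in for the output of the second pooling block (length 39, 128 channels)
inputs = Input(shape=(39, 128))
x = Conv1D(128, 5, activation='relu')(inputs)  # -> length 35
x = MaxPooling1D(17)(x)                        # -> length 2 instead of 1
x = Flatten()(x)                               # -> 256 features instead of 128
```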
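And a sketch of a plain random search over a few hyper-parameters, assuming `x_train`, `y_train`, `x_val`, `y_val` are prepared as in the example; the parameter ranges and the trial budget are arbitrary choices, not recommendations:

```python
import random
from tensorflow.keras.layers import (Input, Embedding, Conv1D, MaxPooling1D,
                                     GlobalMaxPooling1D, Dropout, Dense)
from tensorflow.keras.models import Model

def build_model(num_filters, dropout_rate, dense_units):
    """Variant of the example model with a few tunable hyper-parameters."""
    inputs = Input(shape=(1000,), dtype='int32')
    x = Embedding(20000, 100)(inputs)
    for _ in range(2):
        x = Conv1D(num_filters, 5, activation='relu')(x)
        x = MaxPooling1D(5)(x)
    x = Conv1D(num_filters, 5, activation='relu')(x)
    x = GlobalMaxPooling1D()(x)            # pools over the whole remaining length
    x = Dropout(dropout_rate)(x)
    x = Dense(dense_units, activation='relu')(x)
    outputs = Dense(20, activation='softmax')(x)
    model = Model(inputs, outputs)
    model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

best_acc, best_params = 0.0, None
for _ in range(20):                        # 20 random trials; pick any budget you can afford
    params = {
        'num_filters': random.choice([64, 128, 256]),
        'dropout_rate': random.uniform(0.0, 0.5),
        'dense_units': random.choice([64, 128, 256]),
    }
    model = build_model(**params)
    model.fit(x_train, y_train, epochs=5, batch_size=128, verbose=0)
    _, acc = model.evaluate(x_val, y_val, verbose=0)
    if acc > best_acc:
        best_acc, best_params = acc, params

print(best_params, best_acc)
```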