I am very confused by these two parameters of the Conv1D layer in Keras: https://keras.io/layers/convolutional/#conv1d
The documentation says:
filters: Integer, the dimensionality of the output space (i.e. the number output of filters in the convolution).
kernel_size: An integer or tuple/list of a single integer, specifying the length of the 1D convolution window.
But that does not seem to match the standard terminology I see in many tutorials such as https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/ and https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/
Going by the second tutorial, which uses Keras, I'd imagine that 'kernel_size' actually corresponds to the conventional 'filter' concept, which defines the sliding window over the input feature space. But what about the 'filters' parameter in Conv1D? What does it do?
For example, in the following code snippet:
model.add(embedding_layer)
model.add(Dropout(0.2))
model.add(Conv1D(filters=100, kernel_size=4, padding='same', activation='relu'))
Suppose the embedding layer outputs a matrix of dimension 50 (rows, each row being a word in a sentence) x 300 (columns, the word vector dimension). How does the Conv1D layer transform that matrix?
Many thanks
You're right to say that kernel_size defines the size of the sliding window.

The filters parameter is just how many different windows you will have (all of them with the same length, which is kernel_size). It is how many different results, or channels, you want to produce.

When you use filters=100 and kernel_size=4, you are creating 100 different filters, each of them with length 4. The result will be 100 different convolutions.

Also, each filter has enough parameters to consider all input channels.
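
To make the last point concrete, here is a minimal pure-Python sketch of a 1D convolution. It is not how Keras implements the layer; the function name conv1d_same, the toy sizes, and the all-ones weights are assumptions chosen for illustration, and the activation is left out. It only shows that every filter reads all input channels inside its window and produces exactly one output channel.

```python
# Minimal pure-Python sketch of a 1D convolution (no Keras involved).
# Names and sizes are illustrative assumptions, not Keras internals.

def conv1d_same(inputs, kernels, biases):
    """inputs:  [length][channels]
    kernels: [kernel_size][channels][filters]
    biases:  [filters]
    returns: [length][filters]  (stride 1, zero padding, no activation)"""
    length = len(inputs)
    channels = len(inputs[0])
    kernel_size = len(kernels)
    filters = len(biases)

    # Zero-pad so the output keeps the input length ('same' padding).
    # Assumption: with an even kernel_size the extra zero row goes on the right.
    pad_total = kernel_size - 1
    pad_left = pad_total // 2
    zero_row = [0.0] * channels
    padded = [zero_row] * pad_left + inputs + [zero_row] * (pad_total - pad_left)

    output = []
    for start in range(length):            # slide along the length dimension
        row = []
        for f in range(filters):           # one output channel per filter
            total = biases[f]
            for k in range(kernel_size):   # positions inside the window
                for c in range(channels):  # every input channel contributes
                    total += padded[start + k][c] * kernels[k][c][f]
            row.append(total)
        output.append(row)
    return output


# Toy sizes: 6 "words", 3 embedding dimensions, 2 filters of length 4.
length, channels, kernel_size, filters = 6, 3, 4, 2
inputs = [[1.0] * channels for _ in range(length)]
kernels = [[[1.0] * filters for _ in range(channels)] for _ in range(kernel_size)]
biases = [0.0] * filters

result = conv1d_same(inputs, kernels, biases)
print(len(result), len(result[0]))  # output length equals input length; one column per filter
```

Each filter here holds kernel_size x channels weights, which is why the number of filters only changes the number of output columns, not the window length.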
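The Conv1D layer expects input with these dimensions: (batch_size, length, channels)

I suppose the best way to use it is to have the number of words in the length dimension (as if the words in order formed a sentence), and the channels be the output dimension of the embedding (the numbers that define one word).

So:
batch_size = number of sentences
length = number of words in each sentence (50)
channels = dimension of the embedding's output (300)

The convolutional layer will pass 100 different filters. Each filter slides along the length dimension (word by word, in groups of 4), considering all the channels that define each word.

The outputs are shaped as: (number of sentences, 50 words, 100 filters)

The filters are shaped as: (4 = kernel length, 300 = word vector dimension, 100 = number of filters)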
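
For the example in the question (50 words, 300-dimensional embeddings, filters=100, kernel_size=4, padding='same'), the shapes can be checked with a little arithmetic. The sketch below is plain Python, not a Keras call; conv1d_shapes is a hypothetical helper, the batch size of 32 is an arbitrary assumption, and it assumes stride 1 and the layer's default bias term.

```python
def conv1d_shapes(batch_size, length, channels, filters, kernel_size):
    """Hypothetical helper: shapes for a Conv1D layer with padding='same',
    stride 1, channels-last input, and one bias per filter."""
    output_shape = (batch_size, length, filters)     # length is preserved by 'same' padding
    kernel_shape = (kernel_size, channels, filters)  # each filter spans all input channels
    bias_shape = (filters,)
    n_params = kernel_size * channels * filters + filters
    return output_shape, kernel_shape, bias_shape, n_params


# Assumed batch of 32 sentences; the other numbers come from the question.
output_shape, kernel_shape, bias_shape, n_params = conv1d_shapes(
    batch_size=32, length=50, channels=300, filters=100, kernel_size=4
)
print(output_shape)  # (32, 50, 100)
print(kernel_shape)  # (4, 300, 100)
print(n_params)      # 120100
```

So the 50 x 300 matrix for each sentence becomes a 50 x 100 matrix: the length is unchanged because of padding='same', and the 300 embedding dimensions are replaced by 100 filter responses.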