Keras dimensionality in convolutional layer mismatch

Posted 2019-07-11 16:28

Question:

I'm playing around with Keras to build my first neural network. I have zero experience and I can't figure out why my dimensionality isn't right. I can't tell from the docs what this error is complaining about, or even which layer is causing it.

My model takes a 32-byte array of numbers as input and is supposed to output a boolean value. I want a 1D convolution on the input byte array.

arr1 holds the 32-byte arrays, arr2 is an array of booleans.

# Assumed imports (not shown in the original post)
import numpy as np
import keras as k

inputData = np.array(arr1)
inputData = np.expand_dims(inputData, axis = 2)

labelData = np.array(arr2)

print inputData.shape
print labelData.shape

model = k.models.Sequential()
model.add(k.layers.convolutional.Convolution1D(32,2, input_shape = (32, 1)))
model.add(k.layers.Activation('relu'))

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

model.add(k.layers.core.Dense(32))
model.add(k.layers.Activation('sigmoid'))

model.compile(loss = 'binary_crossentropy',
              optimizer = 'rmsprop',
              metrics=['accuracy'])
model.fit(
    inputData,labelData
)

The printed shapes are (1000, 32, 1) and (1000,).

The error I receive is:

Traceback (most recent call last):
  File "cnn/init.py", line 50, in
    inputData,labelData
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/models.py", line 863, in fit
    initial_epoch=initial_epoch)
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/engine/training.py", line 1358, in fit
    batch_size=batch_size)
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/engine/training.py", line 1238, in _standardize_user_data
    exception_prefix='target')
  File "/home/steve/Documents/cnn/env/local/lib/python2.7/site-packages/keras/engine/training.py", line 128, in _standardize_input_data
    str(array.shape))
ValueError: Error when checking target: expected activation_5 to have 3 dimensions, but got array with shape (1000, 1)

Answer 1:

Well, it seems to me that you need to google a bit more about convolutional networks :-)

At each step you are applying 32 filters of length 2 over your sequence. So let's follow the dimensions of the tensors after each layer:

Dimensions : (None, 32, 1)

model.add(k.layers.convolutional.Convolution1D(32,2, input_shape = (32, 1)))
model.add(k.layers.Activation('relu'))

Dimensions : (None, 31, 32) (your filter of length 2 slides over the whole sequence, so the sequence is now of length 31, with one channel per filter)

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

Dimensions : (None, 30, 32) (you again lose one value because of your filters of length 2, but you still have 32 channels, one per filter)

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

Dimensions : (None, 29, 32) (same...)

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

Dimensions : (None, 28, 32)

Now you want to use a Dense layer on top of that. The thing is that the Dense layer works as follows on your 3D input:

model.add(k.layers.core.Dense(32))
model.add(k.layers.Activation('sigmoid'))

Dimensions : (None, 28, 32)

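This is your output. It is 3D, while your targets have shape (None, 1), which is exactly what the error message complains about. The first thing I find odd is that you ask for 32 outputs from your dense layer; for a single boolean target you should have put 1 instead of 32. But even that will not fix your problem. See what happens if we change the last layer: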

model.add(k.layers.core.Dense(1))
model.add(k.layers.Activation('sigmoid'))

Dimensions : (None, 28, 1)

This happens because you apply a dense layer to a tensor that is 2D per sample. When you apply a Dense(1) layer to an input of shape [28, 32], it creates a weight matrix of shape (32, 1) and applies it to each of the 28 vectors, so you end up with 28 outputs of size 1.
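To make this concrete, here is a minimal sketch in plain Python (no Keras) of what a dense layer does to one sample of shape (28, 32). The helper dense_last_axis and the random values are made up for illustration; it only mimics the rule that the weights act on the last axis.

```python
import random


def dense_last_axis(x, weights, bias):
    """Apply y = row . W + b to every row of x (hypothetical helper).

    x:       list of rows, shape (steps, in_features)
    weights: nested list, shape (in_features, units)
    bias:    list, shape (units,)
    """
    units = len(bias)
    out = []
    for row in x:
        out.append([
            sum(row[i] * weights[i][u] for i in range(len(row))) + bias[u]
            for u in range(units)
        ])
    return out


random.seed(0)
steps, features, units = 28, 32, 1

# One sample coming out of the last convolution: (28, 32)
x = [[random.random() for _ in range(features)] for _ in range(steps)]
# Weight matrix of the dense layer: (32, 1)
w = [[random.random() for _ in range(units)] for _ in range(features)]
b = [0.0] * units

y = dense_last_axis(x, w, b)
out_shape = (len(y), len(y[0]))
print(out_shape)  # (28, 1): one output per time step, not one per sample
```

The same (32, 1) matrix is reused for all 28 rows, which is why the time axis survives.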

What I propose to fix this is to change the last layers like this:

model = k.models.Sequential()
model.add(k.layers.convolutional.Convolution1D(32,2, input_shape = (32, 1)))
model.add(k.layers.Activation('relu'))

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

model.add(k.layers.convolutional.Convolution1D(32,2))
model.add(k.layers.Activation('relu'))

# Only use one filter so that the output will be a sequence of 28 values, not a matrix.
model.add(k.layers.convolutional.Convolution1D(1,2))
model.add(k.layers.Activation('relu'))

# Change the shape from (None, 28, 1) to (None, 28)
model.add(k.layers.core.Flatten())

# Only one neuron as output to get the binary target.
model.add(k.layers.core.Dense(1))
model.add(k.layers.Activation('sigmoid'))

Now the last three layers will take your tensor from

(None, 29, 32) -> (None, 28, 1) -> (None, 28) -> (None, 1)
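If you want to double-check that chain without running Keras, here is a rough sketch in plain Python that propagates per-sample shapes through the proposed layers. The helpers conv1d_valid, flatten and dense are names invented for this illustration; they only mimic the shape rules, assuming stride 1 and 'valid' padding.

```python
def conv1d_valid(shape, filters, kernel_size):
    """(steps, channels) -> (steps - kernel_size + 1, filters)."""
    steps, _channels = shape
    return (steps - kernel_size + 1, filters)


def flatten(shape):
    """Collapse all per-sample axes into a single one."""
    size = 1
    for dim in shape:
        size *= dim
    return (size,)


def dense(shape, units):
    """A dense layer only replaces the last axis."""
    return shape[:-1] + (units,)


shape = (32, 1)  # per-sample input shape; the batch axis (None) is left out
trace = [shape]

# Three convolutions with 32 filters, then one with a single filter
for filters in (32, 32, 32, 1):
    shape = conv1d_valid(shape, filters, 2)
    trace.append(shape)

shape = flatten(shape)
trace.append(shape)

shape = dense(shape, 1)
trace.append(shape)

for s in trace:
    print((None,) + s)
```

The last printed shape is (None, 1), which matches a target of one boolean per sample.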

I hope this helps you.

PS: if you were wondering what None is, it is the batch dimension. You don't feed the 1000 samples at once; you feed them batch by batch, and since the batch size depends on what you choose, by convention we write None.
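As a small illustration of why that axis is left open, here is a sketch of how 1000 samples split into batches, assuming a batch size of 32. The function batch_sizes is a hypothetical helper, not part of Keras.

```python
def batch_sizes(num_samples, batch_size=32):
    """Sizes of the consecutive batches seen during one pass over the data."""
    sizes = []
    for start in range(0, num_samples, batch_size):
        sizes.append(min(batch_size, num_samples - start))
    return sizes


sizes = batch_sizes(1000)
num_batches = len(sizes)
first, last = sizes[0], sizes[-1]

# 31 full batches of 32 samples plus a final batch of 8
print(num_batches)
print(first)
print(last)
```

The final batch is smaller than the others, so the model cannot hard-code a single number for the first axis.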

EDIT :

Explaining a bit more why the sequence length loses one value at each step.

Say you have a sequence of 4 values [x1 x2 x3 x4] and you want to convolve your filter of length 2, [f1 f2], over it. The first value is y1 = [f1 f2] * [x1 x2] (a dot product, i.e. f1*x1 + f2*x2), the second is y2 = [f1 f2] * [x2 x3], and the third is y3 = [f1 f2] * [x3 x4]. Then you have reached the end of your sequence and cannot go further. The result is a sequence [y1 y2 y3] of length 3.
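The same computation as a minimal plain-Python sketch, with arbitrary example numbers standing in for the x and f values (slide_filter is a name made up for this illustration, and the bias term a real layer adds is left out):

```python
def slide_filter(xs, fs):
    """Slide filter fs over xs without padding; each output is a dot product."""
    k = len(fs)
    return [
        sum(f * x for f, x in zip(fs, xs[i:i + k]))
        for i in range(len(xs) - k + 1)
    ]


xs = [1, 2, 3, 4]  # [x1 x2 x3 x4], arbitrary example values
fs = [10, 1]       # [f1 f2], arbitrary example weights

ys = slide_filter(xs, fs)
print(ys)  # [12, 23, 34]
```

Four inputs and a filter of length 2 leave only three positions where the filter fits, hence three outputs.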

This is due to the filter length and the effects at the borders of your sequence. There are several options; one is to pad the sequence with zeros so that the output has exactly the same length as the input. You can choose that option with the 'padding' parameter. The possible values for the padding argument are listed in the Keras documentation for the convolutional layers, which I encourage you to read: it also gives information about input and output shapes.

From the doc :

padding: One of "valid" or "same" (case-insensitive). "valid" means "no padding". "same" results in padding the input such that the output has the same length as the original input.

The default is 'valid', so your example uses no padding.
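To see what the two options mean for your stack of four convolutions, here is a small sketch of the output-length rule, assuming stride 1. The function names are invented for this example.

```python
def conv1d_output_length(input_length, kernel_size, padding="valid"):
    """Output length of a stride-1 1D convolution (hypothetical helper)."""
    padding = padding.lower()  # the option is case-insensitive
    if padding == "valid":
        return input_length - kernel_size + 1
    if padding == "same":
        return input_length
    raise ValueError("padding must be 'valid' or 'same'")


def trace_lengths(padding, input_length=32, kernel_size=2, num_layers=4):
    """Sequence length after each of num_layers stacked convolutions."""
    lengths = [input_length]
    for _ in range(num_layers):
        lengths.append(conv1d_output_length(lengths[-1], kernel_size, padding))
    return lengths


valid_lengths = trace_lengths("valid")
same_lengths = trace_lengths("same")

print(valid_lengths)  # [32, 31, 30, 29, 28]
print(same_lengths)   # [32, 32, 32, 32, 32]
```

With 'same' the Flatten layer in the proposed model would see 32 values instead of 28; the rest of the fix stays the same.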

I also recommend upgrading Keras to the latest version. Convolution1D is now called Conv1D, so you might otherwise find the docs and tutorials confusing.