Python keras neural network (Theano) package retur

Posted 2020-06-28 06:50

Question:

I have this code:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
from sklearn import datasets
import theano

iris = datasets.load_iris()
X = iris.data[:,0:3]  # we only take the first three features.
Y = iris.target

X = X.astype(theano.config.floatX)
Y = Y.astype(theano.config.floatX)


model = Sequential()
model.add(Dense(150, 1, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(150, 1, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(150, 1, init='uniform'))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)

model.fit(X, Y, nb_epoch=20, batch_size=150)


score = model.evaluate(X, Y, batch_size=16)

Returns this error:

ValueError: Shape mismatch: x has 3 cols (and 150 rows) but y has 150 rows (and 1 cols)
Apply node that caused the error: Dot22(<TensorType(float64, matrix)>, <TensorType(float64, matrix)>)
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(150L, 3L), (150L, 1L)]
Inputs strides: [(24L, 8L), (8L, 8L)]
Inputs values: ['not shown', 'not shown']

What is the problem?

Answer 1:

You specified the wrong output dimensions for your internal layers. See for instance this example from the Keras documentation:

model = Sequential()
model.add(Dense(20, 64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 2, init='uniform'))
model.add(Activation('softmax'))

Note how the output size of one layer matches the input size of the next one:

20x64 -> 64x64 -> 64x2
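To see why the sizes have to line up, here is a minimal NumPy sketch of the same chain of shapes. It is not Keras code: the batch of 5 samples, the random weights, and the use of tanh on every layer (instead of a final softmax) are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Weight shapes mirror the (input size, output size) pairs of the Dense layers.
layer_shapes = [(20, 64), (64, 64), (64, 2)]
weights = [rng.uniform(-0.05, 0.05, size=shape) for shape in layer_shapes]

# Hypothetical batch: 5 samples with 20 features each.
batch = rng.normal(size=(5, 20))

activations = batch
shapes_seen = [activations.shape]
for w in weights:
    # (n_samples, n_in) @ (n_in, n_out) -> (n_samples, n_out)
    activations = np.tanh(activations @ w)
    shapes_seen.append(activations.shape)

print(shapes_seen)  # [(5, 20), (5, 64), (5, 64), (5, 2)]
```

Each matrix product only works because the number of columns coming out of one layer equals the number of rows of the next weight matrix.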

The first number is always the input size (the number of neurons in the previous layer), and the second number is the output size (the number of neurons in the next layer). So in this example you have four layers:

  • an input layer with 20 neurons
  • a hidden layer with 64 neurons
  • a hidden layer with 64 neurons
  • an output layer with 2 neurons
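The same bookkeeping can be sketched as a small helper that turns a list of (input size, output size) pairs into neuron counts per layer and complains when the chain is broken. The function name `layer_sizes` is hypothetical and not part of Keras.

```python
def layer_sizes(dense_shapes):
    """Return the neuron count of each layer for a chain of (n_in, n_out) pairs.

    Raises ValueError when a layer's input size differs from the
    output size of the layer before it.
    """
    sizes = [dense_shapes[0][0]]
    for index, (n_in, n_out) in enumerate(dense_shapes):
        if n_in != sizes[-1]:
            raise ValueError(
                "layer %d expects %d inputs but receives %d"
                % (index, n_in, sizes[-1])
            )
        sizes.append(n_out)
    return sizes


print(layer_sizes([(20, 64), (64, 64), (64, 2)]))  # [20, 64, 64, 2]
```

Passing the shapes from the question, `[(150, 1), (150, 1), (150, 1)]`, raises a ValueError at the second layer, because the first layer emits 1 value while the second expects 150.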

The only hard restriction you have is that the first (input) layer needs to have as many neurons as you have features, and the last (output) layer needs to have as many neurons as you need for your task.

For your example, since you have three features, you need to change the input layer size to 3, and you can keep the two output neurons from this example to do binary classification (or use one, as you did, with logistic loss).
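The mismatch from the traceback can be reproduced without Keras or Theano. The sketch below uses zero-filled NumPy arrays with the shapes from the error message; the hidden size of 64 is borrowed from the documentation example and is an assumption, not a requirement.

```python
import numpy as np

X = np.zeros((150, 3))            # 150 samples, 3 features, as in the question
bad_weights = np.zeros((150, 1))  # the shape implied by Dense(150, 1)
good_weights = np.zeros((3, 64))  # input size 3; 64 hidden units is an assumed choice

try:
    X @ bad_weights               # (150, 3) @ (150, 1): inner dimensions differ
    mismatch = False
except ValueError as exc:
    mismatch = True
    print("shape mismatch:", exc)

hidden = X @ good_weights         # (150, 3) @ (3, 64) works
print(hidden.shape)               # (150, 64)
```

Once the first weight matrix has 3 rows, one per feature, the product goes through regardless of how many samples are in the batch.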