I have this code:
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
from sklearn import datasets
import theano
iris = datasets.load_iris()
X = iris.data[:,0:3] # we only take the first three features.
Y = iris.target
X = X.astype(theano.config.floatX)
Y = Y.astype(theano.config.floatX)
model = Sequential()
model.add(Dense(150, 1, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(150, 1, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(150, 1, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X, Y, nb_epoch=20, batch_size=150)
score = model.evaluate(X, Y, batch_size=16)
Returns this error:
ValueError: Shape mismatch: x has 3 cols (and 150 rows) but y has 150 rows (and 1 cols)
Apply node that caused the error: Dot22(<TensorType(float64, matrix)>, <TensorType(float64, matrix)>)
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(150L, 3L), (150L, 1L)]
Inputs strides: [(24L, 8L), (8L, 8L)]
Inputs values: ['not shown', 'not shown']
What is the problem?
You specified the wrong output dimensions for your internal layers. See for instance this example from the Keras documentation:
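(Reproduced roughly from memory of the old Sequential-API docs; the 20/64/64/2 layer sizes below are illustrative rather than an exact quote.)

model = Sequential()
model.add(Dense(20, 64, init='uniform'))     # input layer: 20 features in, 64 units out
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 64, init='uniform'))     # hidden layer: 64 in, 64 out
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, 2, init='uniform'))      # output layer: 64 in, 2 classes out
model.add(Activation('softmax'))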
Note how the output size of one layer matches the input size of the next one:
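Taking just the three Dense calls from the sketch above (same illustrative sizes):

Dense(20, 64, init='uniform')   # takes 20 inputs, produces 64 outputs
Dense(64, 64, init='uniform')   # takes 64 inputs, produces 64 outputs
Dense(64, 2, init='uniform')    # takes 64 inputs, produces 2 outputs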
The first number is always the input size (the number of neurons in the previous layer), and the second number is the output size (the number of neurons in that layer, which becomes the input size of the next one). So in this example you have four layers: an input layer with 20 neurons, two hidden layers with 64 neurons each, and an output layer with 2 neurons.
The only hard restriction is that the first (input) layer needs as many inputs as you have features, and the last (output) layer needs as many neurons as your task requires (for example, one per class for a softmax classifier); the hidden sizes in between are up to you, as long as each output size matches the following input size.
For your example, since you have three features, you need to change the input size of the first layer to 3. For the output layer, note that iris has three classes, so three softmax outputs (with one-hot encoded targets) are the natural choice; the two output neurons from the docs example are meant for binary classification, and a single output neuron, as you used, only makes sense with a logistic (sigmoid) output and a binary loss.
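Putting that together, a corrected version of your model could look like the sketch below. The 150-unit hidden layers and the SGD settings are kept from your code; the three softmax outputs, the one-hot targets via np_utils.to_categorical and the categorical cross-entropy loss are my suggestion for the three-class iris problem, not the only possible fix.

from keras.utils import np_utils

Y_cat = np_utils.to_categorical(iris.target)   # one-hot encode the 3 iris classes

model = Sequential()
model.add(Dense(3, 150, init='uniform'))       # input size 3 = number of features
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(150, 150, init='uniform'))     # input 150 matches the previous output size
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(150, 3, init='uniform'))       # output size 3 = number of classes
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
model.fit(X, Y_cat, nb_epoch=20, batch_size=150)
score = model.evaluate(X, Y_cat, batch_size=16)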