I used KerasRegressor on a dummy dataset and tried to predict the training values themselves. The output is far from satisfactory, even though the training data are not random at all. Could anyone help me out?
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
import numpy as ny
X = ny.array([[1,2], [3,4], [5,6], [7,8], [9,10]])
Y = ny.array([3, 4, 5, 6, 7])
N = 5
def brain():
    # Create the brain
    br_model = Sequential()
    br_model.add(Dense(3, input_dim=2, kernel_initializer='normal', activation='relu'))
    br_model.add(Dense(2, kernel_initializer='normal', activation='relu'))
    br_model.add(Dense(1, kernel_initializer='normal'))
    # Compile the brain
    br_model.compile(loss='mean_squared_error', optimizer='adam')
    return br_model
estimator = KerasRegressor(build_fn=brain, epochs=1000000, batch_size=5, verbose=1)
print("Done")
estimator.fit(X, Y)
prediction = estimator.predict(X)
print(Y)
print(prediction)
The output is
[3 4 5 6 7]
[0.001 0.001 0.001 0.001 0.001]
Basically, the prediction is 0.001 while the actual values are not. I have tried other network configurations, but I face the same issue. What should I do (or not do) to get accurate output?
This is due to a classic mistake made by new practitioners, i.e. not normalizing their data before feeding them into a neural network (see the third point in this answer for the same issue causing similar problems in a classification setting with a convolutional neural network).
(I confess that, in most tutorials I have seen, this crucial point is usually not emphasized strongly enough; it can be even worse, e.g. in the TensorFlow MNIST For ML Beginners tutorial, the data returned by the TensorFlow-provided utility functions turn out to be already normalized to [0, 1], transparently to the user and without any hint, hence hiding from the reader a crucial step that will certainly need to be done later, when working with one's own data.)
So, you need to normalize both your features and your output, keeping your shown X and Y data.
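Here is a minimal sketch of the scaling step, assuming scikit-learn's MinMaxScaler and the sc_X/sc_Y names (both are my choices here; any transform mapping the data into [0, 1] works the same way):

from sklearn.preprocessing import MinMaxScaler

sc_X = MinMaxScaler()  # scaler for the features
sc_Y = MinMaxScaler()  # separate scaler for the output

X_scaled = sc_X.fit_transform(X)
Y_scaled = sc_Y.fit_transform(Y.reshape(-1, 1))  # MinMaxScaler expects a 2D array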
Then, change your epochs to just 1000 (you definitely don't need 1 million epochs for these data!) and fit on the scaled data:
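A sketch of that fit, mirroring the question's code (verbose=0 is my choice here, just to keep the training log quiet):

estimator = KerasRegressor(build_fn=brain, epochs=1000, batch_size=5, verbose=0)
estimator.fit(X_scaled, Y_scaled)
prediction = estimator.predict(X_scaled)
print(Y_scaled.ravel())
print(prediction)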
you will get predictions that closely track the scaled target values (the exact numbers will vary from run to run).
Or, you can scale your output back to its original range with inverse_transform:
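A sketch of that last step, assuming the sc_Y scaler defined above (inverse_transform, like fit_transform, expects a 2D array):

prediction_orig = sc_Y.inverse_transform(prediction.reshape(-1, 1))  # back to the original scale
print(prediction_orig.ravel())  # should be close to [3 4 5 6 7]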