What are the causes and possible solutions to a CNN always predicting the same class?

Posted 2019-07-11 03:48

I'm trying to solve a signature recognition problem. I'm using the GPDS database, and after merging all combinations of genuine and forgery signatures I ended up with about 4 million inputs of 200x200 pixel images.

I created a basic CNN using Keras and, due to hardware limitations, I'm using only around 5,000 inputs and a maximum of 10 epochs for training. My problem is that when I start training the model (the model.fit call), the accuracy hovers around 50%, which is exactly the class balance of my dataset, and when the epochs finish the accuracy is exactly 50%. When I try to predict some results after training, the predictions are all the same (for example, all 1s, which means genuine signature).

Not sure if it is a problem of:

  • Local minima
  • Small dataset for the complexity of the problem
  • Wrong initialization values for weights, learning rate, momentum…
  • Not enough training
  • Network pretty simple for the problem

I'm new to working with neural networks, so maybe it's just a basic problem. Anyway, could anyone help me?

Code is below:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
# One conv layer over single-channel 200x200 inputs (channels first)
model.add(Conv2D(100, (5, 5), input_shape=(1, 200, 200), activation='relu', data_format='channels_first'))
model.add(MaxPooling2D(pool_size=(2, 2), data_format='channels_first'))  # match the conv layer's data format
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))  # binary output: genuine vs. forgery
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=x, y=y, batch_size=100, shuffle=True, epochs=10)
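
This is roughly how I check the predictions after training (every output comes back as the same class):

import numpy as np

preds = model.predict(x[:100])
print(np.unique(preds.round()))  # prints [1.] - every sample predicted "genuine"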

2 Answers
Deceive 欺骗
#2 · 2019-07-11 03:57

You are using a ReLU activation (max(0, x)) before a sigmoid; my guess is that, depending on how your layers were initialized, you are saturating the sigmoid.

[Image: sigmoid and its derivative]

Saturating the sigmoid results in near-zero gradients and thus no learning.
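
To see this numerically, here is a standalone NumPy sketch (just the math, independent of your model) of how the sigmoid's derivative collapses once its input grows large, as an unbounded ReLU output can:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    # derivative of the sigmoid: s(z) * (1 - s(z))
    s = sigmoid(z)
    return s * (1.0 - s)

# ReLU outputs are unbounded above, so pre-sigmoid inputs can grow large.
for z in [0.0, 2.0, 5.0, 10.0, 20.0]:
    print(z, sigmoid(z), sigmoid_grad(z))

At z = 10 the gradient is already about 4.5e-5, and at z = 20 about 2e-9: effectively zero, so backprop passes almost no signal to the earlier layers.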

Root(大扎)
#3 · 2019-07-11 04:11

A good debugging technique for neural networks is to check whether you can overfit two batches of training examples. I'd recommend trying that and seeing what happens: if the training loss doesn't go to 0, the model is far too simple for the problem.
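
Something like this, reusing the model, x, and y from your question (a sketch; 200 samples = two batches at your batch size of 100, and the epoch count is just a guess):

# Train on two batches' worth of data and nothing else.
# Assumes x and y are shuffled so both classes appear in the slice.
x_small, y_small = x[:200], y[:200]

# A model with enough capacity should drive the training loss to ~0 here,
# even though it is hopelessly overfit.
model.fit(x_small, y_small, batch_size=100, epochs=200)
print(model.evaluate(x_small, y_small))

If the loss plateaus well above zero instead, add capacity (more filters or layers) before worrying about the other hypotheses on your list.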
