I have built a model using this as a base, with the training portion taken from this code.
The model does not train: it reports the same cost/loss value at every iteration.
I don't think it learns anything.
I have checked the usual things, such as making sure the inputs are shuffled and that each batch is new.
Any idea why?
This is my code.
Output:
Iter 1280, Minibatch Loss= 4.615120, Training Accuracy= 0.03125
Testing Accuracy: 0.0
Iter 2560, Minibatch Loss= 4.615120, Training Accuracy= 0.00000
Testing Accuracy: 0.0
Iter 3840, Minibatch Loss= 4.615120, Training Accuracy= 0.00000
Testing Accuracy: 0.015625
Iter 5120, Minibatch Loss= 4.615120, Training Accuracy= 0.00000
Testing Accuracy: 0.078125
Iter 6400, Minibatch Loss= 4.615120, Training Accuracy= 0.03125
Testing Accuracy: 0.0
Iter 7680, Minibatch Loss= 4.615120, Training Accuracy= 0.03125
Testing Accuracy: 0.015625
Iter 8960, Minibatch Loss= 4.615120, Training Accuracy= 0.00000
Testing Accuracy: 0.0
Iter 10240, Minibatch Loss= 4.615120, Training Accuracy= 0.00000
Testing Accuracy: 0.015625
Iter 11520, Minibatch Loss= 4.615120, Training Accuracy= 0.00000
Testing Accuracy: 0.0
Iter 12800, Minibatch Loss= 4.615120, Training Accuracy= 0.01562
Testing Accuracy: 0.03125
Iter 14080, Minibatch Loss= 4.615120, Training Accuracy= 0.01562
Testing Accuracy: 0.0
Iter 15360, Minibatch Loss= 4.615120, Training Accuracy= 0.01562
Testing Accuracy: 0.0
The code you started from is only a benchmark of the forward and backward passes; it isn't designed to train a model. Start from an example that actually trains a model, and drop the benchmark code.
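A quick sanity check on the logged numbers fits this picture. The sketch below assumes the loss is a softmax cross-entropy using the natural log and that the dataset has 101 classes; neither is stated in the question, and `uniform_cross_entropy` is just a hypothetical helper name. Under those assumptions, a network that assigns the same probability to every class, whatever the input, has a loss of log(101), which agrees with the logged 4.615120 to within rounding.

```python
import math

def uniform_cross_entropy(num_classes):
    """Cross-entropy of a prediction that gives every class probability 1/N."""
    # -log(1/N) simplifies to log(N)
    return math.log(num_classes)

# Assumption: 101 classes and a natural-log softmax cross-entropy loss.
reported_loss = 4.615120
expected = uniform_cross_entropy(101)

print("uniform-prediction loss:", expected)
print("agrees with the log:", abs(expected - reported_loss) < 1e-5)

# Going the other way: the class count implied by the reported loss.
print("implied number of classes:", round(math.exp(reported_loss)))
```

If the check holds for your setup, the network's output does not depend on the input at all, which is what you would see if the weights never receive a useful update.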
You might have an easier time starting from a completely working training example program instead of trying to combine two pieces.
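For reference, here is a minimal sketch of the pieces a training loop needs that a pure forward/backward benchmark can leave out: a loss computed against labels, gradients of that loss, and an update that writes the new weights back. It is plain Python on made-up toy data (three 2-D clusters), not your model and not any TensorFlow API; the function names, learning rate, and data are all assumptions for illustration.

```python
import math
import random

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=200, lr=0.2, seed=0):
    rng = random.Random(seed)
    # Toy data (an assumption): 20 noisy points around each of 3 centers.
    centers = [(0.0, 2.0), (2.0, -1.0), (-2.0, -1.0)]
    data = [([cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)], label)
            for label, (cx, cy) in enumerate(centers) for _ in range(20)]
    n, k, d = len(data), len(centers), 2

    # Zero weights give uniform predictions, so the first loss is log(3).
    W = [[0.0] * d for _ in range(k)]
    b = [0.0] * k
    losses = []
    for _ in range(steps):
        grad_W = [[0.0] * d for _ in range(k)]
        grad_b = [0.0] * k
        loss = 0.0
        for x, y in data:
            # Forward pass: linear logits followed by softmax.
            logits = [sum(w * xi for w, xi in zip(W[c], x)) + b[c] for c in range(k)]
            p = softmax(logits)
            loss -= math.log(p[y])
            # Backward pass: d(loss)/d(logit_c) = p_c - 1[c == y].
            for c in range(k):
                delta = p[c] - (1.0 if c == y else 0.0)
                grad_b[c] += delta
                for j in range(d):
                    grad_W[c][j] += delta * x[j]
        losses.append(loss / n)
        # Update step: without this the loss is identical at every iteration.
        for c in range(k):
            b[c] -= lr * grad_b[c] / n
            for j in range(d):
                W[c][j] -= lr * grad_W[c][j] / n
    return losses

losses = train()
print("first loss:", losses[0])
print("last loss:", losses[-1])
```

The first logged loss is log(3) because the weights start at zero, and it falls from there as the updates are applied. If you comment out the update step, you get the same flat loss curve as in the question.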