I have asked a few questions about neural networks on this website in the past and have gotten great answers, but I am still struggling to implement one for myself. This is quite a long question, but I am hoping that it will serve as a guide for other people creating their own basic neural networks in MATLAB, so it should be worth it.
What I have done so far could be completely wrong. I am following the online stanford machine learning course by Professor Andrew Y. Ng and have tried to implement what he has taught to the best of my ability.
Can you please tell me if the feed forward and cost function parts of my code are correct, and where I am going wrong in the minimization (optimization) part?
I have a feed 2 layer feed forward neural network.
The MATLAB code for the feedforward part is:
function [ Y ] = feedforward2( X,W1,W2)
%This takes a row vector of inputs into the neural net with weight matrices W1 and W2 and returns a row vector of the outputs from the neural net
%Remember X, Y, and A can be vectors, and W1 and W2 Matrices
X=transpose(X); %X needs to be a column vector
A = sigmf(W1*X,[1 0]); %Values of the first hidden layer
Y = sigmf(W2*A,[1 0]); %Output Values of the network
Y = transpose(Y); %Y needs to be a column vector
So for example a two layer neural net with two inputs and two outputs would look a bit like this:
a1
x1 o--o--o y1 (all weights equal 1)
\/ \/
/\ /\
x2 o--o--o y2
a2
if we put in:
X=[2,3];
W1=ones(2,2);
W2=ones(2,2);
Y = feedforward2(X,W1,W2)
we get the the output:
Y = [0.5,0.5]
This represents the y1 and y2 values shown in the drawing of the neural net
The MATLAB code for the squared error cost function is:
function [ C ] = cost( W1,W2,Xtrain,Ytrain )
%This gives a value seeing how close W1 and W2 are to giving a network that represents the Xtrain and Ytrain data
%It uses the squared error cost function
%The closer the cost is to zero, the better these particular weights are at giving a network that represents the training data
%If the cost is zero, the weights give a network that when the Xtrain data is put in, The Ytrain data comes out
M = size(Xtrain,1); %Number of training examples
oldsum = 0;
for i = 1:M,
H = feedforward2(Xtrain,W1,W2);
temp = ( H(i) - Ytrain(i) )^2;
Sum = temp + oldsum;
oldsum = Sum;
end
C = (1/2*M) * Sum;
end
Example
So for example if the training data is:
Xtrain =[0,0; Ytrain=[0/57;
1,2; 3/57;
4,1; 5/57;
5,2; 7/57; a1
3,4; 7/57; %This will be for a two input one output network x1 o--o y1
5,3; 8/57; \/ \_o
1,5; 6/57; /\ /
6,2; 8/57; x2 o--o
2,1; 3/57; a2
5,5;] 10/57;]
We start with initial random weights
W1=[2,3; W2=[3,2]
4,1]
If we put in:
Y= feedforward2([6,2],W1,W2)
We get
Y = 0.9933
Which is far from what the training data says it should be (8/57 = 0.1404). So the initial random weights W1 and W2 where a bad guess.
To measure exactly how bad/good a guess the random weights weights are we use the cost function:
C= cost(W1,W2,Xtrain,Ytrain)
This gives the value:
C = 6.6031e+003
Minimizing the cost function
If we minimize the cost function by searching all of the possible variables W1 and W2 and then picking the lowest, this will give the network that best approximates the training data
But when I Use the code:
[W1,W2]=fminsearch(cost(W1,W2,Xtrain,Ytrain),[W1,W2])
It gives an error message. It says: "Error using horzcat. CAT arguments dimensions are not consistent."Why am I getting this error and what can I do to fix it?
Can you please tell me if the feed forward and cost function parts of my code are correct, and where I am going wrong in the minimization (optimization) part?
Thank you!!!