While going through an example of a tiny 2-layer neural network, I noticed a result that I cannot explain.
Imagine we have the following dataset with the corresponding labels:
[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]
Let's create a tiny 2-layer NN which will learn to predict the output for a two-number sequence, where each number can be 0 or 1. We shall train this NN on the dataset mentioned above.
import numpy as np

# compute sigmoid nonlinearity
def sigmoid(x):
    output = 1 / (1 + np.exp(-x))
    return output

# convert output of sigmoid function to its derivative
def sigmoid_to_deriv(output):
    return output * (1 - output)

def predict(inp, weights):
    print(inp, sigmoid(np.dot(inp, weights)))

# input dataset
X = np.array([[0, 1],
              [0, 1],
              [1, 0],
              [1, 0]])

# output dataset
Y = np.array([[0, 0, 1, 1]]).T

np.random.seed(1)

# init weights randomly with mean 0
weights0 = 2 * np.random.random((2, 1)) - 1

for i in range(10000):
    # forward propagation
    layer0 = X
    layer1 = sigmoid(np.dot(layer0, weights0))
    # compute the error
    layer1_error = layer1 - Y
    # gradient descent:
    # calculate the slope at the current position
    layer1_delta = layer1_error * sigmoid_to_deriv(layer1)
    weights0_deriv = np.dot(layer0.T, layer1_delta)
    # change the weights by the negative of the slope
    weights0 -= weights0_deriv

print('INPUT PREDICTION')
predict([0, 1], weights0)
predict([1, 0], weights0)
# test prediction on unseen data
predict([1, 1], weights0)
predict([0, 0], weights0)
After we've trained this NN, we test it:
INPUT PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.5]
[0, 0] [ 0.5]
OK, [0,1] and [1,0] are what we would expect. The predictions for [0,0] and [1,1] are also explainable: our NN just didn't have training data for these cases, so let's add them to the training dataset (the updated arrays are sketched after the list):
[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]
[0,0] -> [0]
[1,1] -> [1]
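For completeness, a minimal sketch of the updated arrays, assuming the same variable names as the code above and leaving the rest of the script unchanged:

# updated input dataset, now covering all four input combinations
X = np.array([[0, 1],
              [0, 1],
              [1, 0],
              [1, 0],
              [0, 0],
              [1, 1]])
# updated output labels
Y = np.array([[0, 0, 1, 1, 0, 1]]).T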
Retrain the network and test it again!
INPUT PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.9898148]
[0, 0] [ 0.5]
- Wait, why is [0,0] still 0.5?

This means that the NN is still uncertain about [0,0], just as it was uncertain about [1,1] until we trained it.
The classification is right as well. You need to understand that the net was able to separate the test set. Now you need to use a step function to classify the data as 0 or 1. In your case, 0.5 seems to be a good threshold.
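A minimal sketch of that thresholding step, reusing sigmoid and weights0 from the code above (the classify helper and its threshold parameter are my own names, not part of the original code):

# hypothetical helper: applies a step function to the sigmoid output
def classify(inp, weights, threshold=0.5):
    score = sigmoid(np.dot(inp, weights))  # raw sigmoid output, shape (1,)
    return int(score[0] >= threshold)      # step function: 0 or 1

print(classify([0, 1], weights0))  # 0.0088... -> 0
print(classify([1, 0], weights0))  # 0.9999... -> 1

Note that an output of exactly 0.5 sits right on this decision boundary, which is consistent with the uncertainty discussed above.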
EDIT: