XOR problem solvable with 2x2x1 neural network wit

Posted 2019-01-26 14:42

Is a neural network with 2 input nodes, 2 hidden nodes and 1 output node supposed to be able to solve the XOR problem if it has no bias? Or can it get stuck?

4 answers
贼婆χ
#2 · 2019-01-26 14:59

Yes, you can, if you use an activation function like ReLU (f(x) = max(0, x)).

Example weights for such a network are:

Layer1: [[-1, 1], [1, -1]]
Layer2: [[1], [1]]

For the first (hidden) layer:

  • If the input is [0,0], both nodes will have an activation of 0: ReLU(-1*0 + 1*0) = 0, ReLU(1*0 + -1*0) = 0
  • If the input is [1,0], one node will have an activation of 0: ReLU(-1*1 + 1*0) = 0, and the other an activation of 1: ReLU(1*1 + -1*0) = 1
  • If the input is [0,1], one node will have an activation of 1: ReLU(-1*0 + 1*1) = 1, and the other an activation of 0: ReLU(1*0 + -1*1) = 0
  • If the input is [1,1], both nodes will have an activation of 0: ReLU(-1*1 + 1*1) = 0, ReLU(1*1 + -1*1) = 0

For the second (output) layer: since the weights are [[1], [1]] (and there can be no negative activations from the previous layer due to ReLU), the layer simply acts as a summation of the activations in layer 1:

  • If the input is [0,0], the sum of activations in the previous layer is 0
  • If the input is [1,0], the sum of activations in the previous layer is 1
  • If the input is [0,1], the sum of activations in the previous layer is 1
  • If the input is [1,1], the sum of activations in the previous layer is 0
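The walkthrough above can be checked with a few lines of plain Python. This is only a minimal sketch: the helper names `relu`, `matvec` and `forward` are made up for illustration, and the inputs are treated as row vectors multiplied by the weight matrices listed above.

```python
# Weights from the answer above; there are no bias terms anywhere.
W1 = [[-1, 1], [1, -1]]   # input -> hidden, shape 2x2
W2 = [[1], [1]]           # hidden -> output, shape 2x1


def relu(z):
    """ReLU activation: f(z) = max(0, z)."""
    return max(0, z)


def matvec(x, W):
    """Multiply row vector x by matrix W (hypothetical helper)."""
    return [sum(x[i] * W[i][j] for i in range(len(x)))
            for j in range(len(W[0]))]


def forward(x):
    """Forward pass: ReLU on the hidden layer, linear output."""
    hidden = [relu(z) for z in matvec(x, W1)]
    return matvec(hidden, W2)[0]


inputs = [[0, 0], [1, 0], [0, 1], [1, 1]]
outputs = [forward(x) for x in inputs]
print(outputs)  # [0, 1, 1, 0]
```

The printed list matches the XOR truth table for the four inputs in the order used above.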

While this method works in the example above, it only works because the False examples of the XOR problem are labelled zero (0). Without a bias, the input [0,0] always produces an output of 0, whatever the weights are. If, for example, we used ones for False examples and twos for True examples, this approach would not work anymore.
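To illustrate this limitation, here is a small sketch assuming the same 2x2x1 ReLU network. The function `forward` and its `output_bias` parameter are hypothetical names, not part of the original answer. It first shows that a bias-free network maps [0,0] to 0 for several random weight settings, then that adding an output bias of 1 to the weights given above produces the 1/2 labelling.

```python
import random


def relu(z):
    return max(0.0, z)


def forward(x, w1, w2, output_bias=0.0):
    """2x2x1 forward pass; output_bias is a hypothetical extra term."""
    hidden = [relu(sum(x[i] * w1[i][j] for i in range(2))) for j in range(2)]
    return sum(hidden[j] * w2[j] for j in range(2)) + output_bias


# 1) Without bias, input [0, 0] gives 0 for any weights.
random.seed(0)
zero_outputs = []
for _ in range(5):
    w1 = [[random.uniform(-2, 2) for _ in range(2)] for _ in range(2)]
    w2 = [random.uniform(-2, 2) for _ in range(2)]
    zero_outputs.append(forward([0, 0], w1, w2))

# 2) With the weights from the answer plus an output bias of 1,
#    False maps to 1 and True maps to 2.
W1 = [[-1, 1], [1, -1]]
W2 = [1, 1]
inputs = [[0, 0], [1, 0], [0, 1], [1, 1]]
shifted = [forward(x, W1, W2, output_bias=1.0) for x in inputs]
print(shifted)  # [1.0, 2.0, 2.0, 1.0]
```

So the restriction comes from the missing bias rather than from the 2x2x1 architecture itself.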

等我变得足够好
#3 · 2019-01-26 15:07

If I remember correctly it's not possible to have XOR without a bias.

▲ chillily
#4 · 2019-01-26 15:13

I have built a neural network without bias and a 2x2x1 architecture solves XOR in 280 epochs. Am new to this, so didn't know either way, but it works, so it is possible.

Regards,

迷人小祖宗
#5 · 2019-01-26 15:24

Leave the bias in. It does not depend on the values of your inputs.

In terms of a one-to-one analogy, I like to think of the bias as the offsetting c-value in the straight line equation: y = mx + c; it adds an independent degree of freedom to your system that is not influenced by the inputs to your network.
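As a rough illustration of the analogy, the sketch below treats a single linear neuron as the line y = mx + c. The function name `neuron` and the sample values are made up for the example.

```python
def neuron(x, m, c=0.0):
    """Single linear unit: weight m plays the slope, bias c the intercept."""
    return m * x + c


xs = [-1.0, 0.0, 2.0]
without_bias = [neuron(x, m=3.0) for x in xs]
with_bias = [neuron(x, m=3.0, c=0.5) for x in xs]

# The bias shifts every output by the same amount, whatever the input is.
offsets = [b - a for a, b in zip(without_bias, with_bias)]
print(offsets)  # [0.5, 0.5, 0.5]
```

Without the bias, the line is forced through the origin (the output for x = 0 is 0); with it, the whole line can move up or down independently of the inputs.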
