Neural Networks Not Immediately Reproducible?

Posted 2019-05-19 21:24

Question:

With random weight initialization in a feed-forward neural network trained by a backpropagation variant such as resilient backpropagation (Rprop), the starting position on the error surface sits above some random valley, and descending from there may lead only to a local minimum rather than the global one. There are methods for escaping local minima, but assuming these are not used (or don't work well on the given terrain), it would seem that the neural network is not immediately reproducible. In other words, there is a bit of luck involved.
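As a small illustration of this point (not part of the original question), the sketch below shows that two different seeds produce two different weight initializations, i.e. two different starting points on the error surface, while re-using a fixed seed recovers the same starting point. The function name and the normal-distribution scale are made up for the example.

    import numpy as np

    def init_weights(seed, n_in=4, n_hidden=8):
        # Draw a random weight matrix; the 0.1 scale is an arbitrary choice.
        rng = np.random.default_rng(seed)
        return rng.normal(scale=0.1, size=(n_in, n_hidden))

    w_a = init_weights(seed=0)
    w_b = init_weights(seed=1)

    print(np.allclose(w_a, w_b))                   # False: different starting valleys
    print(np.allclose(w_a, init_weights(seed=0)))  # True: a fixed seed restarts from the same point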

On an error surface with one million valleys and only one global minimum, it could take up to one million non-repeating weight initializations to find that single global minimum (assuming each initialization lands you in a completely different valley). So if I happen upon a convergence to the global minimum (the solution) and accidentally lose the weights (data corruption, or whatever), I may never come across that convergence again. With this in mind, should I be running a multitude of neural networks concurrently on the same data set and then halting/saving a network once it reaches my desired error rate? I am aware that running multiple networks simultaneously is common practice, but this scenario would seem to make such a semi-brute-force technique mandatory.
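A minimal sketch of that semi-brute-force strategy, assuming scikit-learn's MLPClassifier as a stand-in network (it trains with gradient-based solvers, not resilient backpropagation) and a made-up toy dataset and error threshold:

    import joblib
    from sklearn.datasets import make_moons
    from sklearn.neural_network import MLPClassifier

    X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
    target_loss = 0.05                    # desired error level (made up)
    best_net, best_loss = None, float("inf")

    for seed in range(20):                # 20 independent random initializations
        net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                            random_state=seed)
        net.fit(X, y)
        if net.loss_ < best_loss:         # keep track of the current best solution
            best_net, best_loss = net, net.loss_
            joblib.dump(best_net, "best_net.joblib")   # save it immediately
        if best_loss <= target_loss:
            break                         # good enough: halt the search

    print(f"best training loss: {best_loss:.4f}")

Saving inside the loop, rather than only at the end, guards against exactly the data-corruption scenario described above.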

Answer 1:

Correct

You do indeed need to store those calculated weights somewhere safe. Given a difficult search landscape, you are not guaranteed to rediscover the same (possibly optimal) solution again.
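For example, a minimal sketch of getting the learned weight arrays onto disk as soon as a good solution is found; the toy data, the file name, and the choice of scikit-learn's MLPClassifier are illustrative assumptions, not part of the original answer.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Tiny toy problem, just to have a fitted network whose weights we can save.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0])
    net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                        random_state=0).fit(X, y)

    # Save every weight matrix and bias vector to a single compressed file.
    np.savez("best_weights.npz", *net.coefs_, *net.intercepts_)

    # Later (or after a crash): reload and check that the arrays round-trip exactly.
    restored = np.load("best_weights.npz")
    assert np.allclose(restored["arr_0"], net.coefs_[0])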

Multiple restarts

It is very common to use random restarts and keep track of the current best solution.

Literature on learning in NNs

See the extract from the excellent book Neural Networks: A Systematic Introduction (Raúl Rojas).