I have a few questions regarding CNNs. In the figure below, a 5x5 kernel is used between layers S2 and C3.
Q1. How many kernels are used there? Is each of these kernels connected to every feature map in layer S2?
Q2. With max-pooling, when backpropagating the error, how does a max-pooling unit determine which neuron (and which feature map) in the immediately preceding layer supplied its max value?
Q3. To train the kernels, we initialize them with random values. Is there an equation for updating the kernel values from the backpropagated error?
Q4. In the above figure, how does backpropagation work between the 'Input' and 'C5' layers once the error has been received from layer F6?
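
To make Q2 and Q3 concrete, here is a minimal pure-Python sketch of one common approach, not necessarily what the network in the figure does. It assumes a single 2D feature map, non-overlapping pooling windows, a stride-1 convolution without padding (implemented as cross-correlation), and plain gradient descent. All function names are made up for illustration and do not come from any library. For Q2, the forward pass remembers the position of each maximum and the backward pass sends the error only to that position. For Q3, the kernel gradient is dL/dW[a][b] = sum over (i, j) of delta[i][j] * x[i+a][j+b], followed by W <- W - lr * dL/dW.

```python
def maxpool_forward(x, size=2):
    """Non-overlapping max-pooling on a 2D list; also records where each max came from."""
    rows, cols = len(x) // size, len(x[0]) // size
    out = [[0.0] * cols for _ in range(rows)]
    argmax = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Collect (value, position) for every input inside this pooling window.
            window = [(x[i * size + a][j * size + b], (i * size + a, j * size + b))
                      for a in range(size) for b in range(size)]
            value, pos = max(window, key=lambda t: t[0])
            out[i][j] = value
            argmax[i][j] = pos  # remembered for the backward pass
    return out, argmax


def maxpool_backward(grad_out, argmax, in_shape):
    """Route each output error to the input position that produced the max; others get 0."""
    grad_in = [[0.0] * in_shape[1] for _ in range(in_shape[0])]
    for i, row in enumerate(grad_out):
        for j, g in enumerate(row):
            r, c = argmax[i][j]
            grad_in[r][c] += g
    return grad_in


def kernel_gradient(x, delta, k):
    """dL/dW[a][b] = sum_{i,j} delta[i][j] * x[i+a][j+b] (stride 1, no padding)."""
    return [[sum(delta[i][j] * x[i + a][j + b]
                 for i in range(len(delta)) for j in range(len(delta[0])))
             for b in range(k)] for a in range(k)]


def sgd_step(kernel, grad, lr):
    """Plain gradient-descent update: W <- W - lr * dL/dW."""
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(kernel, grad)]


# Q2: pooling remembers the winning positions, backward uses them.
x = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 7, 2],
     [3, 6, 2, 1]]
pooled, where = maxpool_forward(x)           # pooled == [[4, 5], [6, 7]]
grad_x = maxpool_backward([[0.1, 0.2], [0.3, 0.4]], where, (4, 4))

# Q3: gradient of a 2x2 kernel on a 3x3 input, then one update step.
inp = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
delta = [[1, 1], [1, 1]]                     # backpropagated error at the conv output
dW = kernel_gradient(inp, delta, 2)          # dW == [[12, 16], [24, 28]]
new_kernel = sgd_step([[0.0, 0.0], [0.0, 0.0]], dW, lr=0.5)
```

With several input feature maps, the same kernel-gradient formula would be applied once per (input map, output map) pair that is actually connected, which is where Q1 and Q3 meet.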