Why are there two additional variables in the checkpoint for each layer?

Published 2020-07-16 17:27

Question:

I created a convolutional neural network with three convolutional layers and two fully connected layers, and used tf.train.Saver() to save the variables. When I use inspect_checkpoint.py to check the variables saved in the checkpoint file, why are there two additional variables saved for each layer, such as Adam and Adam_1? Also, what are beta1_power and beta2_power?
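For reference, the listing below can be produced from Python with something along these lines (the checkpoint path "model.ckpt" is a placeholder):

from tensorflow.python.tools import inspect_checkpoint as chkp

# Print the name, dtype and shape of every variable stored in the checkpoint.
chkp.print_tensors_in_checkpoint_file("model.ckpt", tensor_name="", all_tensors=False)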

conv_layer1_b (DT_FLOAT) [32]
conv_layer1_w (DT_FLOAT) [1,16,1,32]
conv_layer1_b/Adam (DT_FLOAT) [32]
conv_layer1_w/Adam (DT_FLOAT) [1,16,1,32]
conv_layer1_w/Adam_1 (DT_FLOAT) [1,16,1,32]
conv_layer1_b/Adam_1 (DT_FLOAT) [32]
conv_layer3_w/Adam (DT_FLOAT) [1,16,64,64]
conv_layer3_w (DT_FLOAT) [1,16,64,64]
conv_layer3_b/Adam_1 (DT_FLOAT) [64]
conv_layer3_b (DT_FLOAT) [64]
conv_layer3_b/Adam (DT_FLOAT) [64]
conv_layer3_w/Adam_1 (DT_FLOAT) [1,16,64,64]
conv_layer2_w/Adam_1 (DT_FLOAT) [1,16,32,64]
conv_layer2_w/Adam (DT_FLOAT) [1,16,32,64]
conv_layer2_w (DT_FLOAT) [1,16,32,64]
conv_layer2_b/Adam_1 (DT_FLOAT) [64]
conv_layer2_b (DT_FLOAT) [64]
conv_layer2_b/Adam (DT_FLOAT) [64]
beta1_power (DT_FLOAT) []
beta2_power (DT_FLOAT) []
NN1_w (DT_FLOAT) [2432,512]
NN1_b (DT_FLOAT) [512]
NN1_w/Adam_1 (DT_FLOAT) [2432,512]
NN1_b/Adam_1 (DT_FLOAT) [512]
NN1_w/Adam (DT_FLOAT) [2432,512]
NN1_b/Adam (DT_FLOAT) [512]
NN2_w (DT_FLOAT) [512,2]
NN2_b (DT_FLOAT) [2]
NN2_w/Adam_1 (DT_FLOAT) [512,2]
NN2_b/Adam_1 (DT_FLOAT) [2]
NN2_w/Adam (DT_FLOAT) [512,2]
NN2_b/Adam (DT_FLOAT) [2]

Answer 1:

You're using the Adam optimizer (https://arxiv.org/abs/1412.6980) for optimization. For every parameter, Adam keeps two state variables that store running statistics about the gradients, the first- and second-moment estimates of Algorithm 1 in the paper, and each has the same shape as the parameter itself. Those are the two additional /Adam and /Adam_1 variables you see per parameter. The optimizer also has hyperparameters, among them β1 and β2; the beta1_power and beta2_power variables hold the running powers β1^t and β2^t of those hyperparameters, which Adam uses to bias-correct the moment estimates.
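A minimal sketch of how these extra variables show up, assuming the TF 1.x tf.train.AdamOptimizer and its default slot naming (the variable name and shapes here just mirror your listing, and the loss is an arbitrary placeholder):

import tensorflow as tf  # TF 1.x API

# A toy variable and loss, just so the optimizer creates its state variables.
w = tf.get_variable("conv_layer1_w", shape=[1, 16, 1, 32])
loss = tf.reduce_sum(tf.square(w))
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

# After minimize(), the graph holds the original parameter plus Adam's state:
for v in tf.global_variables():
    print(v.name, v.shape)
# Expected output (order may differ):
#   conv_layer1_w:0         (1, 16, 1, 32)   the parameter itself
#   conv_layer1_w/Adam:0    (1, 16, 1, 32)   first-moment (m) slot
#   conv_layer1_w/Adam_1:0  (1, 16, 1, 32)   second-moment (v) slot
#   beta1_power:0           ()               running power of beta1
#   beta2_power:0           ()               running power of beta2

Saving this graph with tf.train.Saver() writes all of these variables into the checkpoint, which is exactly the pattern in your listing.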