I try to build a Stacked Autoencoder in Keras (tf.keras). By stacked I do not mean deep. All the examples I found for Keras are generating e.g. 3 encoder layers, 3 decoder layers, they train it and they call it a day. However, it seems the correct way to train a Stacked Autoencoder (SAE) is the one described in this paper: Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion
In short, a SAE should be trained layer-wise as shown in the image below. After layer 1 is trained, it's used as input to train layer 2. The reconstruction loss should be compared with the layer 1 and not the input layer.
And here is where my trouble begins. How to tell Keras which layers to use the loss function on?
Here is what I do. Since the Autoencoder module is not existed anymore in Keras, I build the first autoencoder, and I set its encoder's weights (trainable = False
) in the 1st layer of a second autoencoder with 2 layers in total. Then when I train that, it obviously compares the reconstructed layer out_s2
with the input layer in_s
, instead of the layer 1 hid1
.
# autoencoder layer 1
in_s = tf.keras.Input(shape=(input_size,))
noise = tf.keras.layers.Dropout(0.1)(in_s)
hid = tf.keras.layers.Dense(nodes[0], activation='relu')(noise)
out_s = tf.keras.layers.Dense(input_size, activation='sigmoid')(hid)
ae_1 = tf.keras.Model(in_s, out_s, name="ae_1")
ae_1.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['acc'])
# autoencoder layer 2
hid1 = tf.keras.layers.Dense(nodes[0], activation='relu')(in_s)
noise = tf.keras.layers.Dropout(0.1)(hid1)
hid2 = tf.keras.layers.Dense(nodes[1], activation='relu')(noise)
out_s2 = tf.keras.layers.Dense(nodes[0], activation='sigmoid')(hid2)
ae_2 = tf.keras.Model(in_s, out_s2, name="ae_2")
ae_2.layers[0].set_weights(ae_1.layers[0].get_weights())
ae_2.layers[0].trainable = False
ae_2.compile(optimizer='nadam', loss='binary_crossentropy', metrics=['acc'])
The solution should be fairly easy, but I can't see it nor find it online. How do I do that in Keras?