Quick answer:
This is in fact really easy. Here's the code (for those who don't want to read all that text):
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input((784,))
encode = Dense(10)(inputs)          # output tensor of the encoder layer
decode = Dense(784)                 # keep the decoder layer as an object so it can be reused
model = Model(input=inputs, output=decode(encode))

inputs_2 = Input((10,))
decode_model = Model(input=inputs_2, output=decode(inputs_2))   # reuses the same decode layer
In this setup, decode_model uses the same decode layer as model. If you train model, decode_model is trained too, because both models share the same layer object and therefore the same weights.
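To see why, here's a quick sanity check (a sketch, assuming the code above has just been run): both models hold a reference to the very same Dense object, so there is only one set of decoder weights.

# Both models reference the same layer object, hence the same weights.
print(decode is model.layers[-1])            # True
print(decode is decode_model.layers[-1])     # True
print(decode.get_weights()[0].shape)         # (10, 784)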
Actual question:
I'm trying to create a simple autoencoder for MNIST in Keras. This is the code so far:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
encode = Dense(10, input_shape=[784])    # encoder: 784 -> 10
decode = Dense(784, input_shape=[10])    # decoder: 10 -> 784
model.add(encode)
model.add(decode)
model.compile(loss="mse",
              optimizer="adadelta",
              metrics=["accuracy"])

decode_model = Sequential()
decode_model.add(decode)                 # attempt to reuse the decode layer
I'm training it to learn the identity function:

model.fit(X_train, X_train, batch_size=50, nb_epoch=10, verbose=1,
          validation_data=[X_test, X_test])
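(The X_train / X_test used above aren't shown here; a minimal sketch to prepare them, assuming keras.datasets.mnist and the usual flatten-and-scale preprocessing, would be:)

from keras.datasets import mnist

# Load MNIST, flatten the 28x28 images to 784-dim vectors, scale to [0, 1].
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train.reshape(-1, 784).astype("float32") / 255.0
X_test = X_test.reshape(-1, 784).astype("float32") / 255.0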
The reconstruction is quite interesting.
But I would also like to look at the representations of the clusters. What is the output of passing [1, 0, ..., 0] to the decoding layer? This should be the "cluster mean" of one class in MNIST.
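Concretely, the inputs I want to feed to the decoder are just a batch of one-hot vectors, built roughly like this (a sketch; the variable name is mine):

import numpy as np

# One one-hot vector per code dimension: row i activates only unit i.
one_hot_codes = np.eye(10).astype("float32")   # shape (10, 10)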
In order to do that, I created a second model, decode_model, which reuses the decoder layer.
But if I try to use that model, it complains:
Exception: Error when checking : expected dense_input_5 to have shape (None, 784) but got array with shape (10, 10)
That seemed strange. It's simply a dense layer; its weight matrix wouldn't even be able to process 784-dimensional input. I decided to look at the model summary:
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_14 (Dense)                 (None, 784)           8624        dense_13[0][0]
====================================================================================================
Total params: 8624
It is connected to dense_13. It's hard to keep track of the layer names, but that looks like the encoder layer. Sure enough, the summary of the whole model is:
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_13 (Dense)                 (None, 10)            7850        dense_input_6[0][0]
____________________________________________________________________________________________________
dense_14 (Dense)                 (None, 784)           8624        dense_13[0][0]
====================================================================================================
Total params: 16474
____________________________________________________________________________________________________
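(For reference, the two printouts above come from Keras's summary() method:)

decode_model.summary()   # the single-layer summary
model.summary()          # the full autoencoder summary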
Apparently the layers are permanently connected. Strangely, there is no input layer in my decode_model.
How can I reuse a layer in Keras? I've looked at the functional API, but there, too, layers are fused together.
Oh, never mind. I should have read the functional API guide all the way through: https://keras.io/getting-started/functional-api-guide/#shared-layers
Here's one of the predictions (maybe still lacking some training). I'm guessing this could be a 3? Well, at least it works now.
And for those with similar problems, here's the updated code (see the quick answer at the top). Note that I only compiled one of the models: you need to compile a model for training, but not for prediction.
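To tie it together, here's roughly how the two models are used (a sketch, assuming the functional-API code from the quick answer and the X_train / X_test preparation shown earlier):

import numpy as np

# Only `model` needs compiling, because only `model` is trained.
model.compile(loss="mse", optimizer="adadelta")
model.fit(X_train, X_train, batch_size=50, nb_epoch=10, verbose=1,
          validation_data=[X_test, X_test])

# decode_model shares the (now trained) decode layer, so it can be used
# for prediction without ever being compiled.
cluster_means = decode_model.predict(np.eye(10).astype("float32"))
print(cluster_means.shape)   # (10, 784): one 784-dim "digit" per code unit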