I was making a seq2seq model in keras. I had built single layer encoder and decoder and they were working fine. But now I want to extend it to multi layer encoder and decoder. I am building it using Keras Functional API.
Training:-
Code for encoder:-
encoder_input=Input(shape=(None,vec_dimension))
encoder_lstm=LSTM(vec_dimension,return_state=True,return_sequences=True)(encoder_input)
encoder_lstm=LSTM(vec_dimension,return_state=True)(encoder_lstm)
encoder_output,encoder_h,encoder_c=encoder_lstm
Code for decoder:-
encoder_state=[encoder_h,encoder_c]
decoder_input=Input(shape=(None,vec_dimension))
decoder_lstm= LSTM(vec_dimension,return_state=True,return_sequences=True (decoder_input,initial_state=encoder_state)
decoder_lstm=LSTM(vec_dimension,return_state=True,return_sequences=True)(decoder_lstm)
decoder_output,_,_=decoder_lstm
For testing :-
encoder_model=Model(inputs=encoder_input,outputs=encoder_state)
decoder_state_input_h=Input(shape=(None,vec_dimension))
decoder_state_input_c=Input(shape=(None,vec_dimension))
decoder_states_input=[decoder_state_input_h,decoder_state_input_c]
decoder_output,decoder_state_h,decoder_state_c =decoder_lstm #(decoder_input,initial_state=decoder_states_input)
decoder_states=[decoder_state_h,decoder_state_c]
decoder_model=Model(inputs=[decoder_input]+decoder_states_input,outputs=[decoder_output]+decoder_states)
Now when I try to increase the no. of layers in the decoder for training then training works fine but for testing it dosen't works and throws error.
Actually the problem is when making it multi layer i had shifted the initial_state
to a middle layer which used to be specified at the end.So
when I am calling it during testing, it is throwing errors.
RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("input_64:0", shape=(?, ?, 150), dtype=float32) at layer "input_64".The following previous layers were accessed without issue: []
How should I pass the initial_state=decoder_states_input
which is for the input layer so that it doesn't throws error.
How should I pass the initial_state=decoder_states_input
in the end layer for for the first Input layer??
EDIT:-
In that code I have tried to make multiple layers of decoder LSTM. But that's giving error. When working with single layer.The correct codes are:-
Encoder(Training):-
encoder_input=Input(shape=(None,vec_dimension))
encoder_lstm =LSTM(vec_dimension,return_state=True)(encoder_input)
encoder_output,encoder_h,encoder_c=encoder_lstm
Decoder(Training):-
encoder_state=[encoder_h,encoder_c]
decoder_input=Input(shape=(None,vec_dimension))
decoder_lstm= LSTM(vec_dimension, return_state=True, return_sequences=True)
decoder_output,_,_=decoder_lstm(decoder_input,initial_state=encoder_state)
Decoder(Testing)
decoder_output,decoder_state_h,decoder_state_c=decoder_lstm( decoder_input, initial_state=decoder_states_input)
decoder_states=[decoder_state_h,decoder_state_c]
decoder_output,decoder_state_h,decoder_state_c=decoder_lstm (decoder_input,initial_state=decoder_states_input)
decoder_model=Model(inputs=[decoder_input]+decoder_states_input,outputs=[decoder_output]+decoder_states)
EDIT - Updated to use the functional API model in Keras vs. the RNN
And here is the inference setup:
Lastly, if you are following the Keras seq2seq example, you will have to change the prediction script as there are multiple hidden states that need to be managed vs. just two of them in the single-layer example. There will be 2x the number of layer hidden states
I've generalized Jeremy Wortz's awesome answer to create the model from a list, 'latent_dims', which will be 'len(latent_dims)' deep, as opposed to a fixed 2-deep.
Starting with the 'latent_dims' declaration:
Creating the model for training:
For inference it's as follows:
And finally a few modifications to Jeremy Wortz's 'decode_sequence' function are implemented to get the following: