Can someone explain how I can initialize the hidden state of an LSTM in TensorFlow? I am trying to build an LSTM recurrent auto-encoder, so after I have that model trained I want to transfer the learned hidden state of the unsupervised model to the hidden state of the supervised model. Is that even possible with the current API? This is the paper I am trying to recreate:
http://papers.nips.cc/paper/5949-semi-supervised-sequence-learning.pdf
Assuming the RNN is in layer 1 and the hidden/cell states are numpy arrays, you can do this:
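A sketch of that approach, assuming `model.layers[1]` is the LSTM layer and `hidden_states` / `cell_states` are your numpy arrays of shape `(batch_size, units)`:

```python
from keras import backend as K

# Overwrite the layer's state variables with the learned values
# (model, hidden_states and cell_states are placeholder names here)
K.set_value(model.layers[1].states[0], hidden_states)
K.set_value(model.layers[1].states[1], cell_states)
```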
States can also be set by assigning to the state attributes directly, something along these lines (same assumed layer index and array names as above):
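```python
# Direct assignment to the layer's state attributes
model.layers[1].states[0] = hidden_states
model.layers[1].states[1] = cell_states
```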
But when I did it this way, my state values stayed constant even after stepping the RNN.
Yes - this is possible but truly cumbersome. Let's go through an example.
Defining a model:
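A minimal sketch of such a model; the batch size of 32, 10 timesteps, 1 feature, and 10 LSTM units are arbitrary choices for the example:

```python
from keras.layers import LSTM, Input
from keras.models import Model

# batch_shape = (batch_size, timesteps, features); the batch size must be
# fixed because the LSTM will be stateful
inp = Input(batch_shape=(32, 10, 1))
lstm_out = LSTM(10, stateful=True)(inp)

model = Model(inp, lstm_out)
model.compile(optimizer='adam', loss='mse')
```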
It's important to build and compile the model first, because compilation resets the initial states. Moreover, you need to specify a `batch_shape` in which the `batch_size` is fixed, since in this scenario our network should be stateful (which is done by setting `stateful=True`).

Now we can set the values of the initial states:
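A sketch of that step, reusing the batch size of 32 and 10 units from the model above, with random values standing in for the learned states:

```python
import numpy as np
import keras.backend as K

# The states must be Keras variables of shape (batch_size, units)
hidden_states = K.variable(value=np.random.normal(size=(32, 10)))
cell_states = K.variable(value=np.random.normal(size=(32, 10)))

# model.layers[1] is the LSTM layer of the model defined above
model.layers[1].states[0] = hidden_states
model.layers[1].states[1] = cell_states
```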
Note that you need to provide the states as Keras variables. `states[0]` holds the hidden states and `states[1]` holds the cell states.

Hope that helps.