Initializing LSTM hidden state Tensorflow/Keras

Can someone explain how I can initialize the hidden state of an LSTM in TensorFlow? I am trying to build an LSTM recurrent autoencoder, so once that model is trained I want to transfer the learned hidden state of the unsupervised model to the hidden state of the supervised model. Is that even possible with the current API? This is the paper I am trying to recreate:

http://papers.nips.cc/paper/5949-semi-supervised-sequence-learning.pdf

Tags: tensorflow neural-network deep-learning keras lstm

2 Answers

Explosion°爆炸

Reply #2 · 2019-01-24 13:17

Assuming the RNN is layer 1 of the model and the hidden and cell states are NumPy arrays, you can do this:

```
from keras import backend as K

K.set_value(model.layers[1].states[0], hidden_states)
K.set_value(model.layers[1].states[1], cell_states)
```

States can also be set by direct assignment:

```
model.layers[1].states[0] = hidden_states
model.layers[1].states[1] = cell_states
```

However, when I did it this way, my state values stayed constant even after stepping the RNN.


劫难

Reply #3 · 2019-01-24 13:24

Yes - this is possible but truly cumbersome. Let's go through an example.

Defining a model:
```
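from keras.layers import LSTM, Input
from keras.models import Model

# A stateful LSTM needs a fixed batch size, so pass the full batch_shape:
# (batch_size=32, timesteps=10, features=1).
inputs = Input(batch_shape=(32, 10, 1))  # `inputs` avoids shadowing the built-in input()
lstm_layer = LSTM(10, stateful=True)(inputs)

model = Model(inputs, lstm_layer)
model.compile(optimizer="adam", loss="mse")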
```
It's important to build and compile the model first, because the initial states are reset during compilation. Moreover, you need to specify a batch_shape that includes the batch size, because in this scenario the network has to be stateful (which is done by setting stateful=True).

Now we can set the values of the initial states:

```
import numpy
import keras.backend as K

# Shape is (batch_size, units) = (32, 10), matching the model defined above.
hidden_states = K.variable(value=numpy.random.normal(size=(32, 10)))
cell_states = K.variable(value=numpy.random.normal(size=(32, 10)))

model.layers[1].states[0] = hidden_states
model.layers[1].states[1] = cell_states
```

Note that you need to provide the states as Keras variables. states[0] holds the hidden states and states[1] holds the cell states.
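As a framework-free illustration of why both tensors matter, here is a minimal sketch of a single-unit LSTM cell in plain Python. It is not Keras's implementation: the weights, the input sequence, and the helper names `lstm_step` and `run` are made up for this example. It shows that the final state depends on the initial (h, c) pair, and that passing the state returned by one call into the next call gives the same result as processing the whole sequence at once, which mirrors what a stateful layer does between batches.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell on a scalar input x.

    w maps a gate name to (input weight, recurrent weight, bias).
    Returns the new hidden state h and the new cell state c.
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    i = sigmoid(pre("i"))    # input gate
    f = sigmoid(pre("f"))    # forget gate
    o = sigmoid(pre("o"))    # output gate
    g = math.tanh(pre("g"))  # candidate cell value
    c = f * c_prev + i * g   # cell state (states[1] in the answer above)
    h = o * math.tanh(c)     # hidden state (states[0] in the answer above)
    return h, c


def run(sequence, h0, c0, w):
    """Feed a sequence through the cell, starting from the state (h0, c0)."""
    h, c = h0, c0
    for x in sequence:
        h, c = lstm_step(x, h, c, w)
    return h, c


# Hypothetical weights and inputs, chosen only for illustration.
weights = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.3, -0.2, 1.0),
    "o": (0.4, 0.2, 0.0),
    "g": (0.7, 0.3, 0.0),
}
seq = [0.1, -0.4, 0.8]

# Same inputs, different initial states.
h_zero, c_zero = run(seq, 0.0, 0.0, weights)
h_init, c_init = run(seq, 0.5, -1.0, weights)

# Carrying the state across two chunks of the same sequence.
h_mid, c_mid = run(seq[:2], 0.0, 0.0, weights)
h_chained, c_chained = run(seq[2:], h_mid, c_mid, weights)
```

Here `(h_init, c_init)` differs from `(h_zero, c_zero)` even though the inputs are identical, while `(h_chained, c_chained)` matches `(h_zero, c_zero)`.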

Hope that helps.
