TensorFlow r0.12's documentation for tf.nn.rnn_cell.LSTMCell gives this signature for calling the cell:
tf.nn.rnn_cell.LSTMCell.__call__(inputs, state, scope=None)
where state is as follows:
state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.
What are c_state and m_state, and how do they fit into LSTMs? I cannot find a reference to them anywhere in the documentation.
https://github.com/tensorflow/tensorflow/blob/r1.2/tensorflow/python/ops/rnn_cell_impl.py
Lines 308-314:

class LSTMStateTuple(_LSTMStateTuple):
  """Tuple used by LSTM Cells for state_size, zero_state, and output state.

  Stores two elements: (c, h), in that order.

  Only used when state_is_tuple=True.
  """

I stumbled upon the same question; here's how I understand it. Minimalistic LSTM example:
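A minimal sketch of such a cell in plain Python (not TensorFlow) is below. ToyLSTMCell, StateTuple, and the random weights are made up for illustration; the real LSTMCell also handles batching, biases, peepholes, and projections, none of which appear here.

```python
import math
import random
from collections import namedtuple

# Hypothetical stand-in for TensorFlow's LSTMStateTuple: (c, h), in that order.
StateTuple = namedtuple("StateTuple", ["c", "h"])


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ToyLSTMCell:
    """Toy LSTM cell for a single example (no batch dimension, no biases)."""

    def __init__(self, input_size, num_units, state_is_tuple=True, seed=0):
        rng = random.Random(seed)
        self.num_units = num_units
        self.state_is_tuple = state_is_tuple
        # One weight row per gate unit; the row order i, j, f, o is a choice
        # made for this sketch.
        cols = input_size + num_units
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                        for _ in range(4 * num_units)]

    def zero_state(self):
        zeros = [0.0] * self.num_units
        if self.state_is_tuple:
            return StateTuple(c=list(zeros), h=list(zeros))
        return zeros + zeros  # concatenated [c, h]

    def __call__(self, inputs, state):
        n = self.num_units
        if self.state_is_tuple:
            c_prev, h_prev = state
        else:
            c_prev, h_prev = state[:n], state[n:]
        # Gate pre-activations are computed from [inputs, h_prev].
        x = list(inputs) + list(h_prev)
        pre = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        i, j, f, o = (pre[k * n:(k + 1) * n] for k in range(4))
        # New cell state c and new hidden state h.
        c = [sigmoid(f[k]) * c_prev[k] + sigmoid(i[k]) * math.tanh(j[k])
             for k in range(n)]
        h = [sigmoid(o[k]) * math.tanh(c[k]) for k in range(n)]
        new_state = StateTuple(c, h) if self.state_is_tuple else c + h
        return h, new_state


cell = ToyLSTMCell(input_size=3, num_units=2, state_is_tuple=True)
output, new_state = cell([1.0, 2.0, 3.0], cell.zero_state())
print(output)
print(new_state)
```

The cell returns its output together with the new state, and the output is the same vector as the h part of that state.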
Notice that state_is_tuple=True, so the state passed to this cell needs to be in tuple form. c_state and m_state are probably "Cell State" and "Memory State", though I honestly am NOT sure, as these terms are only mentioned in the docs. In the code and in papers about LSTMs, the letters h and c are commonly used to denote "output value" and "cell state". http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Those tensors represent the combined internal state of the cell and should be passed together. The old way was to simply concatenate them into a single tensor (state_is_tuple=False); the new way is to pass them as a tuple (state_is_tuple=True).
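As a rough illustration of the two layouts, here is a plain-Python sketch using lists instead of tensors and ignoring the batch dimension. The helper names to_tuple_state and to_flat_state and the constant LSTM_CELL_SIZE are hypothetical, not TensorFlow functions.

```python
LSTM_CELL_SIZE = 2  # hypothetical number of units in the cell


def to_tuple_state(flat_state, num_units):
    """Split a concatenated [c, h] state into a (c, h) pair."""
    if len(flat_state) != 2 * num_units:
        raise ValueError("expected a state of length 2 * num_units")
    return flat_state[:num_units], flat_state[num_units:]


def to_flat_state(tuple_state):
    """Concatenate a (c, h) pair back into a single [c, h] list."""
    c, h = tuple_state
    return list(c) + list(h)


# Old way: one vector of length 4 holding c followed by h.
old_state = [0, 0, 0, 0]

# New way: two vectors of length 2, (c, h).
new_state = to_tuple_state(old_state, LSTM_CELL_SIZE)
print(new_state)  # ([0, 0], [0, 0])
```

Both forms carry the same numbers; only the packaging differs.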
So basically, all we did was change state from one tensor of length 4 into two tensors of length 2. The content remained the same: [0,0,0,0] becomes ([0,0],[0,0]). (This is supposed to make it faster.)

I agree that the documentation is unclear. Looking at the source of tf.nn.rnn_cell.LSTMCell.__call__ clarifies things (I took the code from TensorFlow 1.0.0). The key lines are the ones that compute c, compute m, and build new_state.
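For reference, the standard LSTM update without peephole connections is usually written as below. The symbols W, b, and the concatenation [h_{t-1}, x_t] are generic textbook notation, not names from the TensorFlow source. As far as I can tell, TensorFlow's cell also adds a forget_bias inside the forget-gate sigmoid, which is omitted here.

```latex
\begin{aligned}
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) \\
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Here c_t is the cell state (c in the code) and h_t is the hidden state (m in the code).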
If you compare the code that computes c and m with the LSTM equations, you can see that they correspond to the cell state (typically denoted c) and the hidden state (typically denoted h), respectively. The expression LSTMStateTuple(c, m) in the assignment to new_state indicates that the first element of the returned state tuple is c (the cell state, a.k.a. c_state) and the second element is m (the hidden state, a.k.a. m_state).

Maybe this excerpt from the code will help