I am trying to build a multivariate time series prediction model. I followed this tutorial on temperature prediction: http://nbviewer.jupyter.org/github/addfor/tutorials/blob/master/machine_learning/ml16v04_forecasting_with_LSTM.ipynb
I want to extend its model to a multilayer LSTM model using the following code:
cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers,state_is_tuple=True)
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)
but I get this error:
ValueError: Dimensions must be equal, but are 256 and 142 for 'rnn/while/rnn/multi_rnn_cell/cell_0/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,256], [142,512].
When I tried this:
cell = []
for i in range(num_layers):
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
cell = tf.contrib.rnn.MultiRNNCell(cell,state_is_tuple=True)
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)
This does not raise the error, but the predictions are really bad.
I define hidden=128.
After features = tf.reshape(features, [-1, n_steps, n_input]), features has shape (?,1,14) in the single-layer case.
My data look like this: x.shape=(594,14), y.shape=(591,1).
I am confused about how to stack LSTM cells in TensorFlow. My TensorFlow version is 0.14.
This is a very interesting question. Initially, I thought that the two snippets would produce the same result (i.e., a stack of two LSTM cells).
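Code 1 is your first snippet, which passes [cell] * num_layers to MultiRNNCell.
Code 2 is your second snippet, which appends a new LSTMCell to the list for each layer.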
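However, if you print the list of cells in both cases, the results differ.
If you look closely, code 1 prints a list in which every entry is the same LSTM cell object (the entries share one identity), whereas code 2 prints a list of different LSTM cell objects.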
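The difference comes from plain Python list semantics rather than from TensorFlow itself. The sketch below illustrates it without TensorFlow; `FakeCell` is a hypothetical stand-in for an LSTM cell, used only to show object identity, and is not part of any library.

```python
class FakeCell:
    """Hypothetical stand-in for an LSTM cell; it only carries a weight slot."""
    def __init__(self, hidden):
        self.hidden = hidden
        self.kernel_shape = None  # set the first time the cell is "built"

hidden, num_layers = 128, 2

# Code 1 pattern: list multiplication repeats a reference to ONE object.
shared = [FakeCell(hidden)] * num_layers

# Code 2 pattern: every iteration constructs a NEW object.
separate = [FakeCell(hidden) for _ in range(num_layers)]

shared_is_same = shared[0] is shared[1]
separate_is_same = separate[0] is separate[1]

# Setting the weight slot through one entry of the shared list
# is visible through every entry, because they are the same object.
shared[0].kernel_shape = (142, 512)
separate[0].kernel_shape = (142, 512)

print(shared_is_same, shared[1].kernel_shape)      # True (142, 512)
print(separate_is_same, separate[1].kernel_shape)  # False None
```

In the shared list, whatever weights the first layer creates are also the weights the second layer tries to use.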
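Stacking two LSTM cells means that the output of the first cell at each time step becomes the input of the second cell.
Therefore, if you think about the big picture (the actual TensorFlow operations may be different), what it does is:
1. First, map the input to the hidden units of LSTM cell 1 (in your case, 14 to 128).
2. Second, map the hidden units of LSTM cell 1 to the hidden units of LSTM cell 2 (in your case, 128 to 128).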
Therefore, when you try to perform both of these operations with the same copy of the LSTM cell, there is an error, because the two operations need weight matrices of different dimensions.
However, if the number of hidden units equals the number of input units (in your case, that would be input 14 and hidden 14), there is no error even though the same LSTM cell is reused, because the weight matrices then have the same dimensions.
Therefore, I think your second approach is the correct one if you want to stack two LSTM cells.
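The numbers in the error message can be reproduced with some shape arithmetic. The sketch below is a simplified model, not TensorFlow code: it assumes that an LSTM cell concatenates its input with the previous hidden state and multiplies the result by one weight matrix of shape [input + hidden, 4 * hidden] (one block per gate). The helper names `lstm_kernel_shape` and `check_stack` are made up for this illustration.

```python
def lstm_kernel_shape(input_dim, hidden):
    # The cell concatenates [input, previous hidden state] and computes
    # all four gates with a single matrix multiplication.
    return (input_dim + hidden, 4 * hidden)

def check_stack(input_dim, hidden, num_layers, share_cell):
    """Return (layer, concat_width, kernel_shape, shapes_match) per layer."""
    report = []
    # A shared cell builds its weights once, for the first layer's input size.
    first_kernel = lstm_kernel_shape(input_dim, hidden)
    layer_input = input_dim
    for layer in range(num_layers):
        concat_width = layer_input + hidden
        if share_cell:
            kernel = first_kernel
        else:
            kernel = lstm_kernel_shape(layer_input, hidden)
        report.append((layer, concat_width, kernel, concat_width == kernel[0]))
        layer_input = hidden  # the next layer consumes this layer's output
    return report

shared_128 = check_stack(14, 128, 2, share_cell=True)
separate_128 = check_stack(14, 128, 2, share_cell=False)
shared_14 = check_stack(14, 14, 2, share_cell=True)

for name, rows in [("shared, hidden=128", shared_128),
                   ("separate, hidden=128", separate_128),
                   ("shared, hidden=14", shared_14)]:
    print(name, rows)
```

Under these assumptions, the shared cell with hidden=128 gives a second-layer input of width 256 against a [142, 512] weight matrix, which is the same pair of shapes reported in the ValueError. With separate cells the second layer gets its own [256, 512] matrix, and with hidden=14 both layers happen to need [28, 56].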