
Attention mechanism with an LSTM which takes no input

Posted 2019-08-26 17:55

Question:

I am trying to implement the architecture from the following paper: https://arxiv.org/pdf/1511.06391.pdf.

The parts I am stuck on relate to equations (3) and (7). In particular, the authors specify that this LSTM takes no input, and that the output state q* is built from the hidden state q. However, from my understanding of LSTMs, q* and q would then have to have the same dimensions. That cannot be right, since q*=[q,r], where r has the same dimension as q (from equation 3, so that the dot product is possible). So I am misunderstanding something, but I do not see what it is.
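For reference, here is the process block as I read it from the paper (equations (3) through (7), with m_i the memory vectors produced by the read block):

```latex
\begin{align*}
q_t     &= \mathrm{LSTM}(q^{*}_{t-1})                  && (3)\\
e_{i,t} &= f(m_i, q_t)                                 && (4)\\
a_{i,t} &= \frac{\exp(e_{i,t})}{\sum_j \exp(e_{j,t})}  && (5)\\
r_t     &= \sum_i a_{i,t}\, m_i                        && (6)\\
q^{*}_t &= [q_t,\ r_t]                                 && (7)
\end{align*}
```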

As a bonus, how would one write an LSTM which takes no input in TensorFlow?
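To make the bonus question concrete, here is a minimal sketch of what I have in mind, assuming tf.keras.layers.LSTMCell in eager mode and the reading that q*_{t-1} is fed to the cell as its input at each step. The sizes d, n, T, batch and the tensor m below are placeholders of mine, not values from the paper, and I am not sure this reading is what the authors intend:

```python
import tensorflow as tf

d, n, T, batch = 64, 10, 5, 2   # hypothetical sizes, not from the paper

# m: the memory vectors m_i produced by the read block, shape (batch, n, d)
m = tf.random.normal([batch, n, d])

# The hidden state q_t has dimension d; the cell's "input" at each step is
# q*_{t-1} = [q_{t-1}, r_{t-1}] of dimension 2d -- my reading of eq. (3).
cell = tf.keras.layers.LSTMCell(d)
state = [tf.zeros([batch, d]), tf.zeros([batch, d])]   # (h, c) of the cell
q_star = tf.zeros([batch, 2 * d])                      # q*_0

for _ in range(T):
    q, state = cell(q_star, state)        # eq. (3): q_t = LSTM(q*_{t-1})
    e = tf.einsum('bnd,bd->bn', m, q)     # eq. (4): dot-product scores e_{i,t}
    a = tf.nn.softmax(e, axis=1)          # eq. (5): attention weights a_{i,t}
    r = tf.einsum('bn,bnd->bd', a, m)     # eq. (6): read vector r_t
    q_star = tf.concat([q, r], axis=1)    # eq. (7): q*_t = [q_t, r_t]
```

Under this reading the dimension issue disappears, because q*_{t-1} is the input (size 2d) while q_t is the hidden state (size d), but then the LSTM does take an input, which seems to contradict the paper's wording. Is this the intended construction, or is there another way to express an input-less LSTM in TensorFlow?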

Thanks a lot for your attention!