
Attention mechanism with an LSTM which takes no input

Posted 2019-08-26 17:55

Question:

I am trying to implement the architecture from the following paper: https://arxiv.org/pdf/1511.06391.pdf.

The parts I am stuck on relate to equations (3) and (7). In particular, the authors specify that this LSTM takes no input, and that the output state q* is built from the hidden state q. However, from my understanding of LSTMs, q* and q would then have to have the same dimensions. That cannot be right, since q*=[q,r], where r has the same dimension as q (from equation 3, so that the dot product is possible). So I am misunderstanding something, but I do not see what it is.
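For reference, here is the process block as I read it from the paper (equations (3) through (7), with m_i the memory vectors produced by the read block):

```latex
\begin{align*}
q_t     &= \mathrm{LSTM}(q^{*}_{t-1})                  && (3)\\
e_{i,t} &= f(m_i, q_t)                                 && (4)\\
a_{i,t} &= \frac{\exp(e_{i,t})}{\sum_j \exp(e_{j,t})}  && (5)\\
r_t     &= \sum_i a_{i,t}\, m_i                        && (6)\\
q^{*}_t &= [q_t,\ r_t]                                 && (7)
\end{align*}
```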

As a bonus, how would one write an LSTM which takes no input in TensorFlow?
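To make the bonus question concrete, here is a minimal sketch of what I have in mind, assuming tf.keras.layers.LSTMCell in eager mode and the reading that q*_{t-1} is fed to the cell as its input at each step. The sizes d, n, T, batch and the tensor m below are placeholders of mine, not values from the paper, and I am not sure this reading is what the authors intend:

```python
import tensorflow as tf

d, n, T, batch = 64, 10, 5, 2   # hypothetical sizes, not from the paper

# m: the memory vectors m_i produced by the read block, shape (batch, n, d)
m = tf.random.normal([batch, n, d])

# The hidden state q_t has dimension d; the cell's "input" at each step is
# q*_{t-1} = [q_{t-1}, r_{t-1}] of dimension 2d -- my reading of eq. (3).
cell = tf.keras.layers.LSTMCell(d)
state = [tf.zeros([batch, d]), tf.zeros([batch, d])]   # (h, c) of the cell
q_star = tf.zeros([batch, 2 * d])                      # q*_0

for _ in range(T):
    q, state = cell(q_star, state)        # eq. (3): q_t = LSTM(q*_{t-1})
    e = tf.einsum('bnd,bd->bn', m, q)     # eq. (4): dot-product scores e_{i,t}
    a = tf.nn.softmax(e, axis=1)          # eq. (5): attention weights a_{i,t}
    r = tf.einsum('bn,bnd->bd', a, m)     # eq. (6): read vector r_t
    q_star = tf.concat([q, r], axis=1)    # eq. (7): q*_t = [q_t, r_t]
```

Under this reading the dimension issue disappears, because q*_{t-1} is the input (size 2d) while q_t is the hidden state (size d), but then the LSTM does take an input, which seems to contradict the paper's wording. Is this the intended construction, or is there another way to express an input-less LSTM in TensorFlow?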

Thanks a lot for your attention!