可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Here is what I have tried:

tf.reset_default_graph()
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None,n_outputs])
layers = [tf.contrib.rnn.LSTMCell(num_units=n_neurons, 
                                 activation=tf.nn.leaky_relu, use_peepholes = True)
         for layer in range(n_layers)]
multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
rnn_outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, X, dtype=tf.float32)
tf.summary.histogram("outputs", rnn_outputs)
tf.summary.image("RNN",rnn_outputs)

I am getting the following error:

InvalidArgumentError: Tensor must be 4-D with last dim 1, 3, or 4, not [55413,4,100]
     [[Node: RNN_1 = ImageSummary[T=DT_FLOAT, bad_color=Tensor<type: uint8 shape: [4] values: 255 0 0...>, max_images=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](RNN_1/tag, rnn/transpose_1)]]

Kindly, help me get the visualization of the rnn inside the LSTM model that I am trying to run. This will help in understanding what LSTM is doing more accurately.

回答1:

You can plot each RNN output as an image with one axis being the time and the other axis being the output. Here is an small example:

import tensorflow as tf
import numpy as np

n_steps = 100
n_inputs = 10
n_neurons = 10
n_layers = 3

x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
layers = [tf.contrib.rnn.LSTMCell(num_units=n_neurons,
                                  activation=tf.nn.leaky_relu, use_peepholes=True)
         for layer in range(n_layers)]
multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
rnn_outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, x, dtype=tf.float32)
# Time steps in horizontal axis, outputs in vertical axis, add last dimension for channel
rnn_out_imgs = tf.transpose(rnn_outputs, (0, 2, 1))[..., tf.newaxis]
out_img_sum = tf.summary.image("RNN", rnn_out_imgs, max_outputs=10)
init_op = tf.global_variables_initializer()
with tf.Session() as sess, tf.summary.FileWriter('log') as fw:
    sess.run(init_op)
    fw.add_summary(sess.run(out_img_sum, feed_dict={x: np.random.rand(10, n_steps, n_inputs)}))

You would get a visualization that could look like this:

Here the brighter pixels would represent a stronger activation, so even if it is hard to tell what exactly is causing what you can at least see if any meaningful patterns arise.

回答2:

Your RNN output has the wrong shape for tf.summary.image. The tensor should be four-dimensional with the dimensions' sizes given by [batch_size, height, width, channels].

In your code, you're calling tf.summary.image with rnn_outputs, which has shape [55413, 4, 100]. Assuming your images are 55413-by-100 pixels in size and that each pixel contains 4 channels (RGBA), I'd use tf.reshape to reshape rnn_outputs to [1, 55413, 100, 4]. Then you should be able to call tf.summary.image without error.

I don't think I can help you visualize the RNN's operation, but when I was learning about RNNs and LSTMs, I found this article very helpful.