The newest TensorFlow seq2seq API includes scheduled sampling:
https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/ScheduledEmbeddingTrainingHelper
https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/ScheduledOutputTrainingHelper
The original scheduled sampling paper can be found here: https://arxiv.org/abs/1506.03099
I read the paper, but I cannot understand the difference between ScheduledEmbeddingTrainingHelper and ScheduledOutputTrainingHelper. The documentation only says that ScheduledEmbeddingTrainingHelper is a training helper that adds scheduled sampling, while ScheduledOutputTrainingHelper is a training helper that adds scheduled sampling directly to outputs. What is the difference between these two helpers?
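For reference, the two constructors can be invoked side by side as in the minimal sketch below, written against the TF 1.x tf.contrib.seq2seq API; every shape, variable name, and the 0.25 sampling probability is made up purely for illustration.

```python
import tensorflow as tf

batch_size, max_time, embedding_dim, vocab_size = 2, 5, 8, 7  # illustrative only

lengths = tf.fill([batch_size], max_time)
embedding_matrix = tf.get_variable("embedding", [vocab_size, embedding_dim])
decoder_ids = tf.zeros([batch_size, max_time], dtype=tf.int32)
decoder_emb_inputs = tf.nn.embedding_lookup(embedding_matrix, decoder_ids)

# ScheduledEmbeddingTrainingHelper takes an embedding argument: sampled token
# ids are looked up in it before being fed back as the next decoder input.
emb_helper = tf.contrib.seq2seq.ScheduledEmbeddingTrainingHelper(
    inputs=decoder_emb_inputs,
    sequence_length=lengths,
    embedding=embedding_matrix,
    sampling_probability=0.25)

# ScheduledOutputTrainingHelper has no embedding argument: when it samples,
# the decoder's own output tensor is fed back directly as the next input.
out_helper = tf.contrib.seq2seq.ScheduledOutputTrainingHelper(
    inputs=decoder_emb_inputs,
    sequence_length=lengths,
    sampling_probability=0.25)
```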
I contacted the engineer behind this, and he responded: "This might also help you. This is for the case where you want to do scheduled sampling at each decoding step separately."
I used the TensorFlow source code for the helpers to make this example: https://github.com/tensorflow/tensorflow/blob/r1.5/tensorflow/contrib/seq2seq/python/ops/helper.py
Here's a basic example of using ScheduledEmbeddingTrainingHelper, using TensorFlow 1.3 and some higher-level tf.contrib APIs. It's a sequence-to-sequence model where the decoder's initial hidden state is the final hidden state of the encoder. It shows only how to train on a single batch (and apparently the task is "reverse this sequence"). For actual training tasks, I suggest looking at the tf.contrib.learn APIs such as learn_runner, Experiment and tf.estimator.Estimator.
For ScheduledOutputTrainingHelper, I would expect to just swap out the helper.
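A sketch of what that swap might look like (the exact snippet isn't reproduced here; targets and lengths are the placeholders from the example above, and 0.25 is again an assumed sampling probability):

```python
# Naive swap: pass the (non-embedded) target ids straight to the helper.
# As noted just below, this fails because each timestep's input is then a
# scalar per example, while the LSTM cell expects (batch_size, input_dims).
helper = tf.contrib.seq2seq.ScheduledOutputTrainingHelper(
    inputs=tf.cast(targets, tf.float32),
    sequence_length=lengths,
    sampling_probability=0.25)
```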
However, this gives an error, since the LSTM cell expects a multidimensional input per timestep (of shape (batch_size, input_dims)). I will raise an issue on GitHub to find out whether this is a bug or there is some other way to use ScheduledOutputTrainingHelper.