I have trained the language model using TensorFlow as described in this tutorial.
For training I used the following command:
bazel-bin/tensorflow/models/rnn/ptb/ptb_word_lm --data_path=./simple-examples/data/ --model small
The training was successful, with the following output at the end:
Epoch: 13 Train Perplexity: 37.196
Epoch: 13 Valid Perplexity: 124.502
Test Perplexity: 118.624
But I am still confused about where the trained model is stored and how to use it.
The demo code probably did not include the ability to save a model; you may want to explicitly use tf.train.Saver to save and restore variables to and from checkpoints.
See the docs and examples.
It's pretty straightforward according to the docs. In the example below, I save
all the variables in the model; instead, you can choose which variable(s) to save by following the examples (see the short sketch after the code).
# ...
tf.initialize_all_variables().run()

####################################################
# Add ops to save and restore all the variables.
####################################################
saver = tf.train.Saver()

for i in range(config.max_max_epoch):
    lr_decay = config.lr_decay ** max(i - config.max_epoch, 0.0)
    m.assign_lr(session, config.learning_rate * lr_decay)

    print("Epoch: %d Learning rate: %.3f" % (i + 1, session.run(m.lr)))
    train_perplexity = run_epoch(session, m, train_data, m.train_op,
                                 verbose=True)
    print("Epoch: %d Train Perplexity: %.3f" % (i + 1, train_perplexity))
    valid_perplexity = run_epoch(session, mvalid, valid_data, tf.no_op())
    print("Epoch: %d Valid Perplexity: %.3f" % (i + 1, valid_perplexity))

    ####################################################
    # Save the variables to disk.
    ####################################################
    save_path = saver.save(session, "/tmp/model.epoch.%03d.ckpt" % (i + 1))
    print("Model saved in file: %s" % save_path)
# ...
In my case, each checkpoint file is 18.61 MB on disk (--model small).
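If you do not want to checkpoint everything (for example, to keep the files smaller), tf.train.Saver also accepts an explicit list or a name-to-variable dict. A minimal sketch with hypothetical variables; substitute the ones from your own graph:

# Hypothetical variables standing in for parts of your own graph.
embedding = tf.Variable(tf.random_uniform([10000, 200], -1.0, 1.0), name="embedding")
softmax_w = tf.Variable(tf.zeros([200, 10000]), name="softmax_w")

# Only these two variables go into the checkpoint, under the given names.
partial_saver = tf.train.Saver({"embedding": embedding, "softmax_w": softmax_w})
partial_path = partial_saver.save(session, "/tmp/model.partial.ckpt")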
Regarding how to use the model, just follow the docs to restore a checkpoint from the saved files; after that, how you use the restored model is up to you.
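For completeness, restoring is the mirror image of saving. A minimal sketch, assuming you have rebuilt the same graph as at training time and want the last checkpoint written above (epoch 13):

# Rebuild the same graph first, then restore the saved values into it.
saver = tf.train.Saver()
with tf.Session() as session:
    # restore() assigns the saved values, so no initializer is needed
    # for the restored variables.
    saver.restore(session, "/tmp/model.epoch.013.ckpt")
    # The variables now hold the trained weights; for example:
    # test_perplexity = run_epoch(session, mtest, test_data, tf.no_op())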