I'm looking at the source code from this TensorFlow tutorial on building a wide-and-deep learning model: https://www.tensorflow.org/versions/r1.3/tutorials/wide_and_deep
Here is the link to the Python source code: https://github.com/tensorflow/tensorflow/blob/r1.3/tensorflow/examples/learn/wide_n_deep_tutorial.py
The goal is to train a model that predicts whether someone makes more or less than $50k a year, given the census data.
As instructed, I'm running this command to execute:
python wide_n_deep_tutorial.py --model_type=wide_n_deep
The result that I get is the following:
$ python wide_n_deep.py --model_type=wide_n_deep
Training data is downloaded to /tmp/tmp_pwqo2h8
Test data is downloaded to /tmp/tmph6jcimik
2018-01-03 05:34:12.236038: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
WARNING:tensorflow:enqueue_data was called with num_epochs and num_threads > 1. num_epochs is applied per thread, so this will produce more epochs than you probably intend. If you want to limit epochs, use one thread.
WARNING:tensorflow:enqueue_data was called with shuffle=False and num_threads > 1. This will create multiple threads, all reading the array/dataframe in order. If you want examples read in order, use one thread; if you want multiple threads, enable shuffling.
WARNING:tensorflow:Casting <dtype: 'float32'> labels to bool.
WARNING:tensorflow:Casting <dtype: 'float32'> labels to bool.
model directory = /tmp/tmp_ab6cfsf
accuracy: 0.808673
accuracy_baseline: 0.763774
auc: 0.841373
auc_precision_recall: 0.66043
average_loss: 0.418642
global_step: 2000
label/mean: 0.236226
loss: 41.8154
prediction/mean: 0.251593
The various articles I've seen online talk about loading a .ckpt file. When I look in my model directory I see these files:
$ ls /tmp/tmp_ab6cfsf
checkpoint eval events.out.tfevents.1514957651.ml-1 graph.pbtxt model.ckpt-1.data-00000-of-00001 model.ckpt-1.index model.ckpt-1.meta model.ckpt-2000.data-00000-of-00001 model.ckpt-2000.index model.ckpt-2000.meta
I'm guessing the one I would use is model.ckpt-1.meta. Is that correct?
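For reference, the numeric suffix in each checkpoint name is the global step at which it was saved, so in the listing above model.ckpt-2000 is the later one (it matches global_step: 2000 in the evaluation output). TensorFlow offers tf.train.latest_checkpoint(model_dir) to resolve this. The sketch below only mimics that lookup with the standard library so the naming scheme is visible; the helper name latest_checkpoint_prefix is hypothetical and not part of TensorFlow.

```python
import re

# Matches index files such as "model.ckpt-2000.index" and captures
# the checkpoint prefix and its global step.
_CKPT_RE = re.compile(r"^(model\.ckpt-(\d+))\.index$")


def latest_checkpoint_prefix(filenames):
    """Return the checkpoint prefix with the highest global step, or None.

    Hypothetical helper for illustration only; it inspects file names and
    does not read the 'checkpoint' state file the way TensorFlow does.
    """
    best_step, best_prefix = -1, None
    for name in filenames:
        match = _CKPT_RE.match(name)
        if match and int(match.group(2)) > best_step:
            best_step, best_prefix = int(match.group(2)), match.group(1)
    return best_prefix


# File names copied from the directory listing above.
files = [
    "checkpoint", "eval", "graph.pbtxt",
    "model.ckpt-1.data-00000-of-00001", "model.ckpt-1.index", "model.ckpt-1.meta",
    "model.ckpt-2000.data-00000-of-00001", "model.ckpt-2000.index", "model.ckpt-2000.meta",
]
print(latest_checkpoint_prefix(files))  # model.ckpt-2000
```

A checkpoint is addressed by its prefix (model.ckpt-2000), not by a single file: the .meta file stores the graph definition, while the .index and .data files store the variable values.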
I'm also confused about how to use this model and feed it data. I've looked at this article on TensorFlow's website: https://www.tensorflow.org/versions/r1.3/programmers_guide/saved_model
It says "Note that Estimators automatically saves and restores variables (in the model_dir)", but I'm not sure what that means in this context.
How can I generate input in the format of the census data, minus the salary, since that is what we are supposed to be predicting? It's not obvious to me how to combine the two TensorFlow articles to make predictions with the trained model.
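One way to produce label-free input in the census layout is to take existing rows and drop the salary column. The sketch below is a minimal, standard-library-only illustration: the column order is assumed to match CSV_COLUMNS in the r1.3 tutorial script, the sample row is illustrative rather than taken from the downloaded files, and the helper names strip_labels and to_feature_columns are hypothetical.

```python
import csv
import io

# Column order assumed to match CSV_COLUMNS in wide_n_deep_tutorial.py (r1.3).
CSV_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "gender",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income_bracket",
]
LABEL_COLUMN = "income_bracket"


def strip_labels(labeled_csv_text, max_rows=3):
    """Return CSV text holding the first max_rows rows without the label."""
    reader = csv.reader(io.StringIO(labeled_csv_text), skipinitialspace=True)
    label_index = CSV_COLUMNS.index(LABEL_COLUMN)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    kept = 0
    for row in reader:
        if len(row) != len(CSV_COLUMNS):
            continue  # skip blank or malformed lines
        writer.writerow(row[:label_index] + row[label_index + 1:])
        kept += 1
        if kept == max_rows:
            break
    return out.getvalue()


def to_feature_columns(unlabeled_csv_text):
    """Group label-free rows by column name; values are kept as strings."""
    feature_names = [c for c in CSV_COLUMNS if c != LABEL_COLUMN]
    rows = list(csv.reader(io.StringIO(unlabeled_csv_text)))
    return {name: [row[i] for row in rows]
            for i, name in enumerate(feature_names)}


# Illustrative row in the census layout (not copied from the downloaded data).
sample = ("39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
          "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n")
unlabeled = strip_labels(sample)
features = to_feature_columns(unlabeled)
print(features["age"], features["native_country"])
```

In TensorFlow 1.3 such a file could then be read with pandas and passed to tf.estimator.inputs.pandas_input_fn with y=None, and the resulting input function handed to the estimator's predict method. The numeric columns would need converting from strings to numbers first, which this sketch leaves out.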
You can look at the official blog posts (part 1 and part 3) from the TensorFlow team, which explain well how to use an estimator.
In particular, they explain how to make predictions using a custom input. This uses the built-in predict method of Estimators.
For your example, we can create a predict input function using an additional CSV file. Let's suppose we have a CSV file called "predict.csv" containing three examples (for instance, the first three lines of "test.csv" without the labels). This would give:
predict.csv: