I'm just starting out with TensorFlow. As I understand it, SkFlow is a...
Simplified interface for TensorFlow
And for me simple is good.
TensorFlow's GitHub has some useful starter examples using the Iris dataset included in SkFlow. This is from the first example, the Linear Classifier:
from sklearn import datasets
from tensorflow.contrib import learn
iris = datasets.load_iris()
feature_columns = learn.infer_real_valued_columns_from_input(iris.data)
This iris object has the type <class 'sklearn.datasets.base.Bunch'> and is a dict-like structure containing two arrays: the data (iris.data) and the targets (iris.target).
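For context, the rest of that Linear Classifier example proceeds roughly like this. This is only a minimal sketch from memory of the old tf.contrib.learn API (the step count is arbitrary, and it evaluates on the training data just to keep it short):

from sklearn import datasets
from tensorflow.contrib import learn

iris = datasets.load_iris()
feature_columns = learn.infer_real_valued_columns_from_input(iris.data)

# Build a linear classifier for the 3 iris classes and train it on the Bunch's arrays
classifier = learn.LinearClassifier(feature_columns=feature_columns, n_classes=3)
classifier.fit(iris.data, iris.target, steps=200)

# evaluate() returns a dict of metrics; "accuracy" is reported for classifiers
accuracy = classifier.evaluate(x=iris.data, y=iris.target)["accuracy"]
print("Accuracy: {0:f}".format(accuracy))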
This link shows how to load data from a CSV (or at least from a URL). At the top of the page it shows how to load via the method above, and then from a URL, like so:
# Load the Pima Indians Diabetes dataset from a CSV URL
import numpy as np
import urllib.request
# URL REMOVED - SO DOES NOT LIKE SHORTENED URL
# set url to the location of the Pima Indians Diabetes CSV before running
raw_data = urllib.request.urlopen(url)
# load the CSV file as a numpy matrix
dataset = np.loadtxt(raw_data, delimiter=",")
print(dataset.shape)
# separate the data (the eight feature columns) from the target attribute (the last column)
X = dataset[:,0:8]
y = dataset[:,8]
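(If the URL isn't handy, np.loadtxt takes a local path just as well; the filename below is only a placeholder for a local copy of the CSV.)

import numpy as np

# Same idea with a local copy of the file; the filename is a placeholder
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:,0:8]
y = dataset[:,8]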
I get that X is the data and y is the target. But that's not the structure of the data in the GitHub example, or in the first example of the guide.
Am I meant to turn the CSV data into a single object, as in
iris = datasets.load_iris()
Or do I work with the X and y outputs? And if so, how do I do that with the Linear Classifier example on GitHub?
I was working on the same tutorial. I used scikit-learn's train_test_split (from the cross_validation module) to break the scikit Bunch object into train/test splits, then just used those splits in the classifier.fit and classifier.evaluate calls.
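Something along these lines; a rough sketch assuming the old sklearn.cross_validation and tf.contrib.learn APIs (in newer scikit-learn the same train_test_split lives in sklearn.model_selection):

from sklearn import datasets
from sklearn.cross_validation import train_test_split
from tensorflow.contrib import learn

iris = datasets.load_iris()

# Split the Bunch's data/target arrays into train and test sets
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

feature_columns = learn.infer_real_valued_columns_from_input(x_train)
classifier = learn.LinearClassifier(feature_columns=feature_columns, n_classes=3)

# Train on the training split, evaluate on the held-out split
classifier.fit(x_train, y_train, steps=200)
accuracy = classifier.evaluate(x=x_test, y=y_test)["accuracy"]
print("Test accuracy: {0:f}".format(accuracy))

The X and y arrays from the CSV example should work the same way - just pass them to train_test_split in place of iris.data and iris.target.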