I am using the following Pandas dataframe as the training input for an SKCompat estimator:
>>> training_data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 8709 entries, 4396 to 1889
Data columns (total 8 columns):
season 8709 non-null int64
holiday 8709 non-null int64
workingday 8709 non-null int64
weather 8709 non-null int64
temp 8709 non-null float32
atemp 8709 non-null float32
humidity 8709 non-null int64
windspeed 8709 non-null float32
At some point the TensorFlow code passes the DataFrame through the function
tensorflow.contrib.learn.python.learn.learn_io.pandas_io.extract_pandas_data.
This seems to discard the per-column dtype information and convert everything to float64:
>>> x_training = extract_pandas_data(x_training)
>>> x_training.dtype
{dtype} float64
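The upcast itself can be reproduced without TensorFlow. This is a minimal sketch, assuming extract_pandas_data ends up converting the whole frame into a single NumPy array (I have not verified this against the TensorFlow source); the small frame below is made up for illustration and is not the real training data:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of training_data: int64 and float32 columns mixed.
frame = pd.DataFrame({
    "season": np.array([1, 2, 3], dtype=np.int64),
    "temp": np.array([9.8, 14.1, 22.5], dtype=np.float32),
})

# A single NumPy array holds one dtype for all columns, and the
# common type of int64 and float32 is float64.
as_array = frame.to_numpy()
print(as_array.dtype)
```

If that assumption holds, the float32 columns are widened simply because they share the frame with int64 columns.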
Further on I then get the following exception, because the float32 values have been converted to float64:
TypeError: Input 'input_data' of 'TreePredictions' Op has type float64 that does not match expected type of float32.
I have seen a few examples of people using tf.cast to get around this issue, but I don't understand how to apply it to my use case. What do I need to do to this Pandas DataFrame to make it work with the TensorForestEstimator?
Many thanks,
Mark
Code example, with "tf.cast" fix:
def stackoverflow_example(x_training: pd.DataFrame, y_training: pd.DataFrame):
    params = tensor_forest.ForestHParams(
        num_classes=1, num_features=5,
        num_trees=10, max_nodes=1000)
    graph_builder_class = tensor_forest.TrainingLossForest
    est = estimator.SKCompat(random_forest.TensorForestEstimator(
        params, graph_builder_class=graph_builder_class))
    x_training = tf.cast(x_training.drop('datetime', 1), tf.float32)
    est.fit(x_training, y_training, batch_size=1000)
With the cast in place, this code raises the following exception:
ValueError: Inputs cannot be tensors. Please provide input_fn.
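One thing that could be tried instead of tf.cast is to do the conversion on the NumPy side, so that est.fit receives a float32 array rather than a tensor or a DataFrame. This is only a sketch, assuming SKCompat.fit accepts NumPy arrays and does not upcast them again (not verified here); to_float32_array is a hypothetical helper name and the sample frame is made up:

```python
import numpy as np
import pandas as pd


def to_float32_array(frame: pd.DataFrame) -> np.ndarray:
    """Return the frame's values as one float32 NumPy array."""
    return frame.to_numpy(dtype=np.float32)


# Made-up sample with the same kinds of columns as the real data.
sample = pd.DataFrame({
    "datetime": pd.to_datetime(["2011-01-01", "2011-01-02"]),
    "season": np.array([1, 2], dtype=np.int64),
    "temp": np.array([9.8, 14.1], dtype=np.float32),
})

# Drop the non-numeric column first, then convert everything to float32.
features = to_float32_array(sample.drop(columns="datetime"))
print(features.dtype, features.shape)
```

In the function above, the result of to_float32_array would take the place of the tf.cast call before est.fit is invoked. Whether y_training needs the same treatment is not clear from the error message.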