I've created a pipeline as follows (using the Keras scikit-learn API):
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=baseline_model, nb_epoch=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
and fit it with
pipeline.fit(trainX,trainY)
If I predict with pipeline.predict(testX), I believe I get standardised predictions. How do I predict on testX so that predictedY is at the same scale as the actual (untouched) testY (i.e. NOT a standardised prediction, but the actual values)? I see there is an inverse_transform method for Pipeline, but it appears to be only for reverting a transformed X.
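Exactly. The StandardScaler() in a pipeline only maps the inputs (trainX) of pipeline.fit(trainX, trainY). The targets are passed to the final estimator unchanged, so pipeline.predict(testX) already returns values on the scale of trainY.
So, if you fit your model to approximate trainY and you need the targets to be standardized as well, you should scale trainY yourself and invert the scaling after predicting: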
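# Assumes trainY has shape (n_samples, 1); StandardScaler requires 2-D input
scalerY = StandardScaler().fit(trainY)  # fit the y scaler
pipeline.fit(trainX, scalerY.transform(trainY))  # fit the pipeline to the scaled y
# predict, then map back to the original scale (reshape because predict may return 1-D)
predictedY = scalerY.inverse_transform(pipeline.predict(testX).reshape(-1, 1))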
The inverse_transform() function maps the values back to the original scale, using the mean and standard deviation computed in StandardScaler().fit().
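To make that mapping concrete, here is a minimal sketch of what the scaler does to a target column. The values in `y` are made up for illustration; `mean_` and `scale_` are the attributes StandardScaler stores after fitting.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical target values, shaped (n_samples, 1) because
# StandardScaler expects 2-D input.
y = np.array([[10.0], [20.0], [30.0], [40.0]])

scaler_y = StandardScaler().fit(y)   # stores mean_ and scale_ (std. dev.)
y_scaled = scaler_y.transform(y)     # (y - mean_) / scale_

# inverse_transform undoes the scaling: value * scale_ + mean_
y_manual = y_scaled * scaler_y.scale_ + scaler_y.mean_
y_restored = scaler_y.inverse_transform(y_scaled)
```

Both `y_manual` and `y_restored` recover the original `y`, which is why applying inverse_transform to the model's scaled predictions puts them back in the units of testY.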
You can always fit your model without scaling Y, as you mentioned, but depending on your data this can be risky, since it may lead your model to overfit. You have to test it ;)
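If you would rather not track the y scaler by hand, scikit-learn (0.20 and later) also provides TransformedTargetRegressor, which scales the targets before fitting and inverts the scaling on predict. The sketch below is not from the original setup: the data is synthetic, and LinearRegression stands in for the KerasRegressor step so the example runs without Keras. Whether your version of the Keras wrapper can be cloned and wrapped this way is an assumption you would need to check.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption, for illustration only): y is on a large scale.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
y = 1000.0 + 50.0 * X[:, 0] - 20.0 * X[:, 1] + rng.normal(scale=0.1, size=200)
train_X, test_X = X[:150], X[150:]
train_y, test_y = y[:150], y[150:]

# X is scaled inside the pipeline; LinearRegression is a stand-in
# for the Keras model from the question.
pipeline = Pipeline([
    ("standardize", StandardScaler()),
    ("model", LinearRegression()),
])

# y is standardized before fitting, and predictions are
# inverse-transformed back to the original scale automatically.
model = TransformedTargetRegressor(regressor=pipeline,
                                   transformer=StandardScaler())
model.fit(train_X, train_y)

predicted_y = model.predict(test_X)  # same units as test_y
```

Here `predicted_y` can be compared directly with `test_y`, so no manual call to inverse_transform is needed.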