I want to group my DataFrame by a specific column, apply sklearn's preprocessing MinMaxScaler to each group, and store the fitted scaler object for each group.
My current starting point:
import pandas as pd
from sklearn import preprocessing
scaler = {}
groups = df.groupby('ID')
for name, group in groups:
    scr = preprocessing.MinMaxScaler()
    scr.fit(group)
    scaler.update({name: scr})
    group = scr.transform(group)
Is this possible with df.groupby('ID').transform?
UPDATE
From my original DataFrame
pd.DataFrame(dict(ID=list('AAABBB'),
                  VL=(0, 10, 10, 100, 100, 200)))
I want to scale all columns based on ID. In this example:
A 0.0
A 1.0
A 1.0
B 0.0
B 0.0
B 1.0
while keeping the scaling information, i.e. the scaler object (initialized with fit):
preprocessing.MinMaxScaler().fit( ... )
You can do it in one direction, but you will not be able to use inverse_transform, as each call of MinMaxScaler (for each group or each ID) will overwrite the information about your original features...
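For illustration, here is a minimal sketch of both variants, using the example DataFrame from the question. It is only one possible way to do this, not necessarily what the original answer had in mind. The names one_way, value_cols, scalers, scaled and restored are made up for this example, and it assumes every column other than ID is numeric. The first part scales per group through groupby(...).transform and throws the fitted scalers away. The second part keeps one fitted MinMaxScaler per ID in a dict, so inverse_transform stays available.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame(dict(ID=list('AAABBB'), VL=(0, 10, 10, 100, 100, 200)))

# Variant 1: one direction only. A fresh scaler is fitted per group and
# discarded, so there is nothing left to call inverse_transform on.
one_way = df.groupby('ID')['VL'].transform(
    lambda s: pd.Series(
        MinMaxScaler().fit_transform(
            s.to_numpy(dtype=float).reshape(-1, 1)
        ).ravel(),
        index=s.index,
    )
)

# Variant 2: keep one fitted scaler per ID (assumption: all non-ID
# columns are numeric and should be scaled).
value_cols = [c for c in df.columns if c != 'ID']
scalers = {}
scaled = df.copy()
scaled[value_cols] = scaled[value_cols].astype(float)

for name, group in df.groupby('ID'):
    scr = MinMaxScaler().fit(group[value_cols].to_numpy(dtype=float))
    scalers[name] = scr
    # Write the scaled values back into the rows of this group.
    scaled.loc[group.index, value_cols] = scr.transform(
        group[value_cols].to_numpy(dtype=float)
    )

# Going back: use the scaler that belongs to each ID.
restored = scaled.copy()
for name, group in scaled.groupby('ID'):
    restored.loc[group.index, value_cols] = scalers[name].inverse_transform(
        group[value_cols].to_numpy(dtype=float)
    )
```

With the dict approach, scalers['A'] and scalers['B'] each remember the min and max of their own group, which is what the transform-only variant loses.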