I was trying the following code and found that StandardScaler (or MinMaxScaler) and Normalizer from sklearn handle data very differently: Normalizer rescales each row (sample), while StandardScaler and MinMaxScaler rescale each column (feature). This discrepancy makes pipeline construction more difficult. I was wondering whether this design is intentional.
from sklearn.preprocessing import StandardScaler, Normalizer, MinMaxScaler
For Normalizer, the data is read "horizontally".
Normalizer(norm='max').fit_transform([[1., 1., 2., 10],
                                      [2., 0., 0., 100],
                                      [0., -1., -1., 1000]])
# array([[ 0.1  ,  0.1  ,  0.2  ,  1.   ],
#        [ 0.02 ,  0.   ,  0.   ,  1.   ],
#        [ 0.   , -0.001, -0.001,  1.   ]])
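To make the "horizontal" reading concrete, here is a minimal sketch that reproduces the result with plain NumPy. It assumes that norm='max' divides each row by its largest absolute value, which is consistent with the output above; the names `row_max`, `manual`, and `sk` are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[1., 1., 2., 10.],
              [2., 0., 0., 100.],
              [0., -1., -1., 1000.]])

# Assumption: norm='max' scales each sample (row) by its largest absolute value,
# so the reduction runs along axis=1.
row_max = np.abs(X).max(axis=1, keepdims=True)
manual = X / row_max

sk = Normalizer(norm='max').fit_transform(X)
print(np.allclose(manual, sk))
```

Because each row is scaled using only its own values, nothing is learned from the other rows during fit.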
For StandardScaler and MinMaxScaler, the data is read "vertically".
StandardScaler().fit_transform([[1., 1., 2., 10],
                                [2., 0., 0., 100],
                                [0., -1., -1., 1000]])
# array([[ 0.        ,  1.22474487,  1.33630621, -0.80538727],
#        [ 1.22474487,  0.        , -0.26726124, -0.60404045],
#        [-1.22474487, -1.22474487, -1.06904497,  1.40942772]])
MinMaxScaler().fit_transform([[1., 1., 2., 10],
                              [2., 0., 0., 100],
                              [0., -1., -1., 1000]])
# array([[0.5       , 1.        , 1.        , 0.        ],
#        [1.        , 0.5       , 0.33333333, 0.09090909],
#        [0.        , 0.        , 0.        , 1.        ]])
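The "vertical" reading can be checked the same way. The sketch below assumes default settings (StandardScaler dividing by the population standard deviation, MinMaxScaler with feature_range=(0, 1)) and reduces along axis=0, i.e. per column; the `manual_*` names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.array([[1., 1., 2., 10.],
              [2., 0., 0., 100.],
              [0., -1., -1., 1000.]])

# Per-feature (column) statistics: reduce over axis=0.
# np.std uses ddof=0 by default, i.e. the population standard deviation.
manual_std = (X - X.mean(axis=0)) / X.std(axis=0)

col_min, col_max = X.min(axis=0), X.max(axis=0)
manual_minmax = (X - col_min) / (col_max - col_min)

print(np.allclose(manual_std, StandardScaler().fit_transform(X)))
print(np.allclose(manual_minmax, MinMaxScaler().fit_transform(X)))
```

Here the column statistics are computed from all rows during fit and then reused in transform, so the result for one row depends on the rest of the data.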