I have a numpy array where each cell of a specific row represents a value for a feature. I store all of them in a 100*4 matrix.
    A       B     C
    1000    10    0.5
    765     5     0.35
    800     7     0.09
Any idea how I can normalize the rows of this numpy array so that each value is between 0 and 1?
My desired output is:
    A       B     C
    1       1     1
    0.765   0.5   0.7
    0.8     0.7   0.18 (which is 0.09/0.5)
Thanks in advance :)
If I understand correctly, what you want to do is divide by the maximum value in each column. You can do this easily using broadcasting.
Starting with your example array:
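(A minimal sketch, assuming the array is stored in a variable x; x_normed is just an illustrative name.)

    import numpy as np

    # The example array from the question
    x = np.array([[1000.0, 10.0, 0.50],
                  [ 765.0,  5.0, 0.35],
                  [ 800.0,  7.0, 0.09]])

    # Divide every column by its maximum; broadcasting stretches the
    # (ncols,) vector of column maxima across all rows.
    x_normed = x / x.max(axis=0)

    print(x_normed)
    # [[1.     1.     1.   ]
    #  [0.765  0.5    0.7  ]
    #  [0.8    0.7    0.18 ]]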
x.max(0) takes the maximum over the 0th dimension (i.e. over the rows). This gives you a vector of size (ncols,) containing the maximum value in each column. You can then divide x by this vector in order to normalize your values such that the maximum value in each column will be scaled to 1.

If x contains negative values you would need to subtract the minimum first:
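(Again a sketch, reusing x and the illustrative x_normed name from above.)

    # Shift each column so its minimum becomes 0, then divide by the column range
    # (x.ptp(0) is the per-column max - min; np.ptp(x, axis=0) is equivalent)
    x_normed = (x - x.min(0)) / x.ptp(0)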
Here, x.ptp(0) returns the "peak-to-peak" (i.e. the range, max - min) along axis 0. This normalization also guarantees that the minimum value in each column will be 0.

You can use sklearn.preprocessing: