After doing some processing on an audio or image array, it needs to be normalized within a range before it can be written back to a file. This can be done like so:
# Normalize audio channels to between -1.0 and +1.0
audio[:,0] = audio[:,0]/abs(audio[:,0]).max()
audio[:,1] = audio[:,1]/abs(audio[:,1]).max()
# Normalize image to between 0 and 255
image = image/(image.max()/255.0)
Is there a less verbose, convenience-function way to do this? matplotlib.colors.Normalize() doesn't seem to be related.
You can also rescale using sklearn. The advantages are that you can normalize the standard deviation in addition to mean-centering the data, and that you can do this on either axis, by features or by records.
The keyword arguments axis, with_mean, and with_std are self-explanatory and are shown in their default state. The argument copy performs the operation in-place if it is set to False.
Using /= and *= allows you to eliminate an intermediate temporary array, thus saving some memory. Multiplication is less expensive than division, so multiplying the array by 255.0/image.max() is marginally faster than dividing it by image.max()/255.0.
Since we are using basic numpy methods here, I think this is about as efficient a solution in numpy as can be.
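A minimal sketch of the in-place approach, assuming float arrays named audio (samples by channels) and image as in the question; the random test data is only illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
audio = rng.uniform(-3.0, 3.0, size=(1000, 2))   # two channels, float
image = rng.uniform(0.0, 40.0, size=(8, 8))      # float image

# Normalize every audio channel to [-1.0, +1.0] in one statement.
# max(axis=0) gives one peak per column, which broadcasts across the rows.
audio /= np.abs(audio).max(axis=0)

# Normalize the image so its maximum is 255: one division to build the
# scale factor, then one in-place multiplication per element.
image *= 255.0 / image.max()
```

Both statements modify the arrays in place, so they only work as written when the arrays already have a floating-point dtype.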
A simple solution is using the scalers offered by the sklearn.preprocessing library.
The error X_rec-X will be zero, up to floating-point rounding. You can adjust the feature_range for your needs, or even use a standard scaler, sk.StandardScaler().
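As a rough sketch of the scaler round trip, assuming scikit-learn is installed and using MinMaxScaler as the scaler; the sample matrix X is made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Rows are records, columns are features (illustrative data).
X = np.array([[1.0, -2.0],
              [3.0,  0.0],
              [5.0,  6.0]])

# Scale each column into feature_range; (0, 1) is the default.
scaler = MinMaxScaler(feature_range=(0, 1))
X_scaled = scaler.fit_transform(X)

# Undo the scaling to recover the original values.
X_rec = scaler.inverse_transform(X_scaled)
```

Swapping MinMaxScaler for StandardScaler gives zero mean and unit variance per column instead of a fixed range.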
I tried following this and got an error. The numpy array I was trying to normalize was an integer array. It seems they deprecated type casting in versions > 1.10, and you have to use numpy.true_divide() to resolve that. Here img was a PIL.Image object.
You can use the "i" (as in idiv, imul, ...) version, and it doesn't look half bad:
For the other case you can write a function to normalize an n-dimensional array by columns:
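One possible version of such a function, as a sketch; the name normalize_columns and the handling of all-zero columns are assumptions, not part of the original answer:

```python
import numpy as np

def normalize_columns(arr):
    """Scale along axis 0 so each column's largest absolute value is 1."""
    arr = np.asarray(arr, dtype=float)
    peak = np.abs(arr).max(axis=0)
    # Avoid dividing by zero: all-zero columns are left unchanged.
    peak = np.where(peak == 0, 1.0, peak)
    return arr / peak

demo = normalize_columns(np.array([[1.0, -4.0],
                                   [2.0,  2.0],
                                   [0.0,  1.0]]))
```

Because the peak is taken along axis 0 and then broadcast, the same function also works for arrays with more than two dimensions.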
If the array contains both positive and negative data, I'd go with:
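A sketch of one way to do this, assuming the goal is to map the minimum to -1 and the maximum to +1 using the array's peak-to-peak range (np.ptp); the sample values are illustrative:

```python
import numpy as np

data = np.array([-3.0, 0.0, 5.0])

# Shift so the minimum is 0, divide by the range (max - min), then
# stretch and shift [0, 1] onto [-1, +1].
rescaled = 2.0 * (data - data.min()) / np.ptp(data) - 1.0
```

Unlike dividing by the largest absolute value, this uses the full [-1, +1] range but does not keep zero at zero.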
Also worth mentioning, even if it's not OP's question, is standardization:
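A minimal sketch of standardization with plain NumPy, assuming a float array named image with non-constant values:

```python
import numpy as np

image = np.array([[0.0, 50.0], [100.0, 250.0]])

# Subtract the mean and divide by the standard deviation so the result
# has zero mean and unit standard deviation.
standardized = (image - image.mean()) / image.std()
```

The result is not bounded to a fixed range, so it would still need rescaling before being written to an 8-bit image file.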