I am trying to reimplement an IDL function in Python:
http://star.pst.qub.ac.uk/idl/REBIN.html
which downsizes a 2D array by an integer factor by averaging.
For example:
>>> a=np.arange(24).reshape((4,6))
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
I would like to resize it to (2, 3) by taking the mean of the relevant samples. The expected output would be:
>>> b = rebin(a, (2, 3))
>>> b
array([[  3.5,   5.5,   7.5],
       [ 15.5,  17.5,  19.5]])
i.e. b[0,0] = np.mean(a[:2,:2]), b[0,1] = np.mean(a[:2,2:4])
and so on.
I believe I should reshape to a 4-dimensional array and then take the mean over the correct axes, but I could not figure out the algorithm. Do you have any hints?
I was trying to downscale a raster: take a raster of roughly 6000 by 2000 cells and turn it into an arbitrarily sized smaller raster whose values are properly averaged over the corresponding bins of the original. I found a solution using SciPy, but I couldn't get SciPy to install on the shared hosting service I was using, so I wrote this function instead. There is likely a better way to do this that doesn't involve looping through the rows and columns, but this does seem to work.
The nice part about this is that the old numbers of rows and columns don't have to be divisible by the new numbers of rows and columns.
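A minimal sketch of such a loop-based approach, using NumPy only. The function name `downscale_raster` and the choice of integer bin edges (each source cell goes to exactly one bin, with no fractional weighting) are assumptions made for illustration, not necessarily what the original function did:

```python
import numpy as np

def downscale_raster(raster, new_rows, new_cols):
    """Average a 2D array down to (new_rows, new_cols) with plain loops.

    Hypothetical helper: the sizes need not divide evenly, so bins may
    differ in size by one row or column.
    """
    raster = np.asarray(raster, dtype=float)
    old_rows, old_cols = raster.shape
    if not (0 < new_rows <= old_rows and 0 < new_cols <= old_cols):
        raise ValueError("new shape must be positive and no larger than the old shape")

    # Integer bin edges; consecutive edges always differ by at least 1
    # because the new size does not exceed the old size.
    row_edges = [i * old_rows // new_rows for i in range(new_rows + 1)]
    col_edges = [j * old_cols // new_cols for j in range(new_cols + 1)]

    out = np.empty((new_rows, new_cols))
    for i in range(new_rows):
        for j in range(new_cols):
            block = raster[row_edges[i]:row_edges[i + 1],
                           col_edges[j]:col_edges[j + 1]]
            out[i, j] = block.mean()
    return out
```

For example, `downscale_raster(np.arange(25).reshape(5, 5), 2, 2)` averages blocks of 2 and 3 rows and columns, since 5 is not divisible by 2.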
Here's an example based on the answer you've linked (for clarity):
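A minimal sketch of the reshape idea, assuming the (4, 6) array from the question and a target shape of (2, 3). The variable names are chosen here for illustration:

```python
import numpy as np

a = np.arange(24).reshape(4, 6)

# Split each axis into (number of bins, bin size):
# 4 rows -> 2 bins of 2 rows, 6 columns -> 3 bins of 2 columns.
blocks = a.reshape(2, 2, 3, 2)

# Average over the two "bin size" axes (axis 1 and axis 3).
b = blocks.mean(axis=3).mean(axis=1)
print(b)
```

After the reshape, `blocks[i, :, j, :]` is the 2x2 block of `a` that contributes to `b[i, j]`.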
As a function:
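One possible way to wrap the reshape-and-mean idea in a function, assuming a 2D input whose dimensions are exact multiples of the target shape (the divisibility check is an addition made here for safety):

```python
import numpy as np

def rebin(a, shape):
    """Downsize a 2D array to `shape` by averaging equal-sized blocks."""
    a = np.asarray(a)
    rows, cols = shape
    if a.shape[0] % rows or a.shape[1] % cols:
        raise ValueError("each old dimension must be a multiple of the new one")
    # Reshape to (rows, row_factor, cols, col_factor), then average
    # over the two factor axes.
    sh = rows, a.shape[0] // rows, cols, a.shape[1] // cols
    return a.reshape(sh).mean(axis=-1).mean(axis=1)
```

With the array from the question, `rebin(np.arange(24).reshape(4, 6), (2, 3))` gives the expected 2x3 result.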
Here's a way of doing what you ask using matrix multiplication that doesn't require the new array dimensions to divide the old.
First we generate a row compressor matrix and a column compressor matrix (I'm sure there's a cleaner way of doing this, maybe even using numpy operations alone):
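A sketch of how such compressor matrices could be built, here with NumPy operations only. This is an assumption about the construction, not necessarily the original code: each new bin is taken to span `old / new` consecutive cells, and each weight is the fraction of the bin covered by the corresponding old cell.

```python
import numpy as np

def get_row_compressor(old_dimension, new_dimension):
    """Return a (new_dimension, old_dimension) averaging matrix.

    Assumed construction: entry [i, j] is the overlap between new bin i
    and old cell j, divided by the bin width, so every row sums to 1.
    """
    edges = np.linspace(0, old_dimension, new_dimension + 1)
    lo = edges[:-1, None]                       # bin starts, shape (new, 1)
    hi = edges[1:, None]                        # bin ends, shape (new, 1)
    cells = np.arange(old_dimension)[None, :]   # cell starts, shape (1, old)
    overlap = np.minimum(hi, cells + 1) - np.maximum(lo, cells)
    overlap = np.clip(overlap, 0, None)         # no overlap -> weight 0
    return overlap / (hi - lo)

def get_column_compressor(old_dimension, new_dimension):
    """Return the (old_dimension, new_dimension) transpose, for postmultiplying."""
    return get_row_compressor(old_dimension, new_dimension).T
```

Because the weights are fractional overlaps, a cell that straddles two bins contributes to both in proportion.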
... so, for instance,
get_row_compressor(5, 3)
gives you:
and
get_column_compressor(3, 2)
gives you:
Then simply premultiply by the row compressor and postmultiply by the column compressor to get the compressed matrix:
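The multiplication step might look like the sketch below. The helper name `compress` is hypothetical, and to keep the example self-contained the two compressor matrices are built by hand with `np.kron` for the evenly divisible (4, 6) to (2, 3) case from the question:

```python
import numpy as np

def compress(array, row_compressor, column_compressor):
    """Premultiply by the row compressor, postmultiply by the column compressor."""
    return row_compressor @ array @ column_compressor

a = np.arange(24).reshape(4, 6)

# Hand-built averaging matrices for the divisible case (assumed for illustration):
# each output row averages 2 input rows, each output column 2 input columns.
row_compressor = np.kron(np.eye(2), np.full((1, 2), 0.5))      # shape (2, 4)
column_compressor = np.kron(np.eye(3), np.full((2, 1), 0.5))   # shape (6, 3)

b = compress(a, row_compressor, column_compressor)
print(b)
```

The shapes chain as (2, 4) @ (4, 6) @ (6, 3), which yields the (2, 3) result.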
Using this technique,
yields:
J.F. Sebastian has a great answer for 2D binning. Here is a version of his "rebin" function that works for N dimensions:
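A sketch of what an N-dimensional variant could look like, assuming every old dimension is an exact multiple of the corresponding new one. The name `rebin_nd` is chosen here for illustration:

```python
import numpy as np

def rebin_nd(a, new_shape):
    """Downsize an N-dimensional array to `new_shape` by block averaging."""
    a = np.asarray(a)
    if a.ndim != len(new_shape):
        raise ValueError("new_shape must have one entry per array dimension")
    pairs = []
    for old, new in zip(a.shape, new_shape):
        if old % new:
            raise ValueError("each old dimension must be a multiple of the new one")
        pairs.extend((new, old // new))
    # Reshape to (n0, f0, n1, f1, ...), then average over every
    # factor axis, i.e. the axes at odd positions.
    factor_axes = tuple(range(1, 2 * a.ndim, 2))
    return a.reshape(pairs).mean(axis=factor_axes)
```

For a 2D input this reduces to the same reshape-and-mean computation as the 2D case, for example `rebin_nd(np.arange(24).reshape(4, 6), (2, 3))`.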