I have an n-dimensional array as shown below:
np.array([[0,3],[0,3],[0,10]])
In this array, each element gives the low and high values of a range. For example, [0,3] refers to [0,1,2,3].
I need to generate a combination of all values using the ranges given as above.
For example, I want [0,0,0], [0,0,1] ... [0,1,0] ... [3,3,10]
I have tried the following to get what I want:
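from functools import reduce  # reduce is not a builtin in Python 3
import numpy as np

ds = np.array([[0,3],[0,3],[0,10]])
# Total number of combinations: product of (high - low + 1) over all ranges
nItems = int(reduce(lambda a,b: a * (b[1] - b[0] + 1), ds, 1))
myCombinations = np.zeros((nItems,))
nArrays = []
for x in range(ds.shape[0]):
    low = ds[x][0]
    high = ds[x][1]
    nitm = high - low + 1
    # Expand the inclusive range [low, high] into an explicit list
    ar = [i + low for i in range(nitm)]
    nArrays.append(ar)
# cartesian() is the helper from the question linked below; it is not part of NumPy
myCombinations = cartesian(nArrays)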
The cartesian function was taken from "Using numpy to build an array of all combinations of two arrays".
I need to do this a few million times.
My question: is there a better or more efficient way to do this?
I think what you're looking for is np.mgrid. Unfortunately, this returns the array in a format that's different from what you need, so you'll need to do a little post-processing:
a = np.mgrid[0:4, 0:4, 0:11] # All points in a 3D grid within the given ranges
a = np.rollaxis(a, 0, 4) # Make the 0th axis into the last axis
a = a.reshape((4 * 4 * 11, 3)) # Now you can safely reshape while preserving order
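The three ranges above are hard-coded. Here is a minimal sketch of the same idea for an arbitrary array of inclusive [low, high] pairs. The name grid_points is hypothetical (it is not a NumPy function), and the sketch assumes integer bounds:

```python
import numpy as np

def grid_points(ranges):
    """Return every integer point inside inclusive [low, high] ranges.

    `grid_points` is a hypothetical helper name, not part of NumPy.
    """
    ranges = np.asarray(ranges)
    # mgrid slices exclude the stop value, so add 1 to each high bound.
    slices = tuple(slice(int(lo), int(hi) + 1) for lo, hi in ranges)
    grid = np.mgrid[slices]  # shape: (ndim, n0, n1, ...)
    # Flatten each coordinate grid, then transpose so each row is one point.
    return grid.reshape(len(slices), -1).T

ds = np.array([[0, 3], [0, 3], [0, 10]])
points = grid_points(ds)
print(points.shape)
print(points[0], points[-1])
```

Reshaping to (ndim, -1) and transposing is an alternative to rolling the axis first; both should leave the last range varying fastest.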
Explanation
np.mgrid gives you a set of grid points in N-dimensional space. Let me try to show this with a smaller example, to make things clearer:
>>> a = np.mgrid[0:2, 0:2]
>>> a
array([[[0, 0],
        [1, 1]],

       [[0, 1],
        [0, 1]]])
Since I've given two sets of ranges, 0:2, 0:2, I get a 2D grid. What mgrid returns is the x-values and the y-values corresponding to the grid points (0, 0), (0, 1), (1, 0) and (1, 1) in 2D space. a[0] tells you what the x-values of the four points are, and a[1] tells you what the y-values are.
But what you really want is the list of actual grid points that I've written out, not the x- and y-values of those points separately. The first instinct is to just reshape the array as desired:
>>> a.reshape((4, 2))
array([[0, 0],
       [1, 1],
       [0, 1],
       [0, 1]])
But clearly this doesn't work, because it effectively reshapes the flattened array (the array obtained by just reading all elements in order), and that's not what you want.
What you want to do is to look down the third dimension of a, and create an array:
[ [a[0][0, 0], a[1][0, 0]],
  [a[0][0, 1], a[1][0, 1]],
  [a[0][1, 0], a[1][1, 0]],
  [a[0][1, 1], a[1][1, 1]] ]
which reads "First tell me the first point (x1, y1), then the second point (x2, y2), ..." and so on. Perhaps this is better explained with a figure, of sorts. This is what a looks like:
                 you want to read
                 in this direction
                (0, 0)   (0, 1)
                   |        |
                   |        |
                   v        v
          /        0--------0            +----> axis0
 x-values |       /|       /|           /|
          |      / |      / |    axis1 / |
          \     1--------1  |         L  |
                |  |     |  |            v
          /     |  0-----|--1          axis2
 y-values |     | /      | /
          |     |/       |/
          \     0--------1
                |        |
                |        |
                v        v
             (1, 0)   (1, 1)
np.rollaxis gives you a way to do this. np.rollaxis(a, 0, 3) in the above example says "take the 0th (or outermost) axis and make it into the last (or innermost) axis". (Note: only axes 0, 1 and 2 actually exist here, so saying "send the 0th axis to the 3rd position" is a way of telling NumPy to put the 0th axis after the last axis.) You might also want to read this.
>>> a = np.rollaxis(a, 0, 3)
>>> a
array([[[0, 0],
        [0, 1]],

       [[1, 0],
        [1, 1]]])
This is starting to look like what you want, except there's an extra array dimension. We want to merge dimensions 0 and 1 to get a single array of grid points. Now that the flattened array reads in the order you expect, you can safely reshape it to get the desired result.
>>> a = a.reshape((4, 2))
>>> a
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])
The 3D version does just the same thing, except that I couldn't make a figure for it, since it would be in 4D.
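Since the 3D case can't be drawn, one way to convince yourself is to compare the result against itertools.product. The sketch below uses np.moveaxis for the axis move, which assumes NumPy 1.11 or later; np.rollaxis(a, 0, 4) should give the same array, and the snippet checks that too.

```python
from itertools import product
import numpy as np

a = np.mgrid[0:4, 0:4, 0:11]   # shape (3, 4, 4, 11)
b = np.moveaxis(a, 0, -1)      # shape (4, 4, 11, 3): coordinate axis moved last
points = b.reshape(-1, 3)      # shape (176, 3): one grid point per row

# Reference result built in pure Python, last range varying fastest.
expected = np.array(list(product(range(4), range(4), range(11))))
matches_product = np.array_equal(points, expected)

# The rollaxis spelling from the answer should produce the same rows.
matches_rollaxis = np.array_equal(points, np.rollaxis(a, 0, 4).reshape(-1, 3))
print(matches_product, matches_rollaxis)
```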
You can use itertools.product:
In [16]: from itertools import product
In [17]: values = list(product(range(4), range(4), range(11)))
In [18]: values[:5]
Out[18]: [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 0, 4)]
In [19]: values[-5:]
Out[19]: [(3, 3, 6), (3, 3, 7), (3, 3, 8), (3, 3, 9), (3, 3, 10)]
Given the array of ranges, you can do something like the following. (I used a couple of non-zero low values to demonstrate the general case, and to cut down the size of the output. :)
In [41]: ranges = np.array([[0, 3], [1, 3], [8, 10]])
In [42]: list(product(*(range(lo, hi+1) for lo, hi in ranges)))
Out[42]:
[(0, 1, 8),
(0, 1, 9),
(0, 1, 10),
(0, 2, 8),
(0, 2, 9),
(0, 2, 10),
(0, 3, 8),
(0, 3, 9),
(0, 3, 10),
(1, 1, 8),
(1, 1, 9),
(1, 1, 10),
(1, 2, 8),
(1, 2, 9),
(1, 2, 10),
(1, 3, 8),
(1, 3, 9),
(1, 3, 10),
(2, 1, 8),
(2, 1, 9),
(2, 1, 10),
(2, 2, 8),
(2, 2, 9),
(2, 2, 10),
(2, 3, 8),
(2, 3, 9),
(2, 3, 10),
(3, 1, 8),
(3, 1, 9),
(3, 1, 10),
(3, 2, 8),
(3, 2, 9),
(3, 2, 10),
(3, 3, 8),
(3, 3, 9),
(3, 3, 10)]
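If you need the result as a NumPy array rather than a list of tuples, one option is to pass the list to np.array. A small sketch, where combos_from_ranges is a hypothetical helper name and the bounds are assumed to be integers:

```python
from itertools import product
import numpy as np

def combos_from_ranges(ranges):
    """Hypothetical helper: all combinations as a 2D integer array."""
    # Each row of `ranges` is an inclusive [low, high] pair.
    axes = [range(int(lo), int(hi) + 1) for lo, hi in ranges]
    return np.array(list(product(*axes)))

ranges = np.array([[0, 3], [1, 3], [8, 10]])
combos = combos_from_ranges(ranges)
print(combos.shape)
print(combos[0], combos[-1])
```

If the grid is large and you only need to visit each combination once, iterating over product(...) directly avoids holding every tuple in memory at the same time.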
If the low values of all the ranges are 0, you can use np.ndindex:
In [52]: values = list(np.ndindex(4, 4, 11))
In [53]: values[:5]
Out[53]: [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 0, 4)]
In [54]: values[-5:]
Out[54]: [(3, 3, 6), (3, 3, 7), (3, 3, 8), (3, 3, 9), (3, 3, 10)]
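When the low values are not all 0, a possible variation is to generate zero-based indices and add the low values afterwards. The sketch below uses np.indices instead of np.ndindex so that the result is already an array; the variable names are only illustrative.

```python
import numpy as np

ranges = np.array([[0, 3], [1, 3], [8, 10]])
lows = ranges[:, 0]
# Number of values in each inclusive range, as plain Python ints.
sizes = tuple(int(hi - lo + 1) for lo, hi in ranges)

# np.indices returns zero-based coordinates with shape (ndim, *sizes).
idx = np.indices(sizes)
# One row per point, then shift each column by that range's low value.
shifted = idx.reshape(len(sizes), -1).T + lows
print(shifted.shape)
print(shifted[0], shifted[-1])
```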