I have a big 1D array of data. I have a starts array of indexes into that data where important things happened. I want to get an array of ranges so that I get windows of length L, one for each starting point in starts. Bogus sample data:
import numpy as np
data = np.linspace(0, 10, 50)
starts = np.array([0, 10, 21])
length = 5
Instinctively, I want to do something like
data[starts:starts+length]
But really, I need to turn starts into a 2D array of range "windows." Coming from functional languages, I would think of it as a map from a list to a list of lists, like:
np.apply_along_axis(lambda i: np.arange(i,i+length), 0, starts)
But that won't work, because apply_along_axis hands the function whole 1-D slices (here, all of starts at once) rather than one scalar at a time.
You can do this:
pairs = np.vstack([starts, starts + length]).T
ranges = np.apply_along_axis(lambda p: np.arange(*p), 1, pairs)
data[ranges]
Or you can do it with a list comprehension:
data[np.array([np.arange(i,i+length) for i in starts])]
Or you can do it iteratively. (Bleh.)
Is there a concise, idiomatic way to slice into an array at certain start points like this? (Pardon the numpy newbie-ness.)
For a NumPy-only way of doing this, you can use numpy.meshgrid(), as described here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.meshgrid.html
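A minimal sketch of the meshgrid approach, using the sample data from the question. The names offsets, bases, ranges, and windows are illustrative only:

```python
import numpy as np

data = np.linspace(0, 10, 50)
starts = np.array([0, 10, 21])
length = 5

# With the default 'xy' indexing, both outputs have shape (len(starts), length):
# each row of offsets is 0..length-1, each row of bases repeats one start index.
offsets, bases = np.meshgrid(np.arange(length), starts)

# Adding them gives one row of consecutive indexes per start.
ranges = bases + offsets

# Fancy indexing with a 2D index array yields a 2D result.
windows = data[ranges]
```

Each row of windows holds the same values as data[s:s+length] for the corresponding s in starts.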
As hpaulj pointed out in the comments, meshgrid actually isn't needed for this problem, as you can use array broadcasting (http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).
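A short sketch of the broadcasting version, again with the question's sample data. The variable names ranges and windows are illustrative:

```python
import numpy as np

data = np.linspace(0, 10, 50)
starts = np.array([0, 10, 21])
length = 5

# starts[:, None] has shape (3, 1); np.arange(length) has shape (5,).
# Broadcasting the sum produces a (3, 5) array of indexes.
ranges = starts[:, None] + np.arange(length)

# Index data with the 2D index array to pull out all windows at once.
windows = data[ranges]
```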
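This returns a 2D array with one row per entry in starts, each row holding length consecutive values from data.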
If you need to do this many times, you can use as_strided() to create a sliding-window view of data, then index that view with starts to get what you want.
It's also faster than creating the index array.
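A sketch of how the as_strided approach might look, assuming data is a contiguous 1D array. The names all_windows and windows are illustrative, and the view is marked read-only here as a precaution, since overlapping strided views are easy to corrupt by writing to them:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

data = np.linspace(0, 10, 50)
starts = np.array([0, 10, 21])
length = 5

step = data.strides[0]                 # bytes between consecutive elements
n_windows = data.size - length + 1     # number of full-length windows

# Row i of the view is data[i:i+length]; no data is copied.
all_windows = as_strided(
    data,
    shape=(n_windows, length),
    strides=(step, step),
    writeable=False,
)

# Pick the rows that begin at the requested start indexes (this makes a copy).
windows = all_windows[starts]
```

The view only needs to be built once, after which each lookup is a plain row selection. Note that as_strided does no bounds checking, so the shape must be computed carefully. On NumPy 1.20 and later, numpy.lib.stride_tricks.sliding_window_view builds the same kind of view with those checks included.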