I have three 1D numpy arrays:
- A list of times at which some measurements occurred (
t
). - A list of measurements that occurred at each of the times in
t
(y
). - A (shorter) list of times for some some external changes that affected these measurements (
b
).
Here is an example:
t = np.array([0.33856697, 1.69615293, 1.70257872, 2.32510279, 2.37788203, 2.45102176, 2.87518307, 3.60941650, 3.78275907, 4.37970516, 4.56480259, 5.33306546, 6.00867792, 7.40217571, 7.46716989, 7.6791613 , 7.96938078, 8.41620336, 9.17116349, 10.87530965]) y = np.array([ 3.70209916, 6.31148802, 2.96578172, 3.90036915, 5.11728629, 2.85788050, 4.50077811, 4.05113322, 3.55551093, 7.58624384, 5.47249362, 5.00286872, 6.26664832, 7.08640263, 5.28350628, 7.71646500, 3.75513591, 5.72849991, 5.60717179, 3.99436659]) b = np.array([ 1.7, 3.9, 9.5])
The elements of b
fall between the bold and italicized elements t
, breaking it into four uneven sized segments of lengths 2, 7, 10, 1.
I would like to apply an operation to each segment of y
to get an array of size b.size + 1
. Specifically, I want to know if more than half of the values of y
within each segment fall above or below a certain bias.
I am currently using a for loop and slicing to apply my test:
bias = 5
categories = np.digitize(t, b)
result = np.empty(b.size + 1, dtype=np.bool_)
for i in range(result.size):
mask = (categories == i)
result[i] = (np.count_nonzero(y[mask] > bias) / np.count_nonzero(mask)) > 0.5
This seems extremely inefficient. Unfortunately, np.where
won't help much in this situation. Is there a way to vectorize the operation I describe here to avoid the Python for
loop?
By the way, here is a plot of y
vs t
, bias
, and the regions delimited by b
to show why the expected result
is array([False, False, True, False], dtype=bool)
:
Generated by
from matplotlib import pyplot as plt
from matplotlib.patches import Rectangle
plt.ion()
f, a = plt.subplots()
a.plot(t, y, label='y vs t')
a.hlines(5, *a.get_xlim(), label='bias')
plt.tight_layout()
a.set_xlim(0, 11)
c = np.concatenate([[0], b, [11]])
for i in range(len(c) - 1):
a.add_patch(Rectangle((c[i], 2.5), c[i+1] - c[i], 8 - 2.5, alpha=0.2, color=('red' if i % 2 else 'green'), zorder=-i-5))
a.legend()