I am stuck trying to solve this problem:
I have a set of data points that correspond to a set of time values, e.g. values = [1,2,3,4,5,6,7,8,4] and times = [0.1,0.2,0.3,0.4]... and so on; the sample rate is 10 Hz.
I need to find the time spent between two limits. For example, if my limits are 3 and 5 inclusive, then 3, 4, 5, 4 are within my limits.
If I calculate the time as (number of points - 1) / sample rate, or as the difference between the start time and end time of the grouping, I will miss the isolated data point (the second 4).
I have proposed using an imaginary buffer of half the sampling period on either side to model the isolated events, but I am having trouble implementing this in code (Python). I need a way of iterating through the points and counting the points in each group, so that I can compute (number of points - 1) / sample rate for each group, and also a way to pick up the singular points and apply the buffer value to them.
I can't seem to find the right combination of if and while statements to do this.
Here's a function that does what you want. Runs of multiple data points that are within the specified limits are given a time value equal to the number of data points times the sampling period (i.e. the reciprocal of the sampling frequency); isolated single points are given a value of half the sampling period.
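A minimal sketch of such a function, following the rule just described. The names estimate_time, in_range, runs, lo_lim, hi_lim and values are the ones used in this answer; the exact signature, and the freq parameter in particular, are assumptions made for illustration.

```python
from itertools import groupby

def estimate_time(values, lo_lim, hi_lim, freq):
    """Estimate the time spent with lo_lim <= value <= hi_lim.

    freq is the sampling frequency in Hz (an assumed parameter name).
    """
    period = 1.0 / freq

    # True where a sample lies within the inclusive limits.
    in_range = [lo_lim <= v <= hi_lim for v in values]

    # Length of each consecutive run of in-range samples.
    runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]

    # A run of n > 1 points counts as n periods;
    # an isolated point counts as half a period.
    return sum(n * period if n > 1 else 0.5 * period for n in runs)

values = [1, 2, 3, 4, 5, 6, 7, 8, 4]
print(estimate_time(values, 3, 5, 10))
```

For the sample data, in_range holds one run of three points (3, 4, 5) and one isolated point (the second 4), so the result is three periods plus half a period, roughly 0.35 s, subject to floating-point rounding.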
To check that this code really does what you want, you can put some print statements into estimate_time() to show the contents of in_range and runs.
One thing you can do to reduce memory requirements is to convert the list comprehensions into generator expressions. A list comprehension has to create a whole new list in memory (which is deleted once it goes out of scope); a generator expression is a little slower, but it doesn't need to build a list, because results are generated as they're needed. The syntax is very similar: just replace the square brackets of the list comp with round brackets to turn it into a gen exp.
So change
in_range = [lo_lim <= v <= hi_lim for v in values]
to
in_range = (lo_lim <= v <= hi_lim for v in values)
and
runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]
to
runs = (sum(1 for _ in group) for v, group in groupby(in_range) if v)
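With both changes applied, the whole function might look like the sketch below. The name estimate_time_lazy and the freq parameter are hypothetical, chosen for illustration; the timing rule is the one described at the top of this answer (n periods for a run of n points, half a period for an isolated point).

```python
from itertools import groupby

def estimate_time_lazy(values, lo_lim, hi_lim, freq):
    """Generator-expression variant: no intermediate lists are built."""
    period = 1.0 / freq

    # Lazily yields True/False per sample; nothing is computed yet.
    in_range = (lo_lim <= v <= hi_lim for v in values)

    # Lazily yields the length of each in-range run. Each group is
    # consumed by sum() before groupby advances to the next one.
    runs = (sum(1 for _ in group) for v, group in groupby(in_range) if v)

    # Consuming runs here drives the whole pipeline in a single pass.
    return sum(n * period if n > 1 else 0.5 * period for n in runs)

# values can itself be any iterable, e.g. an iterator over a large data source.
print(estimate_time_lazy(iter([1, 2, 3, 4, 5, 6, 7, 8, 4]), 3, 5, 10))
```

One caveat: a generator can only be iterated once, so a debugging print such as print(list(in_range)) would exhaust it before groupby sees any data. For inspection, switch back to the list versions temporarily.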