Counting data points within limits, and applying b

Posted 2019-08-08 10:45


I am stuck trying to solve this problem:

I have a set of data points that correspond to a set of time values, e.g. values = [1,2,3,4,5,6,7,8,4] and times = [0.1,0.2,0.3,0.4]... and so on, at a sample rate of 10 Hz.

I need to find the time spent between two limits. For example, if my limits are 3 and 5 inclusive, then 3, 4, 5, 4 are within my limits.

If I calculate the time as (number of points - 1) / sample rate, or as the difference between the start time and end time of each grouping, I will miss the isolated data point (the second 4).

I have proposed using an imaginary buffer of half the sample period on either side to model the isolated events.

But I am having trouble implementing this in code (Python). I need a way of iterating through the points and counting the points in each group, so that I can compute (number of points - 1) / sample rate, and also a way to pick up the singular points and apply the buffer value to them.

I can't seem to find the right combination of if and while statements to do this.


Here's a function that does what you want. Runs of multiple data points that are within the specified limits are given a time value equal to the number of data points times the sampling period (i.e. the reciprocal of the sampling frequency); isolated single points are given a value of half the sampling period.

#!/usr/bin/env python

''' Estimate time of data points falling within specified limits
    Written 2015.04.03 by PM 2Ring,
    with help from Antti Haapala and Martijn Pieters
'''

from itertools import groupby

def estimate_time(values, lo_lim, hi_lim, sample_rate):
    #Find values that are in range
    in_range = [lo_lim <= v <= hi_lim for v in values]

    #Find runs of in-range values
    runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]

    #Estimate total time spent in-range
    total_time = sum(v if v > 1 else 0.5 for v in runs)
    return total_time / sample_rate

values = [1, 2, 3, 4, 5, 6, 7, 8, 4]
sample_rate = 10.0  # in Hz

lo_lim = 3
hi_lim = 5

print(estimate_time(values, lo_lim, hi_lim, sample_rate))



To check that this code really does what you want you can put some print statements into estimate_time() to show the contents of in_range and runs.
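
As a rough illustration of that check, the sketch below repeats the two intermediate steps for the sample data and prints them. The helper name show_steps is made up for this example and isn't part of the function above.

```python
from itertools import groupby

def show_steps(values, lo_lim, hi_lim):
    # Hypothetical helper: rebuilds the two intermediate lists
    # that estimate_time() creates internally.
    in_range = [lo_lim <= v <= hi_lim for v in values]
    runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]
    return in_range, runs

in_range, runs = show_steps([1, 2, 3, 4, 5, 6, 7, 8, 4], 3, 5)
print(in_range)  # [False, False, True, True, True, False, False, False, True]
print(runs)      # [3, 1]
```

So the run of three points counts as 3 sampling periods and the lone 4 counts as half a period, giving (3 + 0.5) / 10 = 0.35 seconds.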

One thing you can do to reduce memory requirements is to convert the list comprehensions into generator expressions. List comprehensions have to create a whole new list in memory (which is deleted once it goes out of scope); a generator expression is a little slower, but it doesn't need to build a list - results are generated as they're needed. The syntax is very similar - just replace the square brackets of the list comp with round brackets to turn it into a gen exp.
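
To make the difference concrete, here's a small sketch comparing the two forms on made-up data (the variable names and the limits 0 and 5 are just for illustration). The exact byte counts depend on your interpreter, so none are shown. It also shows one thing to keep in mind: a generator expression can only be consumed once.

```python
import sys

n = 100000
as_list = [0 <= v <= 5 for v in range(n)]  # builds all n booleans up front
as_gen = (0 <= v <= 5 for v in range(n))   # builds nothing until iterated

print(sys.getsizeof(as_list))  # grows with n
print(sys.getsizeof(as_gen))   # small, and doesn't depend on n

# A generator is exhausted after one pass
first_pass = sum(as_gen)
second_pass = sum(as_gen)
print(first_pass)   # 6
print(second_pass)  # 0
```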

So change

in_range = [lo_lim <= v <= hi_lim for v in values]

to

in_range = (lo_lim <= v <= hi_lim for v in values)

and

runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]

to

runs = (sum(1 for _ in group) for v, group in groupby(in_range) if v)
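
Putting both changes together, the function might look something like the sketch below. The name estimate_time_lazy is only there to distinguish it from the original. Everything else is the same logic with round brackets in place of square ones.

```python
from itertools import groupby

def estimate_time_lazy(values, lo_lim, hi_lim, sample_rate):
    # Hypothetical name; same logic as estimate_time(),
    # but no intermediate lists are built.
    in_range = (lo_lim <= v <= hi_lim for v in values)
    runs = (sum(1 for _ in group) for v, group in groupby(in_range) if v)

    # Each generator is consumed exactly once, by this sum()
    total_time = sum(n if n > 1 else 0.5 for n in runs)
    return total_time / sample_rate

result = estimate_time_lazy([1, 2, 3, 4, 5, 6, 7, 8, 4], 3, 5, 10.0)
print(result)  # 0.35
```

Note that with this version you can't print in_range or runs for debugging without using them up, so it's probably easiest to test with the list version first.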