I have decided to learn R. I am trying to get a sense of how to write "R style" functions and to avoid looping. Here is a sample situation:
Given a vector a, I would like to compute a vector b whose elements b[i] (the vector index begins at 1) are defined as follows:
1 <= i <= 4:
b[i] = NaN
5 <= i <= length(a):
b[i] = mean(a[i-4] to a[i])
Essentially, if we pretend 'a' is a list of speeds where the first entry is at time = 0, the second at time = 1 second, the third at time = 2 seconds... I would like to obtain a corresponding vector describing the average speed over the past 5 seconds.
E.g.:
If a is (1,1,1,1,1,4,6,3,6,8,9), then b should be (NaN, NaN, NaN, NaN, 1, 1.6, 2.6, 3, 4, 5.4, 6.4).
I could do this using a loop, but I feel that doing so would not be in "R style".
Thank you,
Tungata
You can also use a combination of cumsum and diff to get the sum over sliding windows. You'll need to pad with your own NaN, though.

Something like
b = filter(a, rep(1.0/5, 5), sides=1)
will do the job, although the first few slots will come back as NA rather than NaN. R has a large library of built-in functions, and "R style" is to use those wherever possible. Take a look at the documentation for the filter function.

Because these rolling functions are often applied to time-series data, some of the newer and richer time-series data-handling packages already do that for you:
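As one possible illustration, here is a minimal sketch using rollmean from the zoo package. It assumes zoo is installed and reuses the example vector from the question; the choice of fill = NA is an assumption, so the leading slots hold NA rather than NaN.

```r
library(zoo)  # assumes the zoo package is installed

# Example data from the question
a <- c(1, 1, 1, 1, 1, 4, 6, 3, 6, 8, 9)

# align = "right" makes each mean cover the current value and the 4 before it;
# fill = NA pads the first 4 positions, which have no complete window
b <- rollmean(a, k = 5, fill = NA, align = "right")
b
```

Note that is.na() is TRUE for both NA and NaN, so downstream checks usually work the same either way.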
The zoo package has excellent documentation that will show you many, many more examples, in particular how to do this with real (and possibly irregular) dates; xts extends this further, but zoo is a better starting point.
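For the cumsum and diff suggestion earlier in the thread, a minimal base-R sketch might look like the following. The names k, cs, and window_sums are illustrative only, and a is the example vector from the question.

```r
# Example data from the question
a <- c(1, 1, 1, 1, 1, 4, 6, 3, 6, 8, 9)
k <- 5  # window length (illustrative name)

# Prefix sums with a leading 0, so that cs[j + 1] == sum(a[1:j])
cs <- c(0, cumsum(a))

# diff with lag = k gives cs[i + k] - cs[i], the sum of each window of k values
window_sums <- diff(cs, lag = k)

# Pad the first k - 1 positions with NaN and turn the sums into means
b <- c(rep(NaN, k - 1), window_sums / k)
b
```

This avoids an explicit loop, though on very long vectors the running cumulative sum can accumulate floating-point error.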