I have what seems to be a simple problem that I haven't been able to solve. I have an R data frame consisting of a single column of data points, as shown below. I would like to subset it into a new data frame containing the data points selected by the value of the previous data point.
For example, I would like to select all the rows where the previous value was greater than 0.04. Any ideas would be appreciated. Thank you.
Price
[1,] -0.006666667
[2,] 0.040268456
[3,] 0.051612903
[4,] -0.006134969
[5,] 0.006172840
[6,] 0.006134969
[7,] 0.030487805
Like this:
x[c(FALSE, head(x$Price, -1) > 0.04), , drop = FALSE]
(From your print, it seems your object might be a matrix, not a data.frame. If that is the case, replace x$Price with x[, "Price"].)
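To make this easy to try, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies the same idea step by step. The object name x comes from the answer above; prev, keep, result, and m are just illustrative names, and the NA handling is one possible choice rather than the only one.

```r
# Sample values copied from the question; assumed to live in a data frame x.
x <- data.frame(Price = c(-0.006666667, 0.040268456, 0.051612903,
                          -0.006134969, 0.006172840, 0.006134969,
                          0.030487805))

# Previous value for each row: drop the last element and shift down by one.
# The first row has no predecessor, so it gets NA.
prev <- c(NA, head(x$Price, -1))

# Keep rows whose previous value exceeds 0.04; rows with no predecessor are dropped.
keep <- !is.na(prev) & prev > 0.04

# drop = FALSE keeps the result a data frame even though it has one column.
result <- x[keep, , drop = FALSE]
print(result)

# Same idea if the object is a matrix with a "Price" column.
m <- as.matrix(x)
m_result <- m[c(FALSE, head(m[, "Price"], -1) > 0.04), , drop = FALSE]
```

With the sample data, rows 3 and 4 are selected, because their predecessors (0.040268456 and 0.051612903) are the only values above 0.04.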
These kinds of manipulations can be done in a way that directly mirrors how we think about the problem by using a time series representation. This has the further advantage that the data are then in a form that facilitates later computations. Suppose DF is the data frame. Convert it to a zoo object z and then extract those components of z whose lagged (previous) value exceeds 0.04:
> library(zoo)
> z <- zoo(DF$Price)
> z[lag(z, -1) > 0.04]
3 4
0.051612903 -0.006134969
If result is the value of the last line of code, then time(result) gives the times (3 and 4 in the above example) and coredata(result) gives the data values.
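As a rough end-to-end sketch of the zoo approach, assuming the zoo package is installed and that DF holds the sample values from the question, the pieces fit together as follows. The name new_df is hypothetical and only shows one way to get back to a data frame.

```r
library(zoo)

# Sample values copied from the question; DF is the assumed data frame name.
DF <- data.frame(Price = c(-0.006666667, 0.040268456, 0.051612903,
                           -0.006134969, 0.006172840, 0.006134969,
                           0.030487805))

# With no explicit index, zoo uses 1, 2, ..., n as the time index.
z <- zoo(DF$Price)

# lag(z, -1) pairs each time point with the value one step earlier.
result <- z[lag(z, -1) > 0.04]

times  <- time(result)      # time points of the selected observations
values <- coredata(result)  # the selected data values as a plain vector

# One way to convert the selection back to a data frame.
new_df <- data.frame(Price = values, row.names = times)
print(new_df)
```

Note that other packages (dplyr, for example) define their own lag function, so if one of them is attached after zoo the call above may behave differently.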