What is the most efficient way to make a matrix of lagged variables in R for an arbitrary variable (i.e. not a regular time series)?
For example:
Input:
x <- c(1,2,3,4)
2 lags, output:
[1,NA, NA]
[2, 1, NA]
[3, 2, 1]
[4, 3, 2]
The running function in the gtools package does more or less what you want.

The method that works best for me is to use the lag function from the dplyr package.
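A minimal sketch of the dplyr approach, assuming the dplyr package is installed. The variable names and the loop over lags are illustrative choices, not taken from the original answer:

```r
# Assumes dplyr is installed. dplyr::lag(x, n) shifts x down by n positions
# and pads the start with NA.
x <- c(1, 2, 3, 4)
n_lags <- 2  # illustrative name for the number of lags

# Build one column per lag and bind them next to the original vector.
lagged <- cbind(x, sapply(seq_len(n_lags), function(k) dplyr::lag(x, n = k)))
colnames(lagged) <- c("x", paste0("lag", seq_len(n_lags)))
lagged
```

Calling the function with its namespace prefix avoids confusion with stats::lag, which behaves differently.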
You can achieve this using the built-in embed() function, where its second argument, 'dimension', is the number of what you've called 'lags' plus one (the unlagged column counts too). embed() was discussed in a previous answer by Joshua Reich. (Note that I prepended x with NAs to replicate your desired output.)

It's not particularly well-named, but it is quite useful and powerful for operations involving sliding windows, such as rolling sums and moving averages.
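The embed() approach with NA padding might look like the sketch below. The wrapper name lag_matrix is hypothetical and only there for illustration:

```r
# lag_matrix is a hypothetical helper name, not a base R function.
lag_matrix <- function(x, lags) {
  # Pad the front with NAs so the first rows have placeholders
  # where no earlier values exist.
  padded <- c(rep(NA, lags), x)
  # embed() with dimension = lags + 1 returns one row per original element:
  # column 1 is x itself, column k + 1 is x lagged by k.
  embed(padded, lags + 1)
}

x <- c(1, 2, 3, 4)
m <- lag_matrix(x, 2)
m
```

Without the padding, embed(x, 3) would return only the two complete rows, (3, 2, 1) and (4, 3, 2).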
Use a proper class for your objects; base R has ts, which has a lag() function to operate on. Note that these ts objects came from a time when 'delta' or 'frequency' were constant: monthly or quarterly data as in macroeconomic series.

For irregular data such as (business-)daily, use the zoo or xts packages, which can also deal (very well!) with lags. To go further from there, you can use packages like dynlm or dlm, which allow for dynamic regression models with lags.
The Task Views on Time Series, Econometrics, Finance all have further pointers.
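As a rough illustration of the base R ts route mentioned above, the sketch below treats x as a regularly spaced series, which is an assumption that may not hold for arbitrary data. The column names lag1 and lag2 are illustrative:

```r
x <- ts(c(1, 2, 3, 4))

# stats::lag() with a negative k shifts the series forward in time, so the
# value from k steps earlier lines up with the current time point.
# The namespace prefix avoids a clash with dplyr::lag() if dplyr is attached.
m <- ts.union(x, lag1 = stats::lag(x, -1), lag2 = stats::lag(x, -2))

# ts.union() pads with NA over the union of the time ranges (times 1 to 6
# here); window() trims the result back to the original range.
m <- window(m, start = 1, end = 4)
m
```

Note that lag() on a ts object changes only the time index, not the values, which is why the series must be combined with ts.union() before the shift becomes visible.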