Lagging Variables in R

Posted 2020-02-17 04:04

What is the most efficient way to make a matrix of lagged variables in R for an arbitrary variable (i.e. not a regular time series)?

For example:

Input:

x <- c(1,2,3,4) 

2 lags, output:

[1,NA, NA]
[2, 1, NA]
[3, 2,  1]
[4, 3,  2]

Tags: r time-series
4 answers
迷人小祖宗
#2 · 2020-02-17 04:41

The running function in the gtools package does more or less what you want:

> require("gtools")
> running(1:4, fun=I, width=3, allow.fewer=TRUE)

$`1:1`
[1] 1

$`1:2` 
[1] 1 2

$`1:3` 
[1] 1 2 3

$`2:4` 
[1] 2 3 4
干净又极端
#3 · 2020-02-17 04:45

The method that works best for me is to use the lag function from the dplyr package.

Example:

> require(dplyr)
> lag(1:10, 1)
 [1] NA  1  2  3  4  5  6  7  8  9
> lag(1:10, 2)
 [1] NA NA  1  2  3  4  5  6  7  8
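To get the matrix from the question out of this, one option is to call lag once per lag and bind the results as columns. This is only a sketch: the names x, n_lags and lagged are made up for illustration, and dplyr is assumed to be installed. Note that dplyr::lag masks stats::lag once dplyr is attached, so the call is written with an explicit namespace.

```r
library(dplyr)

x <- c(1, 2, 3, 4)
n_lags <- 2  # illustrative: number of lagged columns wanted

# One column per lag; sapply() binds the equal-length results into a matrix
lag_cols <- sapply(seq_len(n_lags), function(k) dplyr::lag(x, k))

# Put the original series in the first column, as in the question
lagged <- cbind(x, lag_cols)
colnames(lagged) <- c("x", paste0("lag", seq_len(n_lags)))
lagged
```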
叼着烟拽天下
#4 · 2020-02-17 04:51

You can achieve this using the built-in embed() function. Its second argument, dimension, sets the number of columns in the result, so it is the number of lags you want plus one (2 lags means dimension = 3):

x <- c(NA,NA,1,2,3,4)
embed(x,3)

## returns
     [,1] [,2] [,3]
[1,]    1   NA   NA
[2,]    2    1   NA
[3,]    3    2    1
[4,]    4    3    2

embed() was discussed in a previous answer by Joshua Reich. (Note that I prepended NAs to x to replicate your desired output.)

It's not particularly well-named but it is quite useful and powerful for operations involving sliding windows, such as rolling sums and moving averages.
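The NA padding can be wrapped in a small helper so the number of lags becomes a parameter. This is a minimal sketch; lag_matrix and n_lags are hypothetical names, not part of base R. The last lines illustrate the sliding-window use mentioned above with a rolling sum of width 3.

```r
# Hypothetical helper: column 1 is x, column k + 1 is x lagged by k
lag_matrix <- function(x, n_lags) {
  # Prepend n_lags NAs so the result has as many rows as x
  padded <- c(rep(NA, n_lags), x)
  # embed() needs one column per lag plus one for the series itself
  embed(padded, n_lags + 1)
}

m <- lag_matrix(c(1, 2, 3, 4), 2)
m

# Sliding window without padding: each row of embed() is one window
roll_sum <- rowSums(embed(1:5, 3))
roll_sum
```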

三岁会撩人
#5 · 2020-02-17 04:55

Use a proper class for your objects; base R has ts, which has a lag() method to operate on. Note that these ts objects came from a time when 'delta' or 'frequency' were constant: monthly or quarterly data as in macroeconomic series.
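As a rough illustration of the ts route, the sketch below binds a series to two lagged copies of itself. The variable names are made up, and the window() call is my own addition to trim the rows that the union of time indices adds past the end of the original series. A negative k in stats::lag shifts values to later times, which is what "lagging" means here.

```r
x <- ts(c(1, 2, 3, 4))

# cbind() on ts objects aligns by time and pads with NA where a series has no value
lagged_ts <- cbind(x,
                   lag1 = stats::lag(x, -1),
                   lag2 = stats::lag(x, -2))

# Keep only the time span of the original series
lagged_ts <- window(lagged_ts, start = start(x), end = end(x))
lagged_ts
```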

For irregular data such as (business-)daily, use the zoo or xts packages, which can also deal (very well!) with lags. To go further from there, you can use packages like dynlm or dlm, which allow for dynamic regression models with lags.
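For an irregular index, a zoo version might look like the following sketch. The dates and variable names are invented for illustration, and zoo is assumed to be installed. Lags are counted in observations, not calendar days, so the gaps between dates do not matter.

```r
library(zoo)

# Illustrative irregular dates (not from the question)
dates <- as.Date(c("2020-02-03", "2020-02-04", "2020-02-07", "2020-02-11"))
z <- zoo(c(1, 2, 3, 4), order.by = dates)

# merge() aligns on the index and fills missing positions with NA
lagged_zoo <- merge(z,
                    lag1 = stats::lag(z, -1),
                    lag2 = stats::lag(z, -2))
lagged_zoo
```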

The Task Views on Time Series, Econometrics, Finance all have further pointers.
