I have a problem dealing with time series in R.
#--------------read data
library(XLConnect)  # provides loadWorkbook() and readWorksheet()
wb = loadWorkbook("Countries_Europe_Prices.xlsx")
df = readWorksheet(wb, sheet="Sheet2")
x <- df$Year
y <- df$Index1
y <- lag(y, 1, na.pad = TRUE)
cbind(x, y)
It gives me the following output:
         x    y
 [1,] 1974   NA
 [2,] 1975 50.8
 [3,] 1976 51.9
 [4,] 1977 54.8
 [5,] 1978 58.8
 [6,] 1979 64.0
 [7,] 1980 68.8
 [8,] 1981 73.6
 [9,] 1982 74.3
[10,] 1983 74.5
[11,] 1984 72.9
[12,] 1985 72.1
[13,] 1986 72.3
[14,] 1987 71.7
[15,] 1988 72.9
[16,] 1989 75.3
[17,] 1990 81.2
[18,] 1991 84.3
[19,] 1992 87.2
[20,] 1993 90.1
But I want the first value in y to be 50.8 and so forth. In other words, I want a negative lag. How can I do that?
My problem is very similar to this problem, but I still cannot solve it. I guess I do not understand the solution(s) there...
The opposite of the lag() function is lead().
Simpler solution:
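One simple way to do this in base R, shown here only as a sketch and not necessarily what was originally posted, is to drop the first element and pad the end with NA. The sample values are the first few taken from the question's output.

```r
# Sample values taken from the question's output (illustrative subset).
y <- c(50.8, 51.9, 54.8, 58.8, 64.0)

# Lead by one position: drop the first element and pad the end with NA.
y_lead <- c(y[-1], NA)

# Lag by one position, for comparison: pad the front, drop the last element.
y_lag <- c(NA, y[-length(y)])

print(cbind(y, y_lead, y_lag))
```

This needs no extra packages, but it only handles a shift of exactly one position.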
How about the ready-made lead() function from the dplyr package? Doesn't it do exactly the job of Ahmed's function?
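A minimal sketch of how lead() could be used on a plain numeric vector, assuming dplyr is installed. The calls are written as dplyr::lead() and dplyr::lag() so that stats::lag() is not picked up by accident; the sample values come from the question's output.

```r
# Assumes the dplyr package is installed.
y <- c(50.8, 51.9, 54.8, 58.8, 64.0)

# lead() shifts values towards the start; the last n positions become NA.
y_lead <- dplyr::lead(y, n = 1)

# lag() is the mirror image: the first n positions become NA.
y_lag <- dplyr::lag(y, n = 1)

print(cbind(y, y_lead, y_lag))
```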
If you want to be able to calculate either positive or negative lags with the same function, I suggest a shorter version of his shift function:
It defines two cases, one using lag and the other using lead, and picks one depending on the sign of your lag (the +1.5 is a trick that turns the {-1, +1} sign into a {1, 2} index).
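The two-case idea can be sketched roughly as follows. This is an illustration, not the original code: the name shift_signed and the argument k are hypothetical, dplyr is assumed to be installed, and halving the sign before adding 1.5 is an assumption made so that the arithmetic really lands on 1 or 2. Here a positive k means a lag and a negative k means a lead.

```r
# Hypothetical helper: positive k gives a lag, negative k gives a lead.
shift_signed <- function(x, k) {
  stopifnot(is.numeric(k), length(k) == 1)
  if (k == 0) return(x)  # nothing to shift
  # sign(k) is -1 or +1; halving it and adding 1.5 yields 1 or 2,
  # which switch() uses to pick the first or second alternative.
  switch(sign(k) / 2 + 1.5,
         dplyr::lead(x, abs(k)),  # alternative 1: negative k, lead
         dplyr::lag(x, abs(k)))   # alternative 2: positive k, lag
}

x <- c(1, 2, 3, 4, 5)
shift_signed(x, 1)   # lag by one
shift_signed(x, -1)  # lead by one
```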
There is an easier way of doing this, which I have taken entirely from this link. I will explain what to do step by step:
First, create the function by running the following code:
This creates a function called shift with two arguments: the vector whose lag/lead you want, and the number of lags/leads you need.
Example:
Suppose you have the following vector:
If you need x's first-order lag:
If you need x's first-order lead (negative lag):
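The steps above can be sketched as follows. This is a hedged reconstruction in base R, not the code from the linked page: the argument name shift_by, the sign convention (positive means lead, negative means lag), and the example vector 1:10 are all assumptions made for illustration.

```r
# Hypothetical reconstruction: positive shift_by leads, negative shift_by lags.
shift <- function(x, shift_by) {
  stopifnot(is.numeric(shift_by), length(shift_by) == 1)
  n <- abs(shift_by)
  stopifnot(n < length(x))  # keep the sketch simple: no shifts past the end
  if (n == 0) return(x)
  pad <- rep(NA, n)
  if (shift_by > 0) {
    c(tail(x, -n), pad)  # lead: drop the first n values, pad the end
  } else {
    c(pad, head(x, -n))  # lag: pad the front, drop the last n values
  }
}

x <- 1:10         # illustrative vector
shift(x, -1)      # first-order lag
shift(x, 1)       # first-order lead (negative lag)
```

Under this convention, applying shift(y, 1) to the question's y column would move each value up one row, leaving NA in the last position.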