I have data.table
set.seed(1)
dat <- data.table(Shift = c(c(0,0,0,1,2,1,1)),
Value = rnorm(7),I.Value = rnorm(7))
dat
Shift Value I.Value
0 -0.6264538 0.7383247
0 0.1836433 0.5757814
0 -0.8356286 -0.3053884
1 1.5952808 1.5117812
2 0.3295078 0.3898432
1 -0.8204684 -0.6212406
1 0.4874291 -2.2146999
I want the new column to be shift(Value,Shift,fill=0). Hence the result should be-
Shift Value I.Value new.value new.I.value
0 -0.6264538 0.7383247 -0.6264538 0.7383247
0 0.1836433 0.5757814 0.1836433 0.5757814
0 -0.8356286 -0.3053884 -0.8356286 -0.3053884
1 1.5952808 1.5117812 0 0
2 0.3295078 0.3898432 1.5952808 1.5117812
1 -0.8204684 -0.6212406 0 0
1 0.4874291 -2.2146999 0.3295078 0.3898432
What I have till now:
dat[,`:=` (new.value= shift(Value,Shift,fill = 0),
new.I.value=shift(I.Value,Shift,fill = 0))]
But the shift function is returning a list as the parameter Shift is not of length 1, but the entire vector. Hence I face the error:
Error in `[.data.table`(dat[Shift == 0, `:=`(new = Value)], !Shift == :
(list) object cannot be coerced to type 'double'
I tried passing Shift[.I], but did not work either.
How do i pass just the row value of the column Shift, such that a vector is returned and I get the proper result? As my working dataset is pretty huge, I prefer working with data.table.
Edit:
-The output generated by the above function is -
Shift Value I.Value grp new2 new3
1: 0 -1.22461261 1.7672873 1 -1.2246126 1.7672873
2: 0 -0.47340064 0.7167075 1 -0.4734006 0.7167075
3: 0 -0.62036668 0.9101742 1 -0.6203667 0.9101742
4: 1 0.04211587 0.3841854 2 0.0000000 0.0000000
5: 2 -0.91092165 1.6821761 3 0.0000000 0.0000000
6: 1 0.15802877 -0.6357365 4 0.0000000 0.0000000
7: 1 -0.65458464 -0.4616447 4 0.1580288 -0.6357365
4th row should have been shifted to 5th as dat$Shift[4]==1
and 5th row should have been shifted to 7th as dat$Shift[5]==2
. If we simply write-
dat[,new:=shift(Value,2,fill=0),new.I:=shift(I.Value,2,fill=0)]
will shift all rows down by 2 as n=2, just instead of a hardcoded value, n has to be the value present in the column 'Shift' for current row under process.
Would this work ?
Note that I give priority to the value in row 5, as the value in row 6 would be shifted to the same position.
If you want to prioritize row 6 over row 5 :