R script: While using shift function in data.table

2019-07-09 08:52发布

I have data.table

set.seed(1)
dat <- data.table(Shift = c(c(0,0,0,1,2,1,1)),
              Value = rnorm(7),I.Value = rnorm(7))
dat
Shift      Value    I.Value
 0       -0.6264538  0.7383247
 0        0.1836433  0.5757814
 0       -0.8356286 -0.3053884
 1        1.5952808  1.5117812
 2        0.3295078  0.3898432
 1       -0.8204684 -0.6212406
 1        0.4874291 -2.2146999

I want the new column to be shift(Value,Shift,fill=0). Hence the result should be-

  Shift      Value    I.Value    new.value   new.I.value
   0       -0.6264538  0.7383247  -0.6264538  0.7383247
   0        0.1836433  0.5757814   0.1836433  0.5757814
   0       -0.8356286 -0.3053884   -0.8356286 -0.3053884
   1        1.5952808  1.5117812    0           0
   2        0.3295078  0.3898432   1.5952808  1.5117812
   1       -0.8204684 -0.6212406    0           0
   1        0.4874291 -2.2146999   0.3295078  0.3898432

What I have till now:

dat[,`:=` (new.value= shift(Value,Shift,fill = 0),
         new.I.value=shift(I.Value,Shift,fill = 0))]

But the shift function is returning a list as the parameter Shift is not of length 1, but the entire vector. Hence I face the error:

Error in `[.data.table`(dat[Shift == 0, `:=`(new = Value)], !Shift ==  : 
(list) object cannot be coerced to type 'double'

I tried passing Shift[.I], but did not work either.

How do i pass just the row value of the column Shift, such that a vector is returned and I get the proper result? As my working dataset is pretty huge, I prefer working with data.table.


Edit:

-The output generated by the above function is -

   Shift       Value    I.Value grp       new2       new3
1:     0 -1.22461261  1.7672873   1 -1.2246126  1.7672873
2:     0 -0.47340064  0.7167075   1 -0.4734006  0.7167075
3:     0 -0.62036668  0.9101742   1 -0.6203667  0.9101742
4:     1  0.04211587  0.3841854   2  0.0000000  0.0000000
5:     2 -0.91092165  1.6821761   3  0.0000000  0.0000000
6:     1  0.15802877 -0.6357365   4  0.0000000  0.0000000
7:     1 -0.65458464 -0.4616447   4  0.1580288 -0.6357365

4th row should have been shifted to 5th as dat$Shift[4]==1 and 5th row should have been shifted to 7th as dat$Shift[5]==2. If we simply write-

dat[,new:=shift(Value,2,fill=0),new.I:=shift(I.Value,2,fill=0)]

will shift all rows down by 2 as n=2, just instead of a hardcoded value, n has to be the value present in the column 'Shift' for current row under process.

标签: r data.table lag
1条回答
成全新的幸福
2楼-- · 2019-07-09 09:33

Would this work ?

Note that I give priority to the value in row 5, as the value in row 6 would be shifted to the same position.

shift2 <- function(val,shift,fill){
  new_val <- rep(fill,length(val))
  indices <- (1:length(val))+shift
  indices <- indices[indices <= nrow(dat)]
  new_val[rev(indices)]= val[length(indices):1]
  new_val
}

dat[,`:=` (new.value= shift2(Value,Shift,fill = 0),
           new.I.value=shift2(I.Value,Shift,fill = 0))]

# > dat
#    Shift      Value    I.Value  new.value new.I.value
# 1:     0 -0.6264538  0.7383247 -0.6264538   0.7383247
# 2:     0  0.1836433  0.5757814  0.1836433   0.5757814
# 3:     0 -0.8356286 -0.3053884 -0.8356286  -0.3053884
# 4:     1  1.5952808  1.5117812  0.0000000   0.0000000
# 5:     2  0.3295078  0.3898432  1.5952808   1.5117812
# 6:     1 -0.8204684 -0.6212406  0.0000000   0.0000000
# 7:     1  0.4874291 -2.2146999  0.3295078   0.3898432

If you want to prioritize row 6 over row 5 :

shift3 <- function(val,shift,fill){
  new_val <- rep(fill,length(val))
  indices <- (1:length(val))+shift
  indices <- indices[indices <= nrow(dat)]
  new_val[indices]= val[1:length(indices)]
  new_val
}

dat[,`:=` (new.value= shift3(Value,Shift,fill = 0),
          new.I.value=shift3(I.Value,Shift,fill = 0))]
# 
# Shift      Value    I.Value  new.value new.I.value
# 1:     0 -0.6264538  0.7383247 -0.6264538   0.7383247
# 2:     0  0.1836433  0.5757814  0.1836433   0.5757814
# 3:     0 -0.8356286 -0.3053884 -0.8356286  -0.3053884
# 4:     1  1.5952808  1.5117812  0.0000000   0.0000000
# 5:     2  0.3295078  0.3898432  1.5952808   1.5117812
# 6:     1 -0.8204684 -0.6212406  0.0000000   0.0000000
# 7:     1  0.4874291 -2.2146999 -0.8204684  -0.6212406
查看更多
登录 后发表回答