I know that using a for loop is considered bad practice in R because of its poor performance. In almost every case there is a function from the *apply family that solves the problem.
However, I'm facing a situation where I don't see a workaround.
I need to calculate the percent variation between consecutive values:
pv <- numeric(length(x))  # preallocate the result vector
pv[1] <- 0                # no previous value for the first element
for (i in 2:length(x)) {
  # relative change from the previous element to the current one
  pv[i] <- (x[i] - x[i-1]) / x[i-1]
}
So, as you can see, I have to use not only the x[i] element but also the x[i-1] element. With the *apply functions, I only see how to use x[i]. Is there any way I can avoid the for loop?
What you offered would be the fractional variation; if you multiply by 100 you get the "percent variation".
Vectorized solution. (And you should note that for loops are going to be just as slow as *apply solutions ... just not as pretty. Always look for a vectorized approach.)
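A minimal sketch of that vectorized computation, reusing the question's x (the explanation below walks through exactly these subsetting operations):

pv <- c(0, (x[-1] - x[-length(x)]) / x[-length(x)])  # fractional variation, 0 prepended to match pv[1]
100 * pv                                             # percent variation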
To explain a bit more: the x[-length(x)] is the vector x[1:(length(x)-1)], and the x[-1] is the vector x[2:length(x)]. The vector operations in R are doing the same operations as in your for-loop body, although without an explicit loop: R first constructs the differences of those shifted vectors, x[-1] - x[-length(x)], and then divides by x[1:(length(x)-1)]. You can get the same results with:
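A sketch of one equivalent form (an assumption here, relying on the algebraic identity (x[i] - x[i-1])/x[i-1] = x[i]/x[i-1] - 1):

x[-1] / x[-length(x)] - 1  # same fractional variation via the ratio identity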
The for loop issues that were once a problem have largely been optimized away. Often a for loop is not slower than, and may even be faster than, the *apply solution. You have to test them both and see. I'm betting your for loop is faster than my solution (sketched below).
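A sketch of what such an *apply solution might look like (the function shape is an assumption, not the answerer's original code):

# hypothetical sapply version: index over 2:length(x), looking back one element
pv_sapply <- function(x) c(0, sapply(2:length(x), function(i) (x[i] - x[i-1]) / x[i-1]))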
EDIT: Just to illustrate the for loop vs. *apply comparison, as well as what DWin discusses about vectorization, I ran benchmarks on the four solutions using microbenchmark on a Windows 7 machine.
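A sketch of how such a benchmark can be set up (wrapper names and test data are illustrative; no timing output is claimed here):

library(microbenchmark)

x <- rnorm(1000)^2 + 100  # hypothetical test vector, kept away from zero

pv_for <- function(x) {
  pv <- numeric(length(x))
  for (i in 2:length(x)) pv[i] <- (x[i] - x[i-1]) / x[i-1]
  pv
}
pv_sapply     <- function(x) c(0, sapply(2:length(x), function(i) (x[i] - x[i-1]) / x[i-1]))
pv_vectorized <- function(x) c(0, (x[-1] - x[-length(x)]) / x[-length(x)])
pv_diff       <- function(x) c(0, diff(x) / x[-length(x)])

microbenchmark(pv_for(x), pv_sapply(x), pv_vectorized(x), pv_diff(x), times = 100L)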
You can also use diff:
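A minimal sketch, using the fact that diff(x) returns the consecutive differences x[2:length(x)] - x[1:(length(x)-1)]:

pv <- c(0, diff(x) / x[-length(x)])  # fractional variation; multiply by 100 for percent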