how to avoid loops

2019-06-20 00:23发布

HI All, I'm new to R.

I have two panel data files, with columns "id", "date" and "ret"

file A has a lot more data than file B, but i'm primarily working with file B data.

Combination of "id" and "date" is unqiue indentifier.

Is there an elegent way of looking up for each (id, date) in B, I need to get the past 10 days ret from file A, and store them back into B?

my naive way of doing it is to loop for all rows in B,

for i in 1:length(B) {
    B$past10d[i] <- prod(1+A$ret[which(A$id == B$id[i] & A$date > B$date[i]-10 & A$date < B$date[i])])-1
}

but the loops takes forever.

Really appreciate your thoughts.

Thank you very much.

7条回答
家丑人穷心不美
2楼-- · 2019-06-20 00:54

In general, you ought to avoid looping in R. It's much quicker if your code operates on vectors.

I would use merge, as suggested by ran2. You can set all.x = T (or all.y or all) to get all the rows from one (or other or both) - data frames. This is quick and will typically work-out which fields to match by itself. Otherwise you'll need to specify by.x (and by.y or by) as a lookup field. By the sounds of it you may need to create this field yourself (as per John's comment).

You can then filter by date.

查看更多
登录 后发表回答