What is an efficient way to create a sequence of numbers that increments for each change in a group variable? As a toy example, using the data frame below, I would like a new variable, "Value", to take on the values c(1,1,1,2,2,3,3,4). Note that even though 48 repeats itself, "Value" still increases, as I'm only concerned with a change in the sequence.
df <- read.table(textConnection(
'Group
48
48
48
56
56
48
48
14'), header = TRUE)
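One way to do this is
df$Value <- 1
for (i in 2:nrow(df)) {
  # keep the previous counter while Group is unchanged, otherwise bump it by one
  if (df[i, ]$Group == df[i - 1, ]$Group) {
    df[i, ]$Value <- df[i - 1, ]$Value
  } else {
    df[i, ]$Value <- df[i - 1, ]$Value + 1
  }
}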
but this is very slow. My actual dataset has several million observations.
Note: I had a difficult time wording the title of this question so please change it if you'd like.
How about
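One vectorised possibility (a sketch, not necessarily what was originally posted here) is to compare each element with its predecessor and take a cumulative sum of the change flags. It assumes Group contains no NA values, and the name `changed` is just an illustrative helper.

```r
# Toy data from the question
df <- data.frame(Group = c(48, 48, 48, 56, 56, 48, 48, 14))

# TRUE wherever a value differs from the one before it;
# the first row is flagged TRUE so that the counter starts at 1
changed <- c(TRUE, df$Group[-1] != df$Group[-nrow(df)])

# The running total of change flags is the run counter
df$Value <- cumsum(changed)
df$Value
```

This avoids the row-by-row data frame assignment entirely, so it should scale far better than the loop on several million rows.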
We also could hack the rle .Data
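A minimal sketch of the rle idea, assuming the intent is to overwrite the run values with a run index and then expand back to full length (the variable name `r` is only illustrative):

```r
# Toy data from the question
df <- data.frame(Group = c(48, 48, 48, 56, 56, 48, 48, 14))

# Run-length encode Group: one entry per run of identical consecutive values
r <- rle(df$Group)

# Replace each run's value with its position in the sequence of runs
r$values <- seq_along(r$values)

# Expand the modified encoding back to one entry per row
df$Value <- inverse.rle(r)
df$Value
```

The same result can be written without modifying the rle object as rep(seq_along(r$lengths), r$lengths).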