Is there a faster way to make a counter index than using a loop? Within contiguous runs of equal values, the index should be the same. I find looping very slow, especially when the data is large.
For illustration, here is the input and desired output:
x <- c(2, 3, 9, 2, 4, 4, 3, 4, 4, 5, 5, 5, 1)
Desired resulting counter:
c(1, 2, 3, 4, 5, 5, 6, 7, 7, 8, 8, 8, 9)
Note that non-contiguous runs have different indexes. E.g. see the desired indexes of the values 2 and 4.
My inefficient code is this:
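n <- length(x)
group <- integer(n)   # preallocate the result
group[1] <- 1
counter <- 1
for (i in 2:n) {
  if (x[i] == x[i-1]) {
    group[i] <- counter
  } else {
    counter <- counter + 1
    group[i] <- counter   # was group[1], which overwrote the first element
  }
}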
Using data.table, which has the function rleid()
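A minimal sketch of the rleid() approach, assuming the data.table package is installed and using the x from the question:

```r
# Assumes data.table is installed: install.packages("data.table")
library(data.table)

x <- c(2, 3, 9, 2, 4, 4, 3, 4, 4, 5, 5, 5, 1)

# rleid() assigns one integer id per contiguous run of equal values
group <- rleid(x)
group
```

For this x the result should match the desired counter from the question.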
If you have numeric values like this, you can use diff and cumsum to add up changes in values.
This will work with numeric or character values:
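As a rough sketch of both ideas in base R (my own reconstruction, not necessarily the exact code originally posted): the first version marks every position where the value changes and accumulates those marks, and the second builds the index from rle, which also handles character vectors. The variable names are just for illustration.

```r
x <- c(2, 3, 9, 2, 4, 4, 3, 4, 4, 5, 5, 5, 1)

# Numeric only: diff(x) != 0 is TRUE wherever the value changes;
# the leading 1 starts the counter, cumsum() adds up the changes
group_diff <- cumsum(c(1, diff(x) != 0))

# Numeric or character: repeat each run's position by its run length
# (note that rle() is called twice here)
group_rle <- rep(seq_along(rle(x)$lengths), times = rle(x)$lengths)

# The rle version on a character vector (illustrative input)
s <- c("a", "a", "b", "a")
group_chr <- rep(seq_along(rle(s)$lengths), times = rle(s)$lengths)
```

Both group_diff and group_rle should reproduce the desired counter for x.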
You can also be a bit more efficient by calling rle just once (about 2x faster), and a very slight speed improvement can be made using rep.int instead of rep:
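A sketch of that variant, storing the rle result in a variable (r is just an illustrative name) so it is computed only once:

```r
x <- c(2, 3, 9, 2, 4, 4, 3, 4, 4, 5, 5, 5, 1)

# Compute the run-length encoding a single time
r <- rle(x)

# rep.int() is a simpler, slightly faster variant of rep()
group <- rep.int(seq_along(r$lengths), times = r$lengths)
group
```

If timing matters for your data, it is worth benchmarking the alternatives on a realistic vector rather than relying on the rough figures above.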