I'm trying to figure out if there's a way to do this that doesn't require a for loop.
I have a vector of data that increases sequentially, but skips occasional values. For example, test
num[1:4651] 2 2 2 2 3 3 3 3 3 3 7 7 9 9 9 9, etc.
Is there an R function that will convert that vector into a fixed sequence starting at 1 through the end of the vector? So,
1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4, etc.
We can use match to do this:
match(test, unique(test))
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
Or another option is factor
as.integer(factor(test, levels = unique(test)))
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
As @Frank suggested, dense_rank from dplyr also works here, because the values are increasing:
dplyr::dense_rank(test)
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
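The "values are increasing" condition matters because dense_rank numbers values by sorted order, not by order of first appearance. A minimal base R sketch of the difference, assuming an NA-free made-up vector x, with match(x, sort(unique(x))) used as a stand-in for dplyr::dense_rank(x):

```r
# Hypothetical vector whose values are NOT increasing
x <- c(7, 7, 2, 2, 9)

# Numbering by order of first appearance, as in match(test, unique(test))
by_appearance <- match(x, unique(x))
by_appearance
#[1] 1 1 2 2 3

# Numbering by sorted value; assumed equivalent to dplyr::dense_rank(x) when x has no NAs
by_rank <- match(x, sort(unique(x)))
by_rank
#[1] 2 2 1 1 3
```

On a non-decreasing vector such as test, the two numberings coincide.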
If no value reappears after its run has ended, rleid from data.table can also be used:
data.table::rleid(test)
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
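The caveat about repeating values is easiest to see on a small case. The sketch below uses a made-up vector x in which 2 comes back after 3, and a base R run counter built from rle and rep as a stand-in for data.table::rleid (so that no package is needed):

```r
# Hypothetical vector in which the value 2 reappears after the run of 3s
x <- c(2, 2, 3, 3, 2, 2)

# Value-based id: every 2 gets the same id, wherever it occurs
value_id <- match(x, unique(x))
value_id
#[1] 1 1 2 2 1 1

# Run-based id: each run of consecutive equal values gets a new id
runs <- rle(x)
run_ids <- rep(seq_along(runs$lengths), times = runs$lengths)
run_ids
#[1] 1 1 2 2 3 3
```

Which result is wanted depends on whether a reappearing value should count as the same group or as a new one.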
Or a base R option using rle:
inverse.rle(within.list(rle(test), values <- seq_along(values)))
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
Or another option is
cumsum(c(TRUE, test[-1] != test[-length(test)]))
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
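To see why the cumsum version works, it can help to split it into steps. This is only an illustrative breakdown of the one-liner above; the intermediate names changed and starts are made up for the example:

```r
test <- c(2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 7, 7, 9, 9, 9, 9)

# Compare each element (from the 2nd on) with the one before it
changed <- test[-1] != test[-length(test)]

# The first element always starts a new group, so prepend TRUE
starts <- c(TRUE, changed)
which(starts)
#[1]  1  5 11 13

# cumsum treats TRUE as 1 and FALSE as 0, so the total grows by one at each group start
ids <- cumsum(starts)
ids
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
```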
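Or with lag from dplyr (using test[1] as the default and adding 1 keeps the numbering correct even when the vector starts with 1):
cumsum(test != dplyr::lag(test, default = test[1])) + 1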
#[1] 1 1 1 1 2 2 2 2 2 2 3 3 4 4 4 4
data
test <- c(2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 7, 7, 9, 9, 9, 9)
Using rle and rep in base R, where vec is your vector:
with(rle(vec), rep(seq_along(lengths), times = lengths))
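If this is needed in several places, the same rle and rep idea can be wrapped in a small helper. The function name run_id is hypothetical, and the sketch assumes an atomic vector as input:

```r
# Hypothetical helper: number the runs of consecutive equal values 1, 2, 3, ...
run_id <- function(vec) {
  runs <- rle(vec)  # run lengths and run values
  rep(seq_along(runs$lengths), times = runs$lengths)
}

run_id(c(2, 2, 3, 7, 7))
#[1] 1 1 2 3 3

run_id(c("a", "a", "b"))
#[1] 1 1 2

# An empty input gives an empty result rather than an error
length(run_id(numeric(0)))
#[1] 0
```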