After creating a key on a data.table:
library(data.table)
set.seed(12345)
DT <- data.table(x = sample(LETTERS[1:3], 10, replace = TRUE),
                 y = sample(LETTERS[1:3], 10, replace = TRUE))
setkey(DT, x, y)
DT
# x y
# [1,] A B
# [2,] A B
# [3,] B B
# [4,] B B
# [5,] C A
# [6,] C A
# [7,] C A
# [8,] C A
# [9,] C C
# [10,] C C
I would like to get an integer vector giving, for each row, the corresponding "key index". I hope the expected output (column i) below will help clarify what I mean:
# x y i
# [1,] A B 1
# [2,] A B 1
# [3,] B B 2
# [4,] B B 2
# [5,] C A 3
# [6,] C A 3
# [7,] C A 3
# [8,] C A 3
# [9,] C C 4
# [10,] C C 4
I thought about using something like cumsum(!duplicated(DT[, key(DT), with = FALSE])) but am hoping there is a better solution. I feel this vector could be part of the table's internal representation, so maybe there is a way to access it? Even if that is not the case, what would you suggest?
I'd probably just do this, since I'm fairly confident that no index counter is available from within the call to [.data.table():
You could make this a one-liner, at the expense of an additional call to unique.data.table():
Update: From v1.8.3, you can simply use the inbuilt special .GRP:
See history for older answers.
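A minimal sketch of the three approaches discussed above, not necessarily the exact code of the original posts. It assumes data.table v1.8.3 or later for .GRP, and a version recent enough to support the on= argument for the join. The names i_cumsum, ii and by_join are illustrative only. Because the default behaviour of sample() changed in newer R versions, the generated rows may differ from the table in the question, but the three results should agree with each other.

```r
library(data.table)

set.seed(12345)
DT <- data.table(x = sample(LETTERS[1:3], 10, replace = TRUE),
                 y = sample(LETTERS[1:3], 10, replace = TRUE))
setkey(DT, x, y)

# Approach from the question: a new group starts at every row whose
# key combination has not been seen before.
i_cumsum <- cumsum(!duplicated(DT[, key(DT), with = FALSE]))

# Join approach: number the unique key combinations, then join them back.
# (ii and by_join are illustrative names.)
ii <- unique(DT, by = key(DT))
ii[, i := seq_len(nrow(ii))]
by_join <- ii[DT, on = key(DT)]

# .GRP approach: .GRP is the group counter within a grouped call,
# so this adds column i to DT by reference.
DT[, i := .GRP, by = key(DT)]

DT
```

Since DT is sorted by its key, all three vectors increase by one at each new key combination, so they can be compared directly with identical().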