Create a sequence of numbers that increments for e

Posted 2020-04-18 04:33

What is an efficient way to create a sequence of numbers that increments for each change in a group variable? As a toy example, using the data frame below, I would like a new variable, "Value", to take on the values c(1,1,1,2,2,3,3,4). Note that even though 48 repeats itself, "Value" still increases as I'm only concerned with a change in the sequence.

df <- read.table(textConnection(
  'Group 
  48 
  48
  48
  56
  56
  48
  48
  14'), header = TRUE)

One way to do this is

df$Value <- 1
for (i in 2:nrow(df)) {
  if (df[i, ]$Group == df[i - 1, ]$Group) {
    df[i, ]$Value <- df[i - 1, ]$Value
  } else {
    df[i, ]$Value <- df[i - 1, ]$Value + 1
  }
}

but this is very slow. My actual dataset has several million observations.

Note: I had a difficult time wording the title of this question so please change it if you'd like.

Tags: r
2 answers
ら.Afraid
#2 · 2020-04-18 04:53

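How about the following? Note that it numbers each distinct value of Group, so a value that reappears later (48 in rows 6 and 7) gets its original number back rather than a new one.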

library(tidyverse)
df = data.frame(Group = c(48, 
                      48,
                      48,
                      56,
                      56,
                      48,
                      48,
                      14))

# Get unique values in group
unique_vals = unique(df$Group)

# Create a sequence from 1 up to the length of the unique values vector
# (seq_along() is also safe if the vector is empty)
sequential_nums = seq_along(unique_vals)

# Create a new column looking up the current value in the unique_vals list
# and replacing it with the correct sequential number
df %>% 
  mutate(Value = sequential_nums[match(Group, unique_vals)])

#   Group Value
# 1    48     1
# 2    48     1
# 3    48     1
# 4    56     2
# 5    56     2
# 6    48     1
# 7    48     1
# 8    14     3
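
The output above gives rows 6 and 7 the value 1, while the question asks for c(1,1,1,2,2,3,3,4). A minimal base-R sketch of a variant that increments on every change instead, assuming Group contains no NA values (the helper name change_id is made up for illustration):

```r
# Hypothetical helper: number consecutive runs of equal values.
change_id <- function(x) {
  if (length(x) == 0) return(integer(0))
  # TRUE wherever an element differs from the one before it;
  # the first element always starts a new run.
  is_new_run <- c(TRUE, x[-1] != x[-length(x)])
  # The cumulative count of run starts is the run number.
  cumsum(is_new_run)
}

df <- data.frame(Group = c(48, 48, 48, 56, 56, 48, 48, 14))
df$Value <- change_id(df$Group)
df$Value
# [1] 1 1 1 2 2 3 3 4
```

Everything here is vectorised, so it should avoid the row-by-row cost of the loop in the question, though that has not been timed here.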
虎瘦雄心在
#3 · 2020-04-18 05:08

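We could also hack rle(): replace each run's value with the index of that run, then expand the runs back out with inverse.rle().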

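r <- rle(df$Group)
# seq_along() rather than seq(): with a single run of length n,
# seq(r$lengths) would return 1:n instead of 1
r$values <- seq_along(r$lengths)
inverse.rle(r)
# [1] 1 1 1 2 2 3 3 4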

Data

df <- structure(list(Group = c(48L, 48L, 48L, 56L, 56L, 48L, 48L, 14L
)), class = "data.frame", row.names = c(NA, -8L))
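
To see why the rle() approach works, it can help to look at the intermediate pieces. A small sketch using the same data; the last step is an equivalent one-line form using rep(), and it is an illustration rather than part of the answer above:

```r
df <- data.frame(Group = c(48L, 48L, 48L, 56L, 56L, 48L, 48L, 14L))

r <- rle(df$Group)
r$lengths  # length of each run of equal values
# [1] 3 2 2 1
r$values   # the value repeated in each run
# [1] 48 56 48 14

# Repeat each run's index by that run's length and store it as a column.
df$Value <- rep(seq_along(r$lengths), times = r$lengths)
df$Value
# [1] 1 1 1 2 2 3 3 4
```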