可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I want to add ID variable in this data. If receipt_ids are sequenced numbers, then those have same IDs.

CUST_NO_ID  receipt_id      dollar
  12         29             20.84
  12         30             20.21
  12         86             24.50
  12         87             20.68
  12        108             25.79
  12        109             24.93
  12        125             20.63
  12        126              9.90
  19        193             69.48
  19        194             46.88

here is my desired result

CUST_NO_ID  receipt_id      dollar       ID
  12         29             20.84        1
  12         30             20.21        1
  12         86             24.50        2
  12         87             20.68        2
  12        108             25.79        3
  12        109             24.93        3
  12        110             24.93        3
  12        125             20.63        4
  12        126              9.90        4
  19        193             69.48        5
  19        194             46.88        6

回答1:

Had a similar notion to @Psidom, but he beat me to the punch with cumsum. Here's a dplyr solution. Adding in group_by can give you added flexibility if you want to restart ids by customer number.

df %>% 
  mutate(id = cumsum(c(TRUE, diff(receipt_id) != 1)))

回答2:

Assuming your data frame has already been sorted by CUST_NO_ID and receipt_id, you can use cumsum on a conditional vector where TRUE indicates the position a new ID should be created:

df$ID = cumsum(c(T, diff(df$receipt_id) != 1 | diff(df$CUST_NO_ID) != 0)))

df
#   CUST_NO_ID receipt_id dollar ID
#1          12         29  20.84  1
#2          12         30  20.21  1
#3          12         86  24.50  2
#4          12         87  20.68  2
#5          12        108  25.79  3
#6          12        109  24.93  3
#7          12        125  20.63  4
#8          12        126   9.90  4
#9          19        193  69.48  5
#10         19        194  46.88  5

回答3:

This does it

id <- 1

for(row in 1:nrow(data)){
  if(row == 1){
    dif <- 1
  }else{
    dif <- data[row,'receipt_id'] - data[row-1,'receipt_id']
  }

  if(dif != 1){
    id = id + 1
  }

  data[row,'ID'] = id
}

回答4:

We can use data.table

library(data.table)
setDT(df)[, id := cumsum(c(TRUE, diff(receipt_id)!=1))]

Or use the shift

setDT(df)[, id := cumsum((receipt_id - shift(receipt_id, fill=receipt_id[1]))!=1)]

How to create id variable by grouping sequenced nu

问题:

回答1:

回答2:

回答3:

回答4:

收藏的人(0)

How to create id variable by grouping sequenced nu

问题:

回答1:

回答2:

回答3:

回答4:

收藏的人(0)

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮