Part of this question has already been answered in special-group-number-for-each-combination-of-data. In most cases the data contain pairs of names mixed with other values. What we want to achieve is to number the groups: each pair starts a new group, and the rows that follow belong to that group until the next pair appears.
I would like to group each pair, such as c("bad","good"), and assign the unique number 666 to the triple c('Veni',"vidi","Vici").
Here is the example data
names <- c(c("bad","good"),1,2,c("good","bad"),111,c("bad","J.James"),c("good","J.James"),333,c("J.James","good"),761,'Veni',"vidi","Vici")
df <- data.frame(names)
Here is the expected output for the real, general case:
     names Group
1      bad     1
2     good     1
3        1     1
4        2     1
5     good     2
6      bad     2
7      111     2
8      bad     3
9  J.James     3
10    good     4
11 J.James     4
12     333     4
13 J.James     5
14    good     5
15     761     5
16    Veni   666
17    vidi   666
18    Vici   666
Here are two approaches which reproduce OP's expected result for the given sample dataset.
Both work in the same way. First, all "disturbing" rows, i.e., rows which do not contain "valid" names, are skipped and the rows with "valid" names are simply numbered in groups of 2. Second, the rows with exempt names are given the special group number. Finally, the NA rows are filled by carrying the last observation forward.
data.table
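As a minimal sketch of these three steps with data.table: the vectors `valid` and `exempt` are assumptions made for illustration (they list which names count as pair members and which names get the special number), and `nafill()` requires a reasonably recent data.table version.

```r
library(data.table)

names <- c("bad", "good", 1, 2, "good", "bad", 111, "bad", "J.James",
           "good", "J.James", 333, "J.James", "good", 761,
           "Veni", "vidi", "Vici")

# Assumed for illustration: which names are "valid" and which are exempt
valid  <- c("bad", "good", "J.James")
exempt <- c("Veni", "vidi", "Vici")

dt <- data.table(names)

# Step 1: skip the "disturbing" rows and number the valid rows in groups of 2
dt[names %in% valid, Group := rep(seq_len(.N), each = 2L, length.out = .N)]

# Step 2: give the exempt names the special group number
dt[names %in% exempt, Group := 666L]

# Step 3: fill the remaining NA rows by carrying the last observation forward
dt[, Group := nafill(Group, type = "locf")]

print(dt)
```

Numbering in groups of 2 relies on valid names always appearing as complete pairs, which holds for the sample data but may not hold in general.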
dplyr/tidyr
Here is my attempt to provide a dplyr/tidyr solution:
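A minimal sketch of such a pipeline, assuming the pair members and the exempt names are known in advance (the vectors `valid` and `exempt` and the helper column `is_valid` are hypothetical names introduced here):

```r
library(dplyr)
library(tidyr)

names <- c("bad", "good", 1, 2, "good", "bad", 111, "bad", "J.James",
           "good", "J.James", 333, "J.James", "good", 761,
           "Veni", "vidi", "Vici")
df <- data.frame(names)

# Assumed for illustration: which names are "valid" and which are exempt
valid  <- c("bad", "good", "J.James")
exempt <- c("Veni", "vidi", "Vici")

result <- df %>%
  mutate(
    is_valid = names %in% valid,
    # Running count of valid rows; every two valid rows share one group number
    Group = if_else(is_valid, (cumsum(is_valid) + 1L) %/% 2L, NA_integer_),
    # Exempt names get the special group number
    Group = if_else(names %in% exempt, 666L, Group)
  ) %>%
  # Carry the last group number forward into the NA rows
  fill(Group) %>%
  select(-is_valid)

print(result)
```

As with any approach that numbers valid rows in twos, this assumes valid names always occur as complete pairs.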