Starting from @AndrewGustar's answer/code in "Expand data.frame by creating duplicates based on group condition":
1) What if the input data.frame has ID values that are not in sequence and that can also repeat?
Example data.frame:
df = read.table(text = 'ID Day Count Count_group
18 1933 6 11
33 1933 6 11
37 1933 6 11
18 1933 6 11
16 1933 6 11
11 1933 6 11
111 1932 5 8
34 1932 5 8
60 1932 5 8
88 1932 5 8
18 1932 5 8
33 1931 3 4
13 1931 3 4
56 1931 3 4
23 1930 1 1
6 1800 6 10
37 1800 6 10
98 1800 6 10
52 1800 6 10
18 1800 6 10
76 1800 6 10
55 1799 4 6
6 1799 4 6
52 1799 4 6
133 1799 4 6
112 1798 2 2
677 1798 2 2
778 888 4 6
111 888 4 6
88 888 4 6
10 888 4 6
37 887 2 3
26 887 2 3
8 886 1 2
56 885 1 1', header = TRUE)
The Count col shows the total number of ID values for each Day, and the Count_group col shows the sum of the Count values for each Day and Day - 1.
e.g. for Day 1933, Count_group = 11 because Count 6 (1933) + Count 5 (1932), and so on.
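The arithmetic behind Count_group can be checked with a short base R sketch. It uses a hypothetical per-day summary table called days (not part of the original data) holding the Count values of the first seven days in the example, and it assumes a missing Day - 1 contributes 0:

```r
# Hypothetical per-day summary: one row per Day with its Count
days <- data.frame(Day   = c(1933, 1932, 1931, 1930, 1800, 1799, 1798),
                   Count = c(6, 5, 3, 1, 6, 4, 2))

# Count of the previous day (Day - 1); NA when that day is absent
prev <- days$Count[match(days$Day - 1, days$Day)]
prev[is.na(prev)] <- 0  # assumption: an absent day adds nothing

# Count_group = Count of Day + Count of Day - 1
days$Count_group <- days$Count + prev
days$Count_group
# 11 8 4 1 10 6 2
```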
What I need to do is create duplicated observations for each Count_group and add them to it, so that each Count_group shows both its Day AND Day - 1.
e.g. Count_group = 11 is composed of the Count values of Day 1933 and 1932, so both days need to be included in Count_group = 11. The next one will be Count_group = 8, composed of 1932 and 1931, etc.
Desired output:
ID Day Count Count_group
18 1933 6 11
33 1933 6 11
37 1933 6 11
18 1933 6 11
16 1933 6 11
11 1933 6 11
111 1932 5 11
34 1932 5 11
60 1932 5 11
88 1932 5 11
18 1932 5 11
111 1932 5 8
34 1932 5 8
60 1932 5 8
88 1932 5 8
18 1932 5 8
33 1931 3 8
13 1931 3 8
56 1931 3 8
33 1931 3 4
13 1931 3 4
56 1931 3 4
23 1930 1 4
23 1930 1 1
6 1800 6 10
37 1800 6 10
98 1800 6 10
52 1800 6 10
18 1800 6 10
76 1800 6 10
55 1799 4 10
6 1799 4 10
52 1799 4 10
133 1799 4 10
55 1799 4 6
6 1799 4 6
52 1799 4 6
133 1799 4 6
112 1798 2 6
677 1798 2 6
112 1798 2 2
677 1798 2 2
778 888 4 6
111 888 4 6
88 888 4 6
10 888 4 6
37 887 2 6
26 887 2 6
37 887 2 3
26 887 2 3
8 886 1 3
8 886 1 2
56 885 1 2
56 885 1 1
Here is a solution that keeps the ID values as above.
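One possible way to do this in base R is sketched below; it is not necessarily the code the answer originally used. It assumes a row is copied only when Day + 1 also appears in the data, and the names key, orig, copy, group_day, and out are illustrative. To keep it short, the data is limited to the first four days from the question.

```r
df <- read.table(text = "ID Day Count Count_group
18 1933 6 11
33 1933 6 11
37 1933 6 11
18 1933 6 11
16 1933 6 11
11 1933 6 11
111 1932 5 8
34 1932 5 8
60 1932 5 8
88 1932 5 8
18 1932 5 8
33 1931 3 4
13 1931 3 4
56 1931 3 4
23 1930 1 1", header = TRUE)

# One row per Day with that day's Count_group
key <- unique(df[, c("Day", "Count_group")])

# Original rows belong to the group of their own Day
orig <- df
orig$group_day <- orig$Day

# Copies: each row also belongs to the group of Day + 1, if that day exists
copy <- df
copy$group_day <- copy$Day + 1
copy$Count_group <- key$Count_group[match(copy$group_day, key$Day)]
copy <- copy[!is.na(copy$Count_group), ]

# Stack both sets, then sort by the day that owns the group and by Day.
# order() leaves ties in their original order, so the ID order within
# each Day is preserved and repeated IDs are kept as separate rows.
out <- rbind(orig, copy)
out <- out[order(-out$group_day, -out$Day),
           c("ID", "Day", "Count", "Count_group")]
rownames(out) <- NULL
out
```

Because the rows are matched on Day rather than on ID, non-sequential or repeated ID values do not affect the result. Sorting on the helper column group_day instead of Count_group avoids relying on Count_group decreasing from one day to the next.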