My goal is to make another column by summing the observation from the present day and all previous observations from the same ID by using the date (the data set is sorted in date and chr nr(ID). I will need the aggregation to start over when a new "id" is presented.
there might be som NA's, they should be considered as null
"Doseringer_pr_kg_dyr" is the observation.
CHR_NR DATO_AFSLUT Doseringer_pr_kg_dyr brugstid
10358 2018-08-06 29416.67 31
10358 2018-09-06 104682.27 36
10358 2018-10-12 10333.33 26
10358 2018-11-07 10090.91 27
10358 2018-12-04 8000.00 NA
13168 2012-01-23 12042.25 2
13168 2012-01-25 9000.00 42
13168 2012-03-07 44450.70 19
13168 2012-03-26 35000.00 37
13168 2012-05-02 93478.26 70
I expect something ala:
CHR_NR DATO_AFSLUT Doseringer_pr_kg_dyr brugstid sum
10358 2018-11-07 10090.91 27 [108,6]+[109,3]
10358 2018-12-04 8000.00 NA [109,6]+[110,3]
13168 2012-01-23 12042.25 2 [111,3]
13168 2012-01-25 9000.00 42 [111,6]+[112,3]
13168 2012-03-07 44450.70 19 [112,6]+[113,3]
where [row, column] + [row, column] is summed in the new column.
i thought of one of the apply functions or a loop like:
agg<-function(dat) {
last_row <- 0
for ( row in dat ) {
if ( row[1] == last_row[1] ) {
row[6] <- last_row[6] + row[3]
} else {
row[6] <- row[3]
}
last_row <- row
}
}
From the comment:
# dput(head(a))
a <- structure(list(CHR_NR = c(10358, 10358, 10358, 10358, 10358, 10358),
DATO_AFSLUT = structure(c(15349, 15387, 15426, 15441, 15455, 15476),
Level = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real ),
Doseringer_pr_kg_dyr = c(276152.688936271, 161616.695196052, 127549.514333096, 13333.3333333333, 86255.3897180763, 31034.1151385928 ),
brugstid = c(38, 39, 15, 14, 21, 15),
i = c(7267.17602463871, 4144.01782553979, 8503.30095553976, 952.380952380952, 4107.39951038459, 2068.94100923952)),
row.names = 6:11, class = "data.frame")