I have a data frame "DF" like this:
Flight.Start Flight.End Device Partner Creative Days.in.Flight
2015-08-31 2015-08-31 Standard MSN Video 35
What I need to do is "blow it up" like so:
Flight.Start Flight.End Date Device Partner Creative Days.in.Flight
2015-08-31 2015-10-04 2015-08-31 Standard MSN Video 35
2015-08-31 2015-10-04 2015-09-01 Standard MSN Video 35
2015-08-31 2015-10-04 2015-09-02 Standard MSN Video 35
2015-08-31 2015-10-04 2015-09-03 Standard MSN Video 35
2015-08-31 2015-10-04 2015-09-04 Standard MSN Video 35
2015-08-31 2015-10-04 2015-09-05 Standard MSN Video 35
2015-08-31 2015-10-04 2015-09-06 Standard MSN Video 35
2015-08-31 2015-10-04 2015-09-07 Standard MSN Video 35
ETC...... until Date variable hits 2015-10-04, then move on to next duplication
Essentially every row gets duplicated by the amount of days in flight - 1 (since the row that already exists can account for a single day in the interval, and then a new column "Date" is filled out for the relevant dates for within that flight. So if a row has a start and end date of 9/1 and 9/5 respectively, 4 duplicate rows would be appended to the one already existing, a new column would be created (Date), and the date sequence of whatever the flight start and end dates are for the original row will fill out the column values.
All date values are formatted as date, days in flight is a num, and the rest are factors.
EDIT
In response to the duplicate question flagging:
To clarify, this is NOT like the case that has been flagged as a duplicate, because my question is not really focused on how to duplicate based on days in flight (I already know how to do that!), but rather how I can then add column to that output data frame and sequentially insert dates within the corresponding flight period. Thanks for the heads up...
Here's a way to do it with base R:
Or using
data.table
, we convert the 'data.frame' to 'data.table' (setDT(mydf)
), replicate the sequence of rows by 'Days.in.Flight', based on that index, we subset the dataset (.SD[rep(...
), grouped by 'Flight.Start', and 'Flight.End', we create the 'Date' column.Here is one way with
splitstackshape
anddplyr
. UsingexpandRows()
from thesplitstackshape
package, you can expand your data frame as you described. Then, you want to add a sequence of dates usingmutate()
. What I did was to group the data by the combination ofFlight.Start
andFlight.End
, and useseq()
to create a sequence of date for each group.first()
is taking the first element ofFlight.Start
andFlight.End
. In this way, you can create the sequence you want. I hope this will help you.DATA and CODE