I have a data frame containing two columns: a grouping variable and an interval over which that grouping variable holds. I have another data frame with a date column and a value column. How can I join these two tables reasonably efficiently using dplyr/tidyverse functions?
library(dplyr)
library(lubridate)
# tibble() replaces the deprecated data_frame(); dates run from Jan 1 to Jan 20
# so that they line up with the three periods defined in gy.
ty <- tibble(date = mdy(paste(1, seq(20), 2017, sep = "/")),
             y = c(rnorm(7), rnorm(7, mean = 2), rnorm(6, mean = -1)))
gy <- tibble(period = interval(mdy(c("01/01/2017", "01/08/2017", "01/15/2017")),
                               mdy(c("01/07/2017", "01/14/2017", "01/20/2017"))),
             batch = c(1, 2, 3))
I want to build the table that is equivalent to:
ty %>% mutate(batch = c(rep(1, 7), rep(2, 7), rep(3, 6)))
Ideally, this should work reasonably quickly on data sets of up to 1,000,000 rows. Better still if it works on 100,000,000 :).
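To make the target concrete, here is a minimal sketch of one possible lookup, not a definitive solution. It assumes the periods never overlap, and it uses base R's `findInterval()` (a binary search over the sorted period starts) rather than a true dplyr join. The names `lookup` and `assign_batch` are hypothetical helpers made up for this illustration, and the example data uses dates from Jan 1 to Jan 20 so that every date falls inside a period.

```r
library(dplyr)
library(lubridate)

# Example data in the shape described in the question (dates Jan 1..Jan 20).
ty <- tibble(date = mdy(paste(1, seq(20), 2017, sep = "/")),
             y = rnorm(20))
gy <- tibble(period = interval(mdy(c("01/01/2017", "01/08/2017", "01/15/2017")),
                               mdy(c("01/07/2017", "01/14/2017", "01/20/2017"))),
             batch = c(1, 2, 3))

# Flatten the intervals into plain start/end Date columns, sorted by start.
# Assumption: periods do not overlap.
lookup <- tibble(start = as_date(int_start(gy$period)),
                 end   = as_date(int_end(gy$period)),
                 batch = gy$batch) %>%
  arrange(start)

# Hypothetical helper: return the batch whose period contains each date,
# or NA when a date falls before, between, or after the periods.
assign_batch <- function(dates, lookup) {
  # Index of the last period whose start is <= the date (0 if none).
  idx <- findInterval(as.numeric(dates), as.numeric(lookup$start))
  idx[idx == 0L] <- NA_integer_
  # The date must also be on or before that period's end.
  inside <- !is.na(idx) & dates <= lookup$end[idx]
  ifelse(inside, lookup$batch[idx], NA_real_)
}

result <- ty %>% mutate(batch = assign_batch(date, lookup))
```

Since the lookup is a vectorised binary search, its cost should grow roughly with the number of dates times the log of the number of periods, but I have not benchmarked it at the sizes mentioned above.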