I have a data frame (“df”) containing multiple time series (value ~ time) whose observations are grouped by three factors: temp, rep, and species. I need to trim the lower and upper ends of each time series, but the thresholds are group-specific (e.g. remove observations below 2 and above 10 where temp = 10, rep = 2, and species = “A”). I have an accompanying data frame (df_thresholds) that contains the grouping values and the minimum and maximum values I want to use for each group. Not all groups need trimming (I would like to update this file regularly, and it would guide where to trim df). Can anybody help me conditionally filter out these values by group? I have the following, which is close but not quite there. When I reverse the max and min boolean tests, I get zero observations.
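library(dplyr)  # provides %>% and filter(), used below

# temp and time have 16 elements and are recycled to 32 rows by data.frame()
df <- data.frame(
  species = c(rep("A", 16), rep("B", 16)),
  temp    = as.factor(c(rep(10, 4), rep(20, 4), rep(10, 4), rep(20, 4))),
  rep     = as.factor(c(rep(1, 8), rep(2, 8), rep(1, 8), rep(2, 8))),
  time    = rep(1:4, 4),
  value   = c(1, 4, 8, 16, 2, 4, 9, 16, 2, 4, 10, 16, 2, 4, 15, 16,
              2, 4, 6, 16, 1, 4, 8, 16, 1, 2, 8, 16, 2, 3, 4, 16)
)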
df_thresholds <- data.frame(
  species   = c("A", "A", "B"),
  temp      = as.factor(c(10, 20, 10)),
  rep       = as.factor(c(1, 1, 2)),
  min_value = c(2, 4, 2),
  max_value = c(10, 10, 9)
)
# desired outcome (bounds are inclusive; groups without thresholds are kept whole)
df_desired <- df[c(2:3, 6:7, 9:24, 26:27, 29:nrow(df)), ]
# attempt
df2 <- df
for (i in 1:nrow(df_thresholds)) {
  df2 <- df2 %>%
    filter(!(species == df_thresholds$species[i] &
             temp == df_thresholds$temp[i] &
             rep == df_thresholds$rep[i] &
             value > df_thresholds$min_value[i] &
             value < df_thresholds$max_value[i]))
}
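For reference, the loop above appears to drop the wrong rows: the negated test matches values strictly inside the (min, max) range, so the in-range observations are removed and the out-of-range ones are kept. Here is a minimal sketch of a corrected loop, assuming the bounds are meant to be inclusive (as df_desired implies). The name `th` is a helper introduced here, and the as.character() calls are only a precaution against factor-level mismatches.

```r
library(dplyr)

# Same example data as in the question.
df <- data.frame(
  species = c(rep("A", 16), rep("B", 16)),
  temp    = as.factor(c(rep(10, 4), rep(20, 4), rep(10, 4), rep(20, 4))),
  rep     = as.factor(c(rep(1, 8), rep(2, 8), rep(1, 8), rep(2, 8))),
  time    = rep(1:4, 4),
  value   = c(1, 4, 8, 16, 2, 4, 9, 16, 2, 4, 10, 16, 2, 4, 15, 16,
              2, 4, 6, 16, 1, 4, 8, 16, 1, 2, 8, 16, 2, 3, 4, 16)
)
df_thresholds <- data.frame(
  species   = c("A", "A", "B"),
  temp      = as.factor(c(10, 20, 10)),
  rep       = as.factor(c(1, 1, 2)),
  min_value = c(2, 4, 2),
  max_value = c(10, 10, 9)
)

df2 <- df
for (i in seq_len(nrow(df_thresholds))) {
  th <- df_thresholds[i, ]  # hypothetical helper: the i-th threshold row
  # Drop only rows that belong to this group AND fall outside [min, max].
  df2 <- df2 %>%
    filter(!(as.character(species) == as.character(th$species) &
             as.character(temp) == as.character(th$temp) &
             as.character(rep) == as.character(th$rep) &
             (value < th$min_value | value > th$max_value)))
}
```

The only logical change from the attempt is the last condition: "outside the range" is `value < min | value > max`, whereas `value > min & value < max` selects the rows that should be kept.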
EDIT: Here's the solution I implemented per suggestions below.
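# attach each group's thresholds; groups without an entry get NA
df_test <- left_join(df, df_thresholds, by = c("species", "temp", "rep"))

# placeholder bounds for groups that need no trimming
# (this assumes every value lies between 0 and 999)
df_test$min_value[is.na(df_test$min_value)] <- 0
df_test$max_value[is.na(df_test$max_value)] <- 999

# keep observations within the inclusive [min_value, max_value] range
df_test2 <- df_test %>%
  filter(value >= min_value & value <= max_value)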
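A possible variant of the join approach, sketched under the assumption that dplyr is available: filling missing bounds with -Inf and Inf instead of 0 and 999 avoids depending on the range of the data, and the helper columns are dropped at the end. The name `df_trimmed` is introduced here for illustration only.

```r
library(dplyr)

# Same example data as in the question.
df <- data.frame(
  species = c(rep("A", 16), rep("B", 16)),
  temp    = as.factor(c(rep(10, 4), rep(20, 4), rep(10, 4), rep(20, 4))),
  rep     = as.factor(c(rep(1, 8), rep(2, 8), rep(1, 8), rep(2, 8))),
  time    = rep(1:4, 4),
  value   = c(1, 4, 8, 16, 2, 4, 9, 16, 2, 4, 10, 16, 2, 4, 15, 16,
              2, 4, 6, 16, 1, 4, 8, 16, 1, 2, 8, 16, 2, 3, 4, 16)
)
df_thresholds <- data.frame(
  species   = c("A", "A", "B"),
  temp      = as.factor(c(10, 20, 10)),
  rep       = as.factor(c(1, 1, 2)),
  min_value = c(2, 4, 2),
  max_value = c(10, 10, 9)
)

df_trimmed <- df %>%
  left_join(df_thresholds, by = c("species", "temp", "rep")) %>%
  # Groups with no threshold row get unbounded limits, so nothing is trimmed.
  mutate(min_value = coalesce(min_value, -Inf),
         max_value = coalesce(max_value, Inf)) %>%
  filter(value >= min_value, value <= max_value) %>%
  # Remove the helper columns so the result has the same columns as df.
  select(-min_value, -max_value)
```

Because the thresholds live in their own table, updating df_thresholds and re-running the pipeline should be enough to change which groups are trimmed.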