R Extracting following days after signal in time s

Posted 2019-08-26 06:56


In my example I have a data frame with three columns: date, signal and value. Now I want to mutate new columns that are conditioned on the signals.

If there is a signal on the previous day (ifelse(lag(signal) == 1, ...)), then give me the values of the two following days (else NA). But in this case I have three different signals (1, 2, 3).

With this code I get only the first following day for signal 1. But I also want to have the second following day. And I want to calculate multiple columns for the different signals (maybe with crossing the number of following days with the signals).

df %>% mutate(calculation = ifelse(lag(signal) == 1, value, NA))

Here is my example data:



library(dplyr)      # tibble(), mutate(), lag(), %>%
library(lubridate)  # today()

df <- tibble(date   = today()+0:10,
             signal = c(0,1,0,0,2,0,0,3,0,0,0),
             value  = sample.int(n=11))
# A tibble: 11 x 3
   date       signal value
   <date>      <dbl> <int>
 1 2019-07-23      0     3
 2 2019-07-24      1    11
 3 2019-07-25      0     2
 4 2019-07-26      0     6
 5 2019-07-27      2    10
 6 2019-07-28      0     5
 7 2019-07-29      0     4
 8 2019-07-30      3     9
 9 2019-07-31      0     8
10 2019-08-01      0     1
11 2019-08-02      0     7

And here is my desired output:

# A tibble: 11 x 3
   date       signal value   new_col_day1_sig_1  new_col_day2_sig_1  new_col_day1_sig_2
   <date>      <dbl> <int>
 1 2019-07-23      0     3                 NA                   NA                   NA
 2 2019-07-24      1    11                 NA                   NA                   NA
 3 2019-07-25      0     2                  2                    2                   NA
 4 2019-07-26      0     6                 NA                    6                   NA
 5 2019-07-27      2    10                 NA                   NA                   NA
 6 2019-07-28      0     5                 NA                   NA                    5
 7 2019-07-29      0     4                 NA                   NA                   NA
 8 2019-07-30      3     9                 NA                   NA                   NA
 9 2019-07-31      0     8                 NA                   NA                   NA
10 2019-08-01      0     1                 NA                   NA                   NA
11 2019-08-02      0     7                 NA                   NA                   NA

...and so on (the next columns should be new_col_day2_sig_2, new_col_day1_sig_3, new_col_day2_sig_3).

I would like to have a dynamic solution, because I want not only the two following days, but up to seven consecutive days. And the solution should take the different signals (1, 2, 3) into account.

And the solution should also work with overlapping events.

Can you help me to solve my problem?


df %>% 
   mutate(calculation=ifelse( (lag(signal, 2) == 1) | (lag(signal) == 1), value, NA))

This is of course not good enough, since you want to have an extensible solution. Let us try harder:

# TRUE where x was 1 on any of the previous 1..n rows
anylag <- function(x, n) {
  l <- lapply(1:n, function(i) lag(x, i) == 1)
  Reduce("|", l)
}

df %>% mutate(calculation=ifelse(anylag(signal, 3), value, NA))


# A tibble: 11 x 4
   date       signal value calculation
   <date>      <dbl> <int>       <int>
 1 2019-07-19      0     4          NA
 2 2019-07-20      1     8          NA
 3 2019-07-21      0    11          11
 4 2019-07-22      0    10          10
 5 2019-07-23      0     7           7
 6 2019-07-24      0     1          NA
 7 2019-07-25      1     3          NA
 8 2019-07-26      0     9           9
 9 2019-07-27      0     2           2
10 2019-07-28      0     6           6
11 2019-07-29      0     5          NA
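
The single-signal helper above can be extended to one column per signal and per number of days. The following is only a minimal sketch under stated assumptions: `anylag_sig()` and `add_signal_cols()` are hypothetical helper names, and it assumes, as the desired output in the question suggests, that the day-2 column also contains the day-1 value (that is, "the signal occurred within the previous n days"). The example values are copied from the printout in the question.

```r
library(dplyr)

# TRUE where signal `s` occurred on any of the previous 1..n rows.
# Rows without enough history yield NA or FALSE, which ifelse() turns into NA below.
anylag_sig <- function(x, n, s) {
  hits <- lapply(seq_len(n), function(i) dplyr::lag(x, i) == s)
  Reduce(`|`, hits)
}

# Hypothetical helper: add one column per (day, signal) combination.
add_signal_cols <- function(df, signals = 1:3, max_days = 2) {
  # expand.grid() varies the first factor fastest: day1_sig_1, day2_sig_1, day1_sig_2, ...
  grid <- expand.grid(day = seq_len(max_days), sig = signals)
  for (k in seq_len(nrow(grid))) {
    d <- grid$day[k]
    s <- grid$sig[k]
    col <- sprintf("new_col_day%d_sig_%d", d, s)
    df[[col]] <- ifelse(anylag_sig(df$signal, d, s), df$value, NA)
  }
  df
}

# Example data from the question (values fixed instead of sampled)
df <- tibble(date   = as.Date("2019-07-23") + 0:10,
             signal = c(0L, 1L, 0L, 0L, 2L, 0L, 0L, 3L, 0L, 0L, 0L),
             value  = c(3L, 11L, 2L, 6L, 10L, 5L, 4L, 9L, 8L, 1L, 7L))

out <- add_signal_cols(df, signals = 1:3, max_days = 2)
print(out)
```

Setting max_days = 7 should give the seven-day variant. Overlapping events need no special handling here, because each column only asks whether a given signal occurred inside its window.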

Note: your signal is of type double. You should not use == or %in% to compare doubles, because of the limited floating-point precision. Either convert it to integer or use all.equal(). Consider this:

> 3*.1 / 3 * 10 
[1] 1
> 3*.1 / 3 * 10 == 1
[1] FALSE
> all.equal(3*.1 / 3 * 10, 1)
[1] TRUE
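
For use inside mutate() or ifelse(), an element-wise comparison is needed, while all.equal() returns a single result for the whole vector. A small sketch of two options, assuming dplyr is loaded (the vector `signal` below is made-up illustration data, not the data from the question):

```r
library(dplyr)

x <- 3 * 0.1 / 3 * 10

# all.equal() compares the objects as a whole; wrap it in isTRUE() for use in conditions
isTRUE(all.equal(x, 1))

# dplyr::near() compares element by element within a small tolerance
signal <- c(0, 1, 0, x)
near(signal, 1)

# Alternatively, store the signal as integer so that == is exact
signal_int <- as.integer(round(signal))
signal_int == 1L
```

Converting once to integer keeps the rest of the pipeline unchanged, whereas near() leaves the column as double and moves the tolerance into every comparison.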