I'm stuck trying to subset some panel data, i.e. ids within group, using dplyr
.
I want to exact all id
s, within each group, grp
that has a NUM
series with a minimum smaller than 2 and a maximum greater than 2. I've constructed a minimal working example below that should illustrate the issue.
I have been working with filter()
, row_number() == c(1,n())
, and tried to separate it out and merge, i.e. different types of _join
, it back together, but I am stuck and I am now turning to the SO community for help.
What I got
A tibble
like this,
df <- tibble(id = rep(0:1, c(8, 13)), grp = rep(c("01", "02"), c(13, 8)),
NUM = c(-4, -3, -2, -1, 1, 2, 3, 4, -3, -2, -1,
1, 2, -3, -2, -1, 1, 2, 3, 4, 5)) %>% group_by(id, grp)
df %>% print(n=21)
#> # A tibble: 21 x 3
#> # Groups: id, grp [3]
#> id grp NUM
#> <int> <chr> <dbl>
#> 1 0 01 -4
#> 2 0 01 -3
#> 3 0 01 -2
#> 4 0 01 -1
#> 5 0 01 1
#> 6 0 01 2
#> 7 0 01 3
#> 8 0 01 4
#> 9 1 01 -3
#> 10 1 01 -2
#> 11 1 01 -1
#> 12 1 01 1
#> 13 1 01 2
#> 14 1 02 -3
#> 15 1 02 -2
#> 16 1 02 -1
#> 17 1 02 1
#> 18 1 02 2
#> 19 1 02 3
#> 20 1 02 4
#> 21 1 02 5
What I am trying to get / desired outcome
df_out <- tibble(id = rep(0:1, c(9, 8)),
grp = rep(c("01", "02"), c(9, 8)),
NUM = c(-4, -3, -2, -1, 1, 2, 3,
4, 5, -3, -2, -1, 1, 2, 3, 4, 5)) %>% group_by(id, grp)
df_out
#> # A tibble: 17 x 3
#> # Groups: id, grp [3]
#> id grp NUM
#> <int> <chr> <dbl>
#> 1 0 01 -4
#> 2 0 01 -3
#> 3 0 01 -2
#> 4 0 01 -1
#> 5 0 01 1
#> 6 0 01 2
#> 7 0 01 3
#> 8 0 01 4
#> 9 1 02 -3
#> 10 1 02 -2
#> 11 1 02 -1
#> 12 1 02 1
#> 13 1 02 2
#> 14 1 02 3
#> 15 1 02 4
#> 16 1 02 5
Like so?
In addition, if one whichs to subset on an exact values, say the first
NUM
is-3
and lastNUM
is5
, i.e. row 9-16 in the original data, tThis can be done like this,The above is inspired by this SO answer.