I have the following df:
names sex
adam M
jill F
stewart M
jordan M
alica F
jordan F
How do I filter the rows so that I only get the names that appear as both M and F (in this case, jordan)?
We can group by 'names' and keep only the groups where the number of unique 'sex' values is greater than 1:
library(dplyr)
df %>%
  group_by(names) %>%
  filter(n_distinct(sex) > 1)
Another option is to group by 'names' and keep only the groups that contain both 'M' and 'F':
df %>%
  group_by(names) %>%
  filter(all(c("M", "F") %in% sex))
If every name appears at most once per sex, as in your example, you can simply find the rows whose name is duplicated:
dat[duplicated(dat$names),]
Example:
> dat <- data.frame(names = c("adam", "jill", "stewart", "jordan", "alicia", "jordan"),
+ sex = c("M", "F", "M", "M", "F", "F")
+ )
> dat
names sex
1 adam M
2 jill F
3 stewart M
4 jordan M
5 alicia F
6 jordan F
> dat[duplicated(dat$names),]
names sex
6 jordan F
Or, if you want a vector of names:
> as.character(dat[duplicated(dat$names),]$names)
[1] "jordan"
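Note that duplicated() only tells you that a name occurs more than once, so it would also flag a name listed twice with the same sex. If that can happen in your data, here is a minimal base R sketch (no dplyr) that checks for both values per name. It rebuilds the example data so it runs on its own; the variable names `both` and `both_names` are just illustrative.

```r
# Same data as in the example above, with every sex value quoted
dat <- data.frame(
  names = c("adam", "jill", "stewart", "jordan", "alicia", "jordan"),
  sex   = c("M", "F", "M", "M", "F", "F")
)

# For each name, check whether both "M" and "F" occur among its rows
both <- tapply(dat$sex, dat$names, function(x) all(c("M", "F") %in% x))

# Names that have both values
both_names <- names(both)[both]
both_names
# [1] "jordan"

# All rows belonging to those names
dat[dat$names %in% both_names, ]
```

This is the same test as the all(c("M", "F") %in% sex) filter above, just done with tapply() instead of group_by().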