This question already has an answer here:
filter for complete cases in data.frame using dplyr (case-wise deletion) (6 answers)
I tried to remove NAs from the subset using dplyr piping. Does my output indicate that I missed a step? I'm trying to learn how to write functions using dplyr:
> outcome.df%>%
+ group_by(Hospital,State)%>%
+ arrange(desc(HeartAttackDeath,na.rm=TRUE))%>%
+ head()
Source: local data frame [6 x 5]
Groups: Hospital, State
Hospital State HeartAttackDeath
1 ABBEVILLE AREA MEDICAL CENTER SC NA
2 ABBEVILLE GENERAL HOSPITAL LA NA
3 ABBOTT NORTHWESTERN HOSPITAL MN 12.3
4 ABILENE REGIONAL MEDICAL CENTER TX 17.2
5 ABINGTON MEMORIAL HOSPITAL PA 14.3
6 ABRAHAM LINCOLN MEMORIAL HOSPITAL IL NA
Variables not shown: HeartFailureDeath (dbl), PneumoniaDeath
(dbl)
I don't think desc takes an na.rm argument... I'm actually surprised it doesn't throw an error when you give it one. If you just want to remove NAs, use na.omit:
outcome.df %>%
na.omit() %>%
group_by(Hospital, State) %>%
arrange(desc(HeartAttackDeath)) %>%
head()
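To see what na.omit does on its own, here is a minimal sketch on a made-up data frame. The toy data and its values are invented for illustration and are not taken from outcome.df. Note that it drops a row when any column is NA, not only HeartAttackDeath:

```r
# Hypothetical toy data; the values are illustrative only.
toy <- data.frame(
  Hospital         = c("A", "B", "C", "D"),
  HeartAttackDeath = c(NA, 12.3, 17.2, 14.3),
  PneumoniaDeath   = c(10.1, NA, 11.5, 9.8),
  stringsAsFactors = FALSE
)

# na.omit() removes a row if ANY of its columns is NA.
cleaned <- na.omit(toy)

nrow(toy)         # 4 rows before
nrow(cleaned)     # 2 rows after
cleaned$Hospital  # "C" "D": "A" and "B" each had one NA and are gone
```

If losing row "B" here is not what you want, na.omit on the whole data frame is too aggressive for your case.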
If you only want to remove NAs from the HeartAttackDeath column, filter with is.na:
outcome.df %>%
filter(!is.na(HeartAttackDeath)) %>%
group_by(Hospital, State) %>%
arrange(desc(HeartAttackDeath)) %>%
head()
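As a rough illustration of the column-specific filter, the sketch below runs the same steps on a small invented data frame (the toy data is an assumption for the example, not part of outcome.df, and the group_by step is left out for brevity). A row with an NA in some other column is kept:

```r
library(dplyr)

# Hypothetical toy data; the values are illustrative only.
toy <- data.frame(
  Hospital         = c("A", "B", "C", "D"),
  HeartAttackDeath = c(NA, 12.3, 17.2, 14.3),
  PneumoniaDeath   = c(10.1, NA, 11.5, 9.8),
  stringsAsFactors = FALSE
)

result <- toy %>%
  filter(!is.na(HeartAttackDeath)) %>%  # drop rows only where this column is NA
  arrange(desc(HeartAttackDeath))       # highest rate first

result$Hospital  # "C" "D" "B": "B" survives despite its NA in PneumoniaDeath
```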
As pointed out at the dupe, complete.cases can also be used, but it's a bit trickier to put in a chain because it takes a data frame as an argument but returns a logical vector (one value per row) rather than a data frame. So you could use it like this:
outcome.df %>%
filter(complete.cases(.)) %>%
group_by(Hospital, State) %>%
arrange(desc(HeartAttackDeath)) %>%
head()
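In the chain above, the . is the pipe's placeholder for the data frame coming in from the left. To make the return value of complete.cases concrete, here is a small base R sketch on invented data (the toy data frame is hypothetical). It also shows one way to restrict the check to a single column by passing a one-column data frame:

```r
# Hypothetical toy data; the values are illustrative only.
toy <- data.frame(
  Hospital         = c("A", "B", "C", "D"),
  HeartAttackDeath = c(NA, 12.3, 17.2, 14.3),
  PneumoniaDeath   = c(10.1, NA, 11.5, 9.8),
  stringsAsFactors = FALSE
)

# One logical value per row: TRUE when the row has no NA in any column.
keep_all <- complete.cases(toy)
keep_all  # FALSE FALSE TRUE TRUE

# Checking only one column: pass just that column.
keep_ha <- complete.cases(toy["HeartAttackDeath"])
keep_ha   # FALSE TRUE TRUE TRUE

# The logical vector is then used to index rows.
toy[keep_ha, "Hospital"]  # "B" "C" "D"
```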