I have a dataframe as follows:
    hospital <- c("PROVIDENCE ALASKA MEDICAL CENTER", "ALASKA REGIONAL HOSPITAL", "FAIRBANKS MEMORIAL HOSPITAL",
                  "CRESTWOOD MEDICAL CENTER", "BAPTIST MEDICAL CENTER EAST", "ARKANSAS HEART HOSPITAL",
                  "MEDICAL CENTER NORTH LITTLE ROCK", "CRITTENDEN MEMORIAL HOSPITAL")
    state <- c("AK", "AK", "AK", "AL", "AL", "AR", "AR", "AR")
    rank <- c(1,2,3,1,2,1,2,3)
    df <- data.frame(hospital, state, rank)
    df
                              hospital state rank
    1 PROVIDENCE ALASKA MEDICAL CENTER    AK    1
    2         ALASKA REGIONAL HOSPITAL    AK    2
    3      FAIRBANKS MEMORIAL HOSPITAL    AK    3
    4         CRESTWOOD MEDICAL CENTER    AL    1
    5      BAPTIST MEDICAL CENTER EAST    AL    2
    6          ARKANSAS HEART HOSPITAL    AR    1
    7 MEDICAL CENTER NORTH LITTLE ROCK    AR    2
    8     CRITTENDEN MEMORIAL HOSPITAL    AR    3
I would like to create a function, rankall, that takes a rank as its argument and returns, for each state, the hospital with that rank, or NA if the state has no hospital with that rank. For example, I want the output of rankall(rank=3) to look like this:
                           hospital state
    AK  FAIRBANKS MEMORIAL HOSPITAL    AK
    AL                         <NA>    AL
    AR CRITTENDEN MEMORIAL HOSPITAL    AR
I've tried:
    rankall <- function(rank) {
      split_by_state <- split(df, df$state)
      ranked_hospitals <- lapply(split_by_state, function (x) {
        x[(x$rank==rank), ]
      })
      combined_ranked_hospitals <- do.call(rbind, ranked_hospitals)
      return(combined_ranked_hospitals[ ,1:2])
    }
But rankall(rank=3) returns:
                           hospital state
    AK  FAIRBANKS MEMORIAL HOSPITAL    AK
    AR CRITTENDEN MEMORIAL HOSPITAL    AR
This leaves out the NA values that I need to keep track of. Is there a way for R to recognize the empty rows in my list object within my function as NAs, rather than as empty rows? Is there another function besides lapply that would be more useful for this task?
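One direction that might work, shown here only as a minimal sketch rather than a definitive answer: look up the rank with match(), which returns NA when nothing matches, so indexing the hospital names with that result yields NA for states without the requested rank. The `data` parameter and the `stringsAsFactors = FALSE` setting are assumptions added for illustration and are not part of the attempt above.

```r
# Sample data from the question; stringsAsFactors = FALSE is an assumption
# so that hospital names are plain character strings in every R version.
hospital <- c("PROVIDENCE ALASKA MEDICAL CENTER", "ALASKA REGIONAL HOSPITAL",
              "FAIRBANKS MEMORIAL HOSPITAL", "CRESTWOOD MEDICAL CENTER",
              "BAPTIST MEDICAL CENTER EAST", "ARKANSAS HEART HOSPITAL",
              "MEDICAL CENTER NORTH LITTLE ROCK", "CRITTENDEN MEMORIAL HOSPITAL")
state <- c("AK", "AK", "AK", "AL", "AL", "AR", "AR", "AR")
rank <- c(1, 2, 3, 1, 2, 1, 2, 3)
df <- data.frame(hospital, state, rank, stringsAsFactors = FALSE)

# `data` is a hypothetical extra parameter; it defaults to the global df.
rankall <- function(rank, data = df) {
  by_state <- split(data, data$state)
  # match() gives the position of the first hospital with this rank, or NA
  # if the state has none; indexing a character vector by NA returns NA.
  hosp <- vapply(by_state,
                 function(x) as.character(x$hospital)[match(rank, x$rank)],
                 character(1))
  data.frame(hospital = hosp,
             state = names(by_state),
             row.names = names(by_state),
             stringsAsFactors = FALSE)
}

result <- rankall(rank = 3)
print(result)
```

Because every state contributes exactly one value, no empty rows are produced that rbind could drop, and the AL row carries NA in the hospital column.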
[ Note: This data frame is from the Coursera R Programming course. This is also my first post on Stack Overflow, and my first time learning programming. Thank you to all who offered solutions and advice; this forum is fantastic. ]