I'm sure this has been asked before, but I don't know what to search for, so I apologise in advance.
Let's say that I have the following data frame:
grades <- data.frame(a = 1:40, b = sample(45:100, 40))
Using dplyr, I want to create a new variable that indicates the grade the student received, based on the following criteria: 90-100 = excellent, 80-90 = very good, etc.
I thought I could get that result by nesting ifelse() inside of mutate():
grades %>%
mutate(ifelse(b >= 90, "excellent"),
ifelse(b >= 80 & b < 90, "very_good"),
ifelse(b >= 70 & b < 80, "fair"),
ifelse(b >= 60 & b < 70, "poor", "fail"))
This doesn't work, as I get the error message "argument no is missing, with no default". I thought the "no" would be the "fail" at the end, but obviously I'm getting the syntax wrong.
I can get this to work if I first filter the original data individually, and then call ifelse, as follows:
a <- grades %>%
filter( b >= 90) %>%
mutate(final = ifelse(b >= 90, "excellent"))
and then rbind a, b, c, etc. Obviously, this isn't how I want to do it, but I wanted to understand the syntax of ifelse(). I'm guessing the latter works because there aren't any values that don't meet the criteria, but I still can't figure out how to get it to work when there is more than one ifelse.
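All of the ifelse calls need to be nested within each other, so that the no argument of each call is the next ifelse call and only the innermost one ends in "fail".
Alternatively, define vectors with the levels and labels and then use cut on the b column.
Or using data.table: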
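A minimal sketch of the data.table variant, assuming the data.table package is installed. The vector names breaks and labels are illustrative and not taken from the original answer, and right = FALSE is used so that a score of exactly 90 lands in "excellent":

```r
library(data.table)

set.seed(42)  # only so that the sketch is reproducible
grades <- data.frame(a = 1:40, b = sample(45:100, 40))

# Illustrative names: band boundaries and the matching labels
breaks <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("fail", "poor", "fair", "very_good", "excellent")

# Convert to a data.table by reference and add the new column in place
setDT(grades)[, final := cut(b, breaks = breaks, labels = labels, right = FALSE)]
```

The := operator modifies grades in place, so no reassignment is needed.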
Or simply in base R:
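A base R sketch along the same lines. Again, breaks and labels are illustrative names, and right = FALSE is assumed so that the lower bound of each band is included:

```r
set.seed(42)  # only so that the sketch is reproducible
grades <- data.frame(a = 1:40, b = sample(45:100, 40))

# Illustrative names: band boundaries and the matching labels
breaks <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("fail", "poor", "fair", "very_good", "excellent")

# Left-closed intervals: [60, 70) is "poor", [90, Inf) is "excellent"
grades$final <- cut(grades$b, breaks = breaks, labels = labels, right = FALSE)

# Boundary check on a few hand-picked scores
boundary <- as.character(cut(c(59, 60, 90, 100), breaks = breaks,
                             labels = labels, right = FALSE))
# boundary is "fail" "poor" "excellent" "excellent"
```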
Note
After taking another close look at your initial approach, I noticed that you would need to include right = FALSE in the cut call, because, for example, 90 points should be "excellent", not just "very good". The right argument defines on which side each interval is closed (left or right); the default is closed on the right, which is slightly different from the OP's initial approach. So the dplyr version needs right = FALSE in the cut call, and accordingly in the other options.
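To tie the answers together, here is a sketch of the dplyr version with right = FALSE next to the fully nested ifelse() form. It assumes dplyr is installed; the names breaks, labels, final_cut and final_ifelse are illustrative and not from the original posts:

```r
library(dplyr)

set.seed(42)  # only so that the sketch is reproducible
grades <- data.frame(a = 1:40, b = sample(45:100, 40))

# Illustrative names: band boundaries and the matching labels
breaks <- c(-Inf, 60, 70, 80, 90, Inf)
labels <- c("fail", "poor", "fair", "very_good", "excellent")

result <- grades %>%
  mutate(
    # cut() with left-closed intervals, so 90 falls in [90, Inf)
    final_cut = cut(b, breaks = breaks, labels = labels, right = FALSE),
    # Nested ifelse(): each `no` argument is the next ifelse() call
    final_ifelse = ifelse(b >= 90, "excellent",
                   ifelse(b >= 80, "very_good",
                   ifelse(b >= 70, "fair",
                   ifelse(b >= 60, "poor", "fail"))))
  )

# Compare the two columns after converting the factor to character
same <- all(as.character(result$final_cut) == result$final_ifelse)
```

Because each ifelse() is only reached when the earlier tests were FALSE, the upper bounds (b < 90 and so on) from the question are not needed in the nested form.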