I have written a function in R that will convert a data frame containing letter grades into numeric grades. I then use sapply() on each column of the data frame. Is there a simpler way to do this that doesn't require three separate calls of sapply? Is there a way to apply a function to every element of a data frame instead of every row or column?
The source data, "grades", looks like this:
grades <- read.table("Grades.txt", header = TRUE)
head(grades)
  final_exam quiz_avg homework_avg
1          C        A            A
2         C-       B-            A
3         D+       B+            A
4         B+       B+            A
5          F       B+            A
6          B       A-            A
My "convert_grades" function looks like this:
convert_grades <- function(x) {
  if (x == "A+") {
    x <- 4.3
  } else if (x == "A") {
    x <- 4
  } else if (x == "A-") {
    x <- 3.7
  } else if (x == "B+") {
    x <- 3.3
  } else if (x == "B") {
    x <- 3
  } else if (x == "B-") {
    x <- 2.7
  } else if (x == "C+") {
    x <- 2.3
  } else if (x == "C") {
    x <- 2
  } else if (x == "C-") {
    x <- 1.7
  } else if (x == "D+") {
    x <- 1.3
  } else if (x == "D") {
    x <- 1
  } else if (x == "D-") {
    x <- 0.7
  } else if (x == "F") {
    x <- 0
  } else {
    x <- NA
  }
  return(x)
}
My current approach is as follows:
num_grades <- grades
num_grades[, 1] <- sapply(grades[, 1], convert_grades)
num_grades[, 2] <- sapply(grades[, 2], convert_grades)
num_grades[, 3] <- sapply(grades[, 3], convert_grades)
head(num_grades)
  final_exam quiz_avg homework_avg
1        2.0      4.0            4
2        1.7      2.7            4
3        1.3      3.3            4
4        3.3      3.3            4
5        0.0      3.3            4
6        3.0      3.7            4
I would rewrite your convert_grades function as follows:
Then, I would do the conversion like this:
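The code for this answer is not shown here, so the following is only a sketch of one way such a rewrite could look, assuming a named-vector lookup in place of the if/else chain. The name grade_points and the inline sample data are illustrative assumptions; the point values are the ones from the question. Because indexing a named vector is already vectorized, a single lapply() over the data frame converts every column at once.

```r
# Hypothetical lookup table: letter grade -> grade points (values from the question)
grade_points <- c("A+" = 4.3, "A" = 4, "A-" = 3.7,
                  "B+" = 3.3, "B" = 3, "B-" = 2.7,
                  "C+" = 2.3, "C" = 2, "C-" = 1.7,
                  "D+" = 1.3, "D" = 1, "D-" = 0.7,
                  "F" = 0)

convert_grades <- function(x) {
  # as.character() guards against factor columns, which would otherwise
  # be indexed by their integer codes; unknown grades come back as NA
  unname(grade_points[as.character(x)])
}

# Sample data mirroring head(grades) from the question
grades <- data.frame(
  final_exam   = c("C", "C-", "D+", "B+", "F", "B"),
  quiz_avg     = c("A", "B-", "B+", "B+", "B+", "A-"),
  homework_avg = c("A", "A", "A", "A", "A", "A"),
  stringsAsFactors = FALSE
)

# Assigning into num_grades[] keeps the data frame shape and column names
num_grades <- grades
num_grades[] <- lapply(grades, convert_grades)
head(num_grades)
```

The `num_grades[] <-` form replaces the three separate sapply() calls with one statement and works for any number of columns.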
Here is a pretty fast hash approach that shines the more grades you have:
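The code for this answer is not shown here either, and it may have relied on a package. As a rough sketch of the idea using only base R, a hashed environment can serve as the lookup table. The names grade_hash and hash_lookup are hypothetical, and the sample data is an assumption mirroring the question.

```r
# Letter grades and their point values (taken from the question)
keys   <- c("A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-",
            "D+", "D", "D-", "F")
points <- c(4.3, 4, 3.7, 3.3, 3, 2.7, 2.3, 2, 1.7, 1.3, 1, 0.7, 0)

# Hypothetical hash table: an environment with hashing enabled
grade_hash <- new.env(hash = TRUE)
for (i in seq_along(keys)) {
  assign(keys[i], points[i], envir = grade_hash)
}

hash_lookup <- function(x) {
  # mget() fetches many keys in one call; keys that are not in the
  # table are filled with NA_real_ instead of raising an error
  found <- mget(as.character(x), envir = grade_hash, ifnotfound = NA_real_)
  unname(unlist(found))
}

# Sample data mirroring head(grades) from the question
grades <- data.frame(
  final_exam   = c("C", "C-", "D+", "B+", "F", "B"),
  quiz_avg     = c("A", "B-", "B+", "B+", "B+", "A-"),
  homework_avg = c("A", "A", "A", "A", "A", "A"),
  stringsAsFactors = FALSE
)

num_grades <- grades
num_grades[] <- lapply(grades, hash_lookup)
head(num_grades)
```

This sketch assumes the columns contain no missing or empty strings; those would need to be handled before calling mget().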
First, vectorize your function. You could do this with ifelse, or:
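The alternative this answer had in mind is not shown; one possibility, sketched here as an assumption, is to wrap a scalar function with base R's Vectorize(). The names convert_one and convert_grades_v are hypothetical, and switch() stands in for the question's if/else chain with the same point values.

```r
# Scalar conversion: handles exactly one grade at a time
convert_one <- function(x) {
  switch(as.character(x),
         "A+" = 4.3, "A" = 4, "A-" = 3.7,
         "B+" = 3.3, "B" = 3, "B-" = 2.7,
         "C+" = 2.3, "C" = 2, "C-" = 1.7,
         "D+" = 1.3, "D" = 1, "D-" = 0.7,
         "F"  = 0,
         NA_real_)  # unnamed last argument is the default for unknown grades
}

# Vectorize() wraps the scalar function so it accepts a whole vector;
# USE.NAMES = FALSE stops the input strings from becoming element names
convert_grades_v <- Vectorize(convert_one, USE.NAMES = FALSE)

# Sample data mirroring head(grades) from the question
grades <- data.frame(
  final_exam   = c("C", "C-", "D+", "B+", "F", "B"),
  quiz_avg     = c("A", "B-", "B+", "B+", "B+", "A-"),
  homework_avg = c("A", "A", "A", "A", "A", "A"),
  stringsAsFactors = FALSE
)

# One call converts every column
num_grades <- grades
num_grades[] <- lapply(grades, convert_grades_v)
head(num_grades)
```

Vectorize() still calls the scalar function once per element under the hood, so it is a convenience rather than a speed-up.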