I was looking for a way to format large numbers in R as 2.3K or 5.6M. I found this solution on SO. Turns out, it shows some strange behaviour for some input vectors.
Here is what I am trying to understand:
# Test vector with weird behaviour
x <- c(302.456500093388, 32553.3619756151, 3323.71232001074, 12065.4076372462,
       0, 6270.87962956305, 383.337515655172, 402.20778095643, 19466.0204345063,
       1779.05474064539, 1467.09928489114, 3786.27112222457, 2080.08078309959,
       51114.7097545816, 51188.7710104291, 59713.9414049798)
# Formatting function for large numbers
comprss <- function(tx) {
  div <- findInterval(as.numeric(gsub("\\,", "", tx)),
                      c(1, 1e3, 1e6, 1e9, 1e12))
  paste(round(as.numeric(gsub("\\,", "", tx)) / 10^(3 * (div - 1)), 1),
        c('', 'K', 'M', 'B', 'T')[div], sep = '')
}
# Compare outputs for the following three commands
x
comprss(x)
sapply(x, comprss)
We can see that comprss(x) produces 0K as the 5th element, which is weird, but comprss(x[5]) gives us the expected result. The 6th element is even weirder.
As far as I know, all the functions used in the body of comprss are vectorised. Then why do I still need to sapply my way out of this?
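A small diagnostic sketch that may help narrow this down. It is not part of the original SO solution: the names breaks and suffixes are introduced here only for illustration, and the test vector is shortened to three values. It pulls apart the intermediate values that comprss builds, on the assumption that the problem lies in the suffix lookup rather than in the arithmetic.

```r
# Hypothetical diagnostic: inspect the intermediate values used inside comprss()
x <- c(302.456500093388, 0, 6270.87962956305)  # shortened test vector

breaks   <- c(1, 1e3, 1e6, 1e9, 1e12)      # same breakpoints as in comprss()
suffixes <- c('', 'K', 'M', 'B', 'T')      # same suffixes as in comprss()

# Interval index for each element; 0 lies below the first break
div <- findInterval(x, breaks)
print(div)

# Suffix lookup; an index of 0 selects nothing in R
looked_up <- suffixes[div]
print(looked_up)

# Compare the lengths of the two vectors that paste() would combine
print(length(x))
print(length(looked_up))
```

If the two lengths differ, paste would recycle the shorter suffix vector, which might explain why the suffixes shift from the zero onwards when the whole vector is passed in, but not when each element is handled on its own by sapply.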