How to format a number as percentage in R?

Posted 2019-01-03 23:14

One of the things that used to perplex me as a newbie to R was how to format a number as a percentage for printing.

For example, display 0.12345 as 12.345%. I have a number of workarounds for this, but none of these seem to be "newbie friendly". For example:

set.seed(1)
m <- runif(5)

paste(round(100*m, 2), "%", sep="")
[1] "26.55%" "37.21%" "57.29%" "90.82%" "20.17%"

sprintf("%1.2f%%", 100*m)
[1] "26.55%" "37.21%" "57.29%" "90.82%" "20.17%"

Question: Is there a base R function to do this? Alternatively, is there a widely used package that provides a convenient wrapper?


Despite searching for something like this in ?format, ?formatC and ?prettyNum, I have yet to find a suitably convenient wrapper in base R. ??"percent" didn't yield anything useful. library(sos); findFn("format percent") returns 1250 hits - so again not useful. ggplot2 has a function percent but this gives no control over rounding accuracy.

Tags: r formatting
9 answers
可以哭但决不认输i
Reply #2 · 2019-01-03 23:38

An update, several years later:

These days there is a percent function in the scales package, as documented in krlmlr's answer. Use that instead of my hand-rolled solution.


Try something like

percent <- function(x, digits = 2, format = "f", ...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}

With usage, e.g.,

x <- c(-1, 0, 0.1, 0.555555, 1, 100)
percent(x)

(If you prefer, change the format from "f" to "g".)
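To make the effect of the format argument concrete, here is a small sketch that reruns the usage above with both "f" and "g". It only uses the percent helper from this answer; the names pct_f and pct_g are just illustrative. With "f", digits means decimal places; with "g", it means significant digits, so large values fall back to scientific notation.

```r
# Same helper as defined in the answer above.
percent <- function(x, digits = 2, format = "f", ...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}

x <- c(-1, 0, 0.1, 0.555555, 1, 100)

# format = "f": digits is the number of decimal places.
pct_f <- percent(x)
print(pct_f)
# "-100.00%" "0.00%" "10.00%" "55.56%" "100.00%" "10000.00%"

# format = "g": digits is the number of significant digits, so
# 55.5555 becomes "56%" and values of 100 or more use scientific notation.
pct_g <- percent(x, format = "g")
print(pct_g)
```

For display purposes "f" is usually the safer default; "g" is mainly useful when the values span several orders of magnitude.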

我想做一个坏孩纸
Reply #3 · 2019-01-03 23:38

I benchmarked these answers for speed and was surprised to see percent from the scales package so widely recommended, given how slow it is. I imagine its advantage is the automatic detection of a suitable precision, but if you know what your data looks like, it seems clear that it should be avoided.

Here are the results from formatting a vector of 100,000 proportions in (0,1) as percentages with 2 decimal places:

library(microbenchmark)
x = runif(1e5)
microbenchmark(times = 100L, andrie1(), andrie2(), richie(), krlmlr())
# Unit: milliseconds
#   expr       min        lq      mean    median        uq       max
# 1 andrie1()  91.08811  95.51952  99.54368  97.39548 102.75665 126.54918 #paste(round())
# 2 andrie2()  43.75678  45.56284  49.20919  47.42042  51.23483  69.10444 #sprintf()
# 3  richie()  79.35606  82.30379  87.29905  84.47743  90.38425 112.22889 #paste(formatC())
# 4  krlmlr() 243.19699 267.74435 304.16202 280.28878 311.41978 534.55904 #scales::percent()

So sprintf emerges as a clear winner when we want to add a percent sign. On the other hand, if we only want to multiply the number and round (go from proportion to percent without "%"), then round() is fastest:

# Unit: milliseconds
#        expr      min        lq      mean    median        uq       max
# 1 andrie1()  4.43576  4.514349  4.583014  4.547911  4.640199  4.939159 # round()
# 2 andrie2() 42.26545 42.462963 43.229595 42.960719 43.642912 47.344517 # sprintf()
# 3  richie() 64.99420 65.872592 67.480730 66.731730 67.950658 96.722691 # formatC()
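The benchmark calls above do not show the function bodies. Judging from the trailing comments in the first timing table, the three base R candidates were presumably something like the following. The bodies are a reconstruction under that assumption, not the benchmarked code, and krlmlr() (scales::percent) is left out to keep the sketch free of dependencies. A small vector is used here so the output is easy to read.

```r
# Hypothetical bodies for the benchmarked wrappers, inferred from the
# comments in the timing table (the original definitions are not shown).
x <- c(0.5, 0.25, 0.125)

andrie1 <- function() paste(round(100 * x, 2), "%", sep = "")                  # paste(round())
andrie2 <- function() sprintf("%1.2f%%", 100 * x)                               # sprintf()
richie  <- function() paste0(formatC(100 * x, format = "f", digits = 2), "%")  # paste(formatC())

print(andrie1())  # "50%"    "25%"    "12.5%"
print(andrie2())  # "50.00%" "25.00%" "12.50%"
print(richie())   # "50.00%" "25.00%" "12.50%"
```

Note that the candidates are not interchangeable: round() drops trailing zeros, while sprintf and formatC always print two decimals, so speed is not the only difference between them.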
女痞
Reply #4 · 2019-01-03 23:38

This function converts the data to percentages of each column's total:

# base: data frame or matrix; columnas/filas: column and row indices to convert
percent.colmns = function(base, columnas = 1:ncol(base), filas = 1:nrow(base)){
    base2 = base
    for(j in columnas){
        suma.c = sum(base[,j])   # column total, taken over all rows
        for(i in filas){
            base2[i,j] = base[i,j]*100/suma.c
        }
    }
    return(base2)
}
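The same column-wise conversion can be written without explicit loops. The sketch below assumes a data frame whose columns are all numeric and converts every row and column (it has no equivalent of the columnas/filas arguments); col_percent is a hypothetical name, not part of the function above.

```r
# Hypothetical vectorized variant: each value becomes its share (in percent)
# of its column total. Assumes all columns of df are numeric.
col_percent <- function(df) {
  df[] <- lapply(df, function(col) 100 * col / sum(col))
  df
}

df <- data.frame(a = c(1, 3), b = c(2, 2))
res <- col_percent(df)
print(res)
#    a  b
# 1 25 50
# 2 75 50
```

Assigning into df[] keeps the data frame's row names and column names while replacing the column contents.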
地球回转人心会变
Reply #5 · 2019-01-03 23:41

Here's my solution for defining a new function (mostly so I can play around with Curry and Compose :-) ):

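# Curry() and Compose() came from the old roxygen package when this was written;
# that package is no longer available, and the functional package provides both.
library(functional)
printpct <- Compose(function(x) x*100, Curry(sprintf,fmt="%1.2f%%"))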
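If you would rather not load a package for this, the same idea can be sketched in base R. The helper compose2 below is a hypothetical stand-in for Compose (it applies its first argument first), and the partial application that Curry performs is written as an ordinary anonymous function.

```r
# Hypothetical base R stand-in for Compose(): apply f, then g.
compose2 <- function(f, g) function(x) g(f(x))

# Scale to percent, then format with two decimals and a percent sign.
printpct <- compose2(
  function(x) x * 100,
  function(x) sprintf("%1.2f%%", x)
)

print(printpct(c(0.25, 0.5, 1)))
# "25.00%" "50.00%" "100.00%"
```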
倾城 Initia
Reply #6 · 2019-01-03 23:42

Check out the scales package. It used to be a part of ggplot2, I think.

library('scales')
percent((1:10) / 100)
#  [1] "1%"  "2%"  "3%"  "4%"  "5%"  "6%"  "7%"  "8%"  "9%"  "10%"

The built-in logic for detecting the precision should work well enough for most cases.

percent((1:10) / 1000)
#  [1] "0.1%" "0.2%" "0.3%" "0.4%" "0.5%" "0.6%" "0.7%" "0.8%" "0.9%" "1.0%"
percent((1:10) / 100000)
#  [1] "0.001%" "0.002%" "0.003%" "0.004%" "0.005%" "0.006%" "0.007%" "0.008%"
#  [9] "0.009%" "0.010%"
percent(sqrt(seq(0, 1, by=0.1)))
#  [1] "0%"   "32%"  "45%"  "55%"  "63%"  "71%"  "77%"  "84%"  "89%"  "95%" 
# [11] "100%"
percent(seq(0, 0.1, by=0.01) ** 2)
#  [1] "0.00%" "0.01%" "0.04%" "0.09%" "0.16%" "0.25%" "0.36%" "0.49%" "0.64%"
# [10] "0.81%" "1.00%"
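If the automatic precision is not what you want, more recent releases of scales let you set it explicitly through an accuracy argument (the value to round to, on the percent scale). This is a small sketch that assumes such a version of scales is installed; the call is guarded so the snippet does nothing otherwise, and pct is just an illustrative name.

```r
# Assumes a version of scales whose percent() accepts `accuracy`.
if (requireNamespace("scales", quietly = TRUE)) {
  # accuracy = 0.1 rounds to one decimal place on the percent scale.
  pct <- scales::percent(c(0.1234, 0.5), accuracy = 0.1)
  print(pct)
  # should print: "12.3%" "50.0%"
}
```

This addresses the concern raised in the question about having no control over rounding.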
看我几分像从前
Reply #7 · 2019-01-03 23:46

Check out the percent function from the formattable package:

library(formattable)
x <- c(0.23, 0.95, 0.3)
percent(x)
[1] 23.00% 95.00% 30.00%