Changing Million/Billion abbreviations into actual numbers

Posted 2020-01-29 18:35

Question:

As the title suggests, I'm looking for a way to convert abbreviated shorthand values stored as character text into numeric data. For example, I'd like to make these changes within my data frame:

84.06M -> 84,060,000
30.12B -> 30,120,000,000
9.78B -> 9,780,000,000
251.29M -> 251,290,000

Here's a sample of the data frame I'm working with:

            Index Market Cap    Income   Sales Book/sh
ZX              -     84.06M    -1.50M 359.50M    7.42
ZTS       S&P 500     30.13B   878.00M   5.02B    3.49
ZTR             -          -         -       -       -
ZTO             -      9.78B   288.30M   1.47B    4.28
ZPIN            -      1.02B    27.40M 285.20M    4.27
ZOES            -    251.29M    -0.20M 294.10M    6.79
ZNH             -     10.92B   757.40M  17.26B   33.23
ZF              -          -         -       -       -
ZEN             -      2.78B  -106.70M 363.60M    3.09
ZBK             -      6.06B         -   2.46B   34.65
ZBH       S&P 500     22.76B   712.00M   7.78B   50.94

Does anyone have suggestions? I was thinking of using gsub in base R...

Answer 1:

You can try this

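num <- c("84.06M", "30.12B", "251.29M")
num <- gsub('B', 'e9', num)  # billions: "30.12B" -> "30.12e9"
num <- gsub('M', 'e6', num)  # millions: "84.06M" -> "84.06e6"
# trim = TRUE drops the padding that format() adds to equalize widths
format(as.numeric(num), scientific = FALSE, big.mark = ",", trim = TRUE)

"84,060,000" "30,120,000,000" "251,290,000"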
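The sample data in the question also contains "-" placeholders, negative amounts, and plain numbers without a suffix, which the substitution above does not address directly. A minimal sketch of one way to handle them, assuming "-" means a missing value; the helper name abbrev_to_numeric is hypothetical and not part of the original answer:

```r
# Hypothetical helper: convert strings such as "84.06M" or "5.02B" to numbers.
# Assumption: a lone "-" marks a missing value and should become NA.
abbrev_to_numeric <- function(x) {
  x <- trimws(as.character(x))
  x[x %in% "-"] <- NA_character_   # placeholder -> NA (before any numeric parsing)
  x <- sub("B$", "e9", x)          # trailing B -> billions
  x <- sub("M$", "e6", x)          # trailing M -> millions
  as.numeric(x)                    # "-1.50e6" parses as a negative number
}

income <- c("-1.50M", "878.00M", "-", "7.42")
result <- abbrev_to_numeric(income)
print(result)
```

Anchoring the patterns with $ keeps the substitution limited to a trailing suffix, and returning a numeric vector (rather than formatted text) leaves the column usable for arithmetic; format() can still be applied later for display.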


Answer 2:

Try this:

income <- c("84.06M", "30.12B", "251.29M")

toInteger <- function(income){
  amt <- as.numeric(gsub("[A-Z]", "", income))
  multiplier <- substring(income, nchar(income))
  multiplier <- dplyr::case_when(multiplier == "M" ~ 1e6,
                                 multiplier == "B" ~ 1e9,
                                 TRUE ~ 1) # you can add on other conditions for more suffixes
  amt*multiplier
}

> toInteger(income)
[1] 8.4060e+07 3.0120e+10 2.5129e+08
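The same suffix-to-multiplier idea can also be written in base R with a named lookup vector, which may be easier to extend than a chain of conditions. This is only a sketch: the names multipliers and to_number are hypothetical, and the K and T suffixes are assumptions that do not appear in the question's data.

```r
# Hypothetical lookup table; K and T are assumed extras, M and B come from the question.
multipliers <- c(K = 1e3, M = 1e6, B = 1e9, T = 1e12)

to_number <- function(x) {
  suffix <- substring(x, nchar(x))        # last character of each string
  mult <- unname(multipliers[suffix])     # NA when the suffix is not in the table
  has_suffix <- !is.na(mult)
  # Drop the last character only when it is a known suffix
  amt <- as.numeric(ifelse(has_suffix, substring(x, 1, nchar(x) - 1), x))
  amt * ifelse(has_suffix, mult, 1)
}

converted <- to_number(c("84.06M", "30.12B", "1.5K", "7.42"))
print(format(converted, scientific = FALSE, big.mark = ",", trim = TRUE))
```

Adding another suffix then only requires a new entry in the lookup vector.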


Answer 3:

You can change all your columns like this:

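test = c("30.13B","84.06M","84.06B","84.06M")
# Split on either suffix; c("B","M") would be recycled element by element
# and only works when the suffixes happen to alternate B, M, B, M
values = sapply(strsplit(test, "[BM]"), function(x) as.numeric(x))
# Last character of each string is the suffix
amount = sapply(strsplit(test, ""), function(x) x[length(x)])
# ifelse() is vectorized, so no loop over the indices is needed
values2 = ifelse(amount == "B", values * 1e9, values * 1e6)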

Just replace test with the data frame column you want to convert (for example, the Income column of your data frame), then assign values2 back to that column.
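To convert several columns at once, the same split-and-multiply logic can be wrapped in a function and applied with lapply. The sketch below makes some assumptions: the data frame df is a small made-up excerpt in the shape of the question's data, convert_column is a hypothetical name, and "-" entries are treated as missing values.

```r
# Made-up excerpt shaped like the question's data (not the real data set)
df <- data.frame(
  Income = c("-1.50M", "878.00M", "-"),
  Sales  = c("359.50M", "5.02B", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper following the numeric-part / suffix split shown above
convert_column <- function(x) {
  amount <- substring(x, nchar(x))                             # suffix character
  values <- suppressWarnings(as.numeric(sub("[BM]$", "", x)))  # "-" becomes NA
  ifelse(amount == "B", values * 1e9,
         ifelse(amount == "M", values * 1e6, values))          # no suffix: unchanged
}

cols <- c("Income", "Sales")
df[cols] <- lapply(df[cols], convert_column)
str(df)
```

Assigning through df[cols] keeps the other columns (such as Index) untouched while replacing the selected ones with numeric vectors.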