
Numeric to Alphabetic Lettering Function in R [duplicate]

Posted 2019-07-17 08:40

Question:

This question already has an answer here:

  • is there a way to extend LETTERS past 26 characters e.g., AA, AB, AC…? 9 answers

I have written a function which works on the integers from 1 to 702 for converting a number to a letter in a very specific way. Here are some examples of how I would like the lettering function to work:

  • 1 -> A,
  • 2 -> B,
  • 27 -> AA,
  • 29 -> AC,
  • and so on.

We use this function for "numbering" / "lettering" our appendices in reports. I'm looking to make it more general, such that it can handle positive integers of any size. If I could easily convert the original number to base 26, this would be easier, but I do not see an easy way to do that in R.

appendix_lettering <- function(number) {
  if (number %in% 1:26) {
    return(LETTERS[[number]])
  } else if (number %in% 27:702) {
    first_digit <- (floor((number - 1) / 26))
    second_digit <- ((number - 1) %% 26) + 1
    first_letter <- LETTERS[[first_digit]]
    second_letter <- LETTERS[[second_digit]]
    return(paste0(first_letter, second_letter))
  }
}

Does anyone have suggestions for how I can most easily improve this function to handle any positive integers (or at least many more)?

Answer 1:

Here are some alternatives:

1) encode Let b be the base; here b = 26. There are b^k appendices with exactly k letters, so the appendix numbered x has n letters, where n is the smallest integer for which b + b^2 + ... + b^n >= x. The left-hand side is a geometric series with the closed form b(b^n - 1)/(b - 1). Substituting that expression and solving for n gives n = ceiling(log_b(x(b - 1) + b)) - 1, which is the formula for n in the code below. We then subtract from number every term b^k with 0 <= k < n (the b^0 = 1 term turns the result into a zero-based offset) and pass it to an APL-like encode function, versions of which can be found on the web. encode is not part of base R, so it must be defined before app2 is run. It does the base conversion and returns digits, a vector of n digits in the given base. Finally, add 1 to each digit and use the result as an index into LETTERS.

app2 <- function(number, base = 26) {
    n <- ceiling(log((1/(1 - base) - 1 - number) * (1 - base), base = base)) - 1
    digits <- encode(number - sum(base^seq(0, n-1)), rep(base, n))
    paste(LETTERS[digits + 1], collapse = "")
}

sapply(1:29, app2) # test

giving:

[1] "A"  "B"  "C"  "D"  "E"  "F"  "G"  "H"  "I"  "J"  "K"  "L"  "M"  "N"  "O" 
[16] "P"  "Q"  "R"  "S"  "T"  "U"  "V"  "W"  "X"  "Y"  "Z"  "AA" "AB" "AC"

Another test to try is:

sapply(1:60, app2, base = 3)
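
Since app2 depends on an encode function that base R does not provide, here is a minimal sketch of what such a helper could look like. It is an assumption about the behaviour app2 needs (a single non-negative number, a radix vector, most significant digit first), not the specific implementation the answer refers to.

```r
# Minimal stand-in for an APL-style encode (assumed behaviour):
# represent one non-negative integer `number` in the mixed radix `base`,
# returning one digit per radix position, most significant first.
encode <- function(number, base) {
  digits <- numeric(length(base))
  for (i in rev(seq_along(base))) {
    digits[i] <- number %% base[i]   # digit at position i
    number <- number %/% base[i]     # carry the remainder to the left
  }
  digits
}

encode(27, c(26, 26))  # 1 1, i.e. 27 = 1 * 26 + 1
encode(5, rep(2, 4))   # 0 1 0 1, i.e. 5 in four binary digits
```

With this definition in scope, app2(28) subtracts 1 + 26 = 27, encodes the remaining 1 as the digits 0 1, and looks up "AB".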

2) recursive solution Here is an alternative that works recursively. It computes the last letter of the Appendix number and then removes it and recursively computes the portion to its left.

app2r <- function(number, base = 26, suffix = "") {
   number1 <- number - 1
   last_digit <- number1 %% base
   rest <- number1 %/% base
   suffix <- paste0(LETTERS[last_digit + 1], suffix)
   if (rest > 0) Recall(rest, base, suffix) else suffix
}

# tests
identical(sapply(1:29, app2r), sapply(1:29, app2))
## [1] TRUE
identical(sapply(1:60, app2r, base = 3), sapply(1:60, app2, base = 3))
## [1] TRUE
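
To see what the recursion does at each level, the following sketch records the (last_digit, rest) pair computed at every step for a single input. trace_steps is a hypothetical helper written for illustration only; it mirrors the arithmetic of app2r but uses a loop so the intermediate values can be collected.

```r
# Hypothetical helper: collect the last_digit / rest pairs that the
# recursive solution would compute, one row per recursion level.
trace_steps <- function(number, base = 26) {
  steps <- list()
  repeat {
    number1 <- number - 1
    steps[[length(steps) + 1]] <- c(last_digit = number1 %% base,
                                    rest = number1 %/% base)
    number <- number1 %/% base
    if (number == 0) break
  }
  do.call(rbind, steps)
}

trace_steps(703)
# rows are (0, 27), (0, 1), (0, 0): three digits of 0, i.e. "AAA"
```

The recursion stops as soon as rest reaches 0, which is why 703 produces exactly three letters.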


Answer 2:

This approach works nicely, even though it has little to do with your original function. It is not perfect, but it can easily be generalized to any number of letters. This version handles any number up to 26+26^2+26^3+26^4+26^5+26^6 = 321272406, i.e. up to 6 letters.

First, we define a function that determines the number of letters and adjusts the number, to remove the combinations with a lower number of letters.

Consider for example the number 702. It is "ZZ" in letters, but there are only 26^2 = 676 possible combinations with two letters - hence, we have to subtract the 26 single letters beforehand for the "adjusted number". Now, if the adjusted number is, e.g. 1 and we have 5 letters, the resulting word is "AAAAA", for 2 it is "AAAAB" and so on.

Here are the functions:

checknum <- function(num) {
  adnum <- num   # adjusted number
  n_lett <- 1    # number of letters
  # Compare against 26^k directly instead of using log(), which can be
  # off by a rounding error when adnum is an exact power of 26.
  if (adnum > 26)   { adnum <- adnum - 26;   n_lett <- 2 }
  if (adnum > 26^2) { adnum <- adnum - 26^2; n_lett <- 3 }
  if (adnum > 26^3) { adnum <- adnum - 26^3; n_lett <- 4 }
  if (adnum > 26^4) { adnum <- adnum - 26^4; n_lett <- 5 }
  if (adnum > 26^5) { adnum <- adnum - 26^5; n_lett <- 6 }
  return(list(adnum = adnum, n_lett = n_lett))
} # this function can be adjusted for more letters or maybe improved in its form

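applett2 <- function(num) {
  chk <- checknum(num)      # call once and reuse both results
  n_lett <- chk$n_lett
  adnum <- chk$adnum - 1    # zero-based offset among the n_lett-letter labels
  out <- rep(1, n_lett)
  for (i in 1:n_lett) {
    # i-th digit (most significant first), shifted to 1..26 for LETTERS
    out[i] <- (floor(adnum / 26^(n_lett - i)) %% 26) + 1
  }
  return(paste(LETTERS[out], collapse = ""))
} # main function that creates the letters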

applett2(26+26^2+26^3+26^4)
# "ZZZZ"
applett2(1234567)
# "BRFGI"
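
The chain of if statements limits checknum to six letters. As a rough sketch of how it might be generalized, the same subtraction can be done in a loop. checknum_loop and its base argument are hypothetical names introduced here, not part of the answer.

```r
# Hypothetical loop-based variant of checknum: keep removing the block of
# base^n_lett labels that have n_lett letters until the number fits.
checknum_loop <- function(num, base = 26) {
  adnum <- num   # adjusted number
  n_lett <- 1    # number of letters
  while (adnum > base^n_lett) {
    adnum <- adnum - base^n_lett
    n_lett <- n_lett + 1
  }
  list(adnum = adnum, n_lett = n_lett)
}

checknum_loop(702)  # adnum = 676, n_lett = 2 (the last two-letter label)
checknum_loop(703)  # adnum = 1,   n_lett = 3 (the first three-letter label)
```

The returned list has the same shape as the one from checknum, so it could be used in its place inside applett2.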


Answer 3:

It is possible to write a short function to do the conversion in R using the modulo operator (%%) and division. R being 1-indexed makes it a bit trickier, as does the fact that the representation is not a true base-26 system, since there is no zero digit. If we had A = 0 instead, 26 would be BA and there would be no AA, AB, etc.
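
To make the missing-zero point concrete, here is a small sketch of an ordinary base-26 conversion with A standing for 0. naive_base26 is a hypothetical helper used only to show why plain base conversion does not produce the desired labels.

```r
# Hypothetical helper: ordinary base-26 conversion where A = 0, B = 1, ...
naive_base26 <- function(number) {
  digits <- integer(0)
  repeat {
    digits <- c(number %% 26, digits)  # prepend the current digit
    number <- number %/% 26
    if (number == 0) break
  }
  paste(LETTERS[digits + 1], collapse = "")
}

sapply(c(0, 25, 26, 27), naive_base26)
# "A" "Z" "BA" "BB"
```

After Z the sequence jumps straight to BA, so AA through AZ never appear. That gap is what the extra subtraction of 1 in the conversion has to compensate for.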

The following function solves this and matches your definition.

base26_conversion <- function(number){
  result <- "";
  base <- 26
  number <- number - 1
  repeat{
    result <- paste0(LETTERS[(number) %% base + 1], result)
    number <- floor(number / base) - 1
    if (!(number > -1)){
      break
    }
  }
  return(result)
}

The modulo operator extracts the current "digit", and adding 1 corrects the index to get the corresponding letter. The division shifts the number one digit to the right in base 26. The subtraction of 1 makes sure that "0" is excluded.

The function gives matching strings for the entire input range of your function and should generalize to any positive integer.
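
One way to check a conversion like this beyond the original 1 to 702 range is a round trip through an inverse function. The sketch below is an illustration under that assumption; letters_to_number is a hypothetical helper, not part of the answer.

```r
# Hypothetical inverse: map "A", "Z", "AA", ... back to 1, 26, 27, ...
letters_to_number <- function(label, base = 26) {
  digits <- match(strsplit(label, "")[[1]], LETTERS)  # A = 1, ..., Z = 26
  # weight the leftmost letter with the highest power of the base
  sum(digits * base^rev(seq_along(digits) - 1))
}

letters_to_number("AC")   # 29
letters_to_number("ZZ")   # 702
letters_to_number("AAA")  # 703
```

Applying a lettering function to a range of integers and then letters_to_number to each result should return the original integers if the conversion is consistent.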



Answer 4:

Here's a possible solution:

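dec2abc <- function(number) {
    # last letter first; number then becomes floor((number - 1) / 26)
    digit <- LETTERS[(number - 1) %% 26 + 1]
    number <- (number - 1 - (number - 1) %% 26) / 26
    while (number > 26) {
        digit <- paste0(LETTERS[(number - 1) %% 26 + 1], digit)
        number <- (number - 1 - (number - 1) %% 26) / 26
    }
    # number is now between 0 and 26; LETTERS[0] is empty, so nothing is
    # prepended when the input was a single-letter label
    digit <- paste0(LETTERS[number], digit)
    return(digit)
}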

This works for any positive integer in principle. In practice, I think very large inputs will eventually run into the limits of what R's numeric type can represent exactly.