Consider the following script:
list_of_numbers <- as.numeric()
for(i in 1001999498:1002000501){
list_of_numbers <- c(list_of_numbers, i)
}
write(list_of_numbers, file = "./list_of_numbers", ncolumns = 1)
The file that is produced looks like this:
[user@pc ~]$ cat list_of_numbers
1001999498
1001999499
1.002e+09
...
1.002e+09
1.002e+09
1.002e+09
1002000501
I found a couple more ranges where R does not print the number format consistently.
Now I have the following questions:
Is this a bug or is there an actual reason for this behavior? Why just in certain ranges, why not every number above x?
I know I can work around it like this:
options(scipen = 1000)
But are there more elegant ways than setting global options, without converting the data to a data frame and changing the format?
It's not a bug: R chooses the shortest representation.
More precisely, the entry for scipen in ?options explains that fixed notation is preferred unless it is more than scipen digits wider than scientific notation. So when scipen is 0 (the default), the shortest notation is preferred.
Note that you can get the scientific notation of a number x with format(x, scientific = TRUE).
In your case:
1001999499 is 10 characters long, whereas its scientific notation 1.001999e+09 is longer (12 characters), so the decimal notation is kept.
1001999500: scientific notation is 1.002e+09, which is shorter.
... (scientific notation is still 1.002e+09, hence shorter)
1002000501: 1.002001e+09 is longer.
You may ask: how come 1001999500 is formatted as 1.002e+09 and not as 1.0019995e+09? It's simply because there is also an option that controls the number of significant digits. It is named digits and its default value is 7. Since 1.0019995 has 8 significant digits, it is rounded up to 1.002.
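The width comparison and the effect of digits can be checked directly. This is a minimal sketch, assuming default options (scipen = 0, digits = 7); the variable names fixed_repr, sci_repr and sci_8_digits are made up for illustration.

```r
# Values discussed above, stored as doubles (as in the question's loop).
x <- c(1001999499, 1001999500, 1002000501)

# Format one element at a time: format() on a whole vector would pick a
# single common layout for all elements.
fixed_repr <- vapply(x, function(v) format(v, scientific = FALSE), character(1))
sci_repr   <- vapply(x, function(v) format(v, scientific = TRUE),  character(1))

# Show both notations next to their widths. With scipen = 0, fixed notation
# is kept unless it is wider than scientific notation.
print(cbind(fixed = fixed_repr, fixed_width = nchar(fixed_repr),
            sci   = sci_repr,   sci_width   = nchar(sci_repr)))

# Asking for 8 significant digits instead of the default 7 keeps the final 5.
sci_8_digits <- format(1001999500, scientific = TRUE, digits = 8)
print(sci_8_digits)
```

The middle value is the interesting one: at 7 significant digits its scientific form collapses to a short string, which is why it wins over the 10-character fixed form.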
The simplest way to ensure that decimal notation is kept without changing global options is probably to use format with scientific = FALSE.
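One way this could look, as a sketch rather than the only option: convert the numbers to fixed-notation strings first, then write the strings. The output path below is a temporary file chosen for illustration (the question writes to "./list_of_numbers"), and trim = TRUE is an extra choice made here to avoid padding.

```r
# Values from the question's range, stored as doubles.
x <- c(1001999499, 1001999500, 1002000501)

# format() returns character strings; scientific = FALSE forces fixed
# notation and trim = TRUE suppresses padding to a common width.
fixed <- format(x, scientific = FALSE, trim = TRUE)

# Hypothetical output location, used only for this example.
out_file <- tempfile("list_of_numbers")

# write() passes the strings through unchanged, one per line.
write(fixed, file = out_file, ncolumns = 1)

print(readLines(out_file))
```

Because the conversion happens in the call itself, no global option is touched and other output in the session keeps its default formatting.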
Side note: you didn't need a loop to generate your list_of_numbers (which, by the way, is not a list but a vector). Simply use the colon operator: 1001999498:1002000501.
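A short sketch of the loop-free version, using the bounds from the question. Note one difference from the loop: the colon operator yields an integer vector here, whereas appending to as.numeric() in the loop produces doubles.

```r
# Build the whole sequence in one step instead of growing a vector in a loop.
list_of_numbers <- 1001999498:1002000501

# It is an atomic vector, not a list.
print(is.list(list_of_numbers))
print(typeof(list_of_numbers))
print(length(list_of_numbers))

# If doubles are needed, as produced by the original loop, convert explicitly.
list_of_numbers_dbl <- as.numeric(list_of_numbers)
```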