Consider the following script:
list_of_numbers <- as.numeric()
for (i in 1001999498:1002000501) {
  list_of_numbers <- c(list_of_numbers, i)
}
write(list_of_numbers, file = "./list_of_numbers", ncolumns = 1)
The file that is produced looks like this:
[user@pc ~]$ cat list_of_numbers
1001999498
1001999499
1.002e+09
...
1.002e+09
1.002e+09
1.002e+09
1002000501
I found a couple more ranges where R does not format the numbers consistently.
Now I have the following questions:
Is this a bug or is there an actual reason for this behavior?
Why just in certain ranges, why not every number above x?
I know I can work around it like this:
options(scipen = 1000)
But is there a more elegant way than setting a global option, one that does not involve converting the vector to a data frame and changing the format?
It's not a bug: R chooses the shortest representation.
More precisely, in ?options one can read:
"fixed notation will be preferred unless it is more than scipen digits wider."
So when scipen is 0 (the default), the shortest notation is preferred, and fixed notation wins when both have the same width.
Note that you can get the scientific notation of a number x with format(x, scientific = TRUE).
In your case:
- 1001999499 is 10 characters long, whereas its scientific notation 1.001999e+09 is longer (12 characters), so the decimal notation is kept.
- 1001999500: the scientific notation is 1.002e+09 (9 characters), which is shorter.
- ... (the scientific notation stays equal to 1.002e+09, hence shorter)
- 1002000501: 1.002001e+09 is longer (12 characters), so the decimal notation is kept.
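The comparison above can be reproduced with a short sketch. It assumes the default options (digits = 7, scipen = 0), and the variable names are only illustrative. Each value is formatted on its own, because format() applied to a whole vector picks one common format for all elements, whereas write() formats each element individually.

```r
# Boundary values taken from the question's output.
x <- c(1001999499, 1001999500, 1002000501)

# Format every element separately (assumes default digits = 7, scipen = 0).
fixed   <- vapply(x, format, character(1), scientific = FALSE)
sci     <- vapply(x, format, character(1), scientific = TRUE)
default <- vapply(x, format, character(1))

# Side-by-side view: the default is whichever notation is narrower,
# with fixed notation winning ties.
comparison <- data.frame(
  fixed, sci, default,
  fixed_width = nchar(fixed),
  sci_width   = nchar(sci)
)
print(comparison)
```

The default column should match the fixed column whenever fixed_width is not larger than sci_width, and the sci column otherwise.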
You may ask why 1001999500 is formatted as 1.002e+09 and not as 1.0019995e+09. It's simply because another option, named digits, controls the number of significant digits, and its default value is 7. Since 1.0019995 has 8 significant digits, it is rounded to 7 digits, giving 1.002000, and the trailing zeros are dropped, which leaves 1.002.
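To see the effect of digits in isolation, here is a minimal sketch that passes digits directly to format() instead of changing the global option. The comments describe what I would expect with scipen at its default of 0.

```r
x <- 1001999500

# With 7 significant digits the mantissa rounds to 1.002, so the
# scientific form is short.
sci7 <- format(x, digits = 7, scientific = TRUE)

# With 8 significant digits nothing is rounded away, so the scientific
# form needs the full mantissa 1.0019995.
sci8 <- format(x, digits = 8, scientific = TRUE)

# Letting R choose: at 8 digits the scientific form is wider than the
# 10-character fixed form, so fixed notation is used.
auto7 <- format(x, digits = 7)
auto8 <- format(x, digits = 8)

print(c(sci7 = sci7, sci8 = sci8, auto7 = auto7, auto8 = auto8))
```

So raising digits would also avoid the scientific notation for these particular values, but it is a less direct fix than asking for fixed notation explicitly.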
The simplest way to ensure that decimal notation is kept without changing global options is probably to use format:
write(format(list_of_numbers, scientific = FALSE, trim = TRUE),
file = "./list_of_numbers")
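If you prefer something other than format, a few alternatives are sketched below. They assume the values are whole numbers (any fractional part would be rounded away by the first two), and the helper name write_fixed and the temporary output file are made up for this illustration.

```r
x <- as.numeric(1001999498:1002000501)
out <- tempfile()  # illustrative output path, not the one from the question

# Alternative A: sprintf() with zero decimal places.
writeLines(sprintf("%.0f", x), out)
lines_a <- readLines(out)

# Alternative B: formatC() in fixed notation with zero decimal places.
writeLines(formatC(x, format = "f", digits = 0), out)
lines_b <- readLines(out)

# Alternative C: change scipen only for the duration of one call and
# restore the previous value afterwards (hypothetical helper).
write_fixed <- function(x, file) {
  old <- options(scipen = 1000)
  on.exit(options(old), add = TRUE)
  write(x, file = file, ncolumns = 1)
}
write_fixed(x, out)
lines_c <- readLines(out)

head(lines_c)
```

Alternative C still touches the global option, but only temporarily, so the rest of the session keeps its usual formatting.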
Side note: you don't need a loop to generate list_of_numbers (which, by the way, is not a list but a vector). Simply use:
list_of_numbers <- as.numeric(1001999498:1002000501)