Function write() inconsistent with number notation

Posted 2019-07-18 03:52

Question:

Consider the following script:

list_of_numbers <- as.numeric()
for(i in 1001999498:1002000501){
  list_of_numbers <- c(list_of_numbers, i)
}
write(list_of_numbers, file = "./list_of_numbers", ncolumns = 1)

The file that is produced looks like this:

[user@pc ~]$ cat list_of_numbers
1001999498
1001999499
1.002e+09
...
1.002e+09
1.002e+09
1.002e+09
1002000501

I found a couple more ranges where R does not print the number format consistently.

Now I have the following questions:

Is this a bug or is there an actual reason for this behavior? Why just in certain ranges, why not every number above x?

I know I can work around this like so:

options(scipen = 1000)

But are there more elegant ways than setting global options, ideally without converting the data to a data frame and changing the format?

Answer 1:

It's not a bug: R chooses the shortest representation.

More precisely, in ?options one can read:

fixed notation will be preferred unless it is more than scipen digits wider.

So when scipen is 0 (the default), the shortest notation is preferred.

Note that you can get the scientific notation of a number x with format(x, scientific = TRUE).
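As a small illustration of the scipen rule, the scientific argument of format() also accepts an integer, which acts as a penalty in the same way as options("scipen") but only for that one call. The value 3 below is an arbitrary choice for the sketch, and digits = 7 is passed explicitly so the result does not depend on the session's options.

```r
x <- 1001999500

# Penalty 0 (the default scipen): the shorter notation wins.
default_choice <- format(x, scientific = 0, digits = 7)
print(default_choice)    # "1.002e+09"

# Penalty 3: fixed notation is kept unless it is more than 3 characters wider.
# Here fixed is 10 characters and scientific is 9, so fixed is kept.
penalised_choice <- format(x, scientific = 3, digits = 7)
print(penalised_choice)  # "1001999500"
```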

In your case:

  • 1001999499 is 10 characters long whereas its scientific notation 1.001999e+09 is longer (12 characters), so the decimal notation is kept.
  • 1001999500: scientific notation is 1.002e+09, which is shorter.
  • ..................... (scientific notation stays equal to 1.002e+09, hence shorter)
  • 1002000501: 1.002001e+09 is longer.
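The comparison in the list above can be reproduced with a short sketch. The variable names are only illustrative, and 1002000500 is added as one more value from the range to show the upper edge. Each number is formatted on its own, since format() applied to a whole vector would pick a single common notation.

```r
x <- c(1001999499, 1001999500, 1002000500, 1002000501)

# Both notations for every value, one element at a time.
fixed <- vapply(x, format, character(1), scientific = FALSE, digits = 7)
sci   <- vapply(x, format, character(1), scientific = TRUE,  digits = 7)

# With scipen = 0, fixed notation is kept unless it is wider.
chosen <- ifelse(nchar(fixed) <= nchar(sci), fixed, sci)

print(data.frame(fixed, sci,
                 fixed_width = nchar(fixed),
                 sci_width = nchar(sci),
                 chosen))
```

The chosen column should match what write() puts in the file for these values when scipen and digits are at their defaults.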

You may ask why 1001999500 is formatted as 1.002e+09 and not as 1.0019995e+09. That is because another option, digits, controls the number of significant digits, and its default value is 7. Since 1.0019995 has 8 significant digits, it is rounded to 7 digits (1.002000) and the trailing zeros are dropped, giving 1.002.
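To see the effect of digits in isolation, here is a minimal sketch that passes digits directly to format() instead of relying on the global option. The variable names are only for illustration.

```r
x <- 1001999500

# 7 significant digits: the mantissa rounds to 1.002000, trailing zeros are dropped.
sci_7 <- format(x, scientific = TRUE, digits = 7)
print(sci_7)  # "1.002e+09"

# 8 significant digits: the full mantissa survives, so the string gets longer.
sci_8 <- format(x, scientific = TRUE, digits = 8)
print(sci_8)  # "1.0019995e+09"

# With 8 digits, scientific notation (13 characters) is wider than fixed (10),
# so the default choice falls back to fixed notation.
auto_8 <- format(x, scientific = 0, digits = 8)
print(auto_8)
```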

The simplest way to ensure that decimal notation is kept without changing global options is probably to use format:

write(format(list_of_numbers, scientific = FALSE, trim = TRUE), 
      file = "./list_of_numbers")
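Two other options that avoid a lasting change to global options, sketched here under the assumption that the values are whole numbers: formatting with sprintf("%.0f", ...), or wrapping write() in a helper that sets scipen temporarily and restores it on exit. The helper name write_fixed and the temporary file paths are hypothetical, used only for this example.

```r
x <- as.numeric(1001999498:1002000501)

# Option 1: build the strings explicitly; "%.0f" prints no decimals and no exponent.
out_sprintf <- tempfile()
writeLines(sprintf("%.0f", x), out_sprintf)

# Option 2 (hypothetical helper): change scipen only for the duration of the call.
write_fixed <- function(x, file) {
  old <- options(scipen = 1000)
  on.exit(options(old), add = TRUE)  # restore the previous setting, even on error
  write(x, file = file, ncolumns = 1)
}
out_write <- tempfile()
write_fixed(x, out_write)

print(readLines(out_sprintf)[1:4])
print(readLines(out_write)[1:4])
```

Note that "%.0f" rounds non-integer values, so it is only a fit when the data really are whole numbers.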

Side note: you didn't need a loop to generate your list_of_numbers (which by the way is not a list but a vector). Simply use:

list_of_numbers <- as.numeric(1001999498:1002000501)
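A quick check, on a shortened range chosen only to keep the example small, that the loop and the vectorised call build the same object, and that this object is a numeric vector rather than a list:

```r
# Loop version, as in the question (short range for illustration).
looped <- as.numeric()
for (i in 1001999498:1001999502) {
  looped <- c(looped, i)
}

# Vectorised version.
vectorised <- as.numeric(1001999498:1001999502)

print(identical(looped, vectorised))  # TRUE
print(is.list(vectorised))            # FALSE
print(is.numeric(vectorised))         # TRUE
```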