Precise value of double in decimal and floating?

Posted 2019-09-08 18:04

Question:

DBL_MAX is

179769313486231570814527423731704356798070567525844996598917476803157260
780028538760589558632766878171540458953514382464234321326889464182768467
546703537516986049910576551282076245490090389328944075868508455133942304
583236903222948165808559332123348274797826204144723168738177180919299881
250404026184124858368.000000

But if I do:

double a=1234567890123456789.0;
printf("%f",a);

1234567890123456768.000000

Here the precision is 17 digits.

double a=0.1234567890123456789;
printf("%.20f",a);

0.1234567890123456773

Here too, the precision after the decimal point is 17 digits:

double a=1234567890.1234567890123456789;
printf("%.20f",a);

That will generate:

1234567890.12345671653747558594

Now the precision is 10 digits before the decimal point plus 7 after it, which makes 17.

Does that mean that I have only 17 digits to get a precise value for double? If yes, why does DBL_MAX have more than 300 digits?

Answer 1:

A typical double has about 15 to 16 significant decimal digits, measured as the relative gap to the next representable double.

Every finite double has an exact value, as with the OP's DBL_MAX = 1797693134862315708.... Floating-point numbers are distributed logarithmically, so there are typically as many different double values between 0.5 and 1.0 as there are between 2 and 4.

Because of this distribution, the second-largest double matches DBL_MAX only in the first 16 places out of its 300+ digits. See the output below.


"Does it mean that I have only 17 digits to get a precise value for double? If yes, what is the use of all those digits in DBL_MAX, more than 300 of them?"

A double, as binary64, can represent about 2**64 different numbers exactly. 0.1234567890123456789 is not one of them. The closest double is 0.1234567890123456773... Notice that it matches to at least 15 (DBL_DIG) digits.

Usually this means the exact value of a double is not as important as its relative difference to the next double.

Being able to reach 1e300 or 1e-300 is a matter of range, not precision.


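#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void) {
  printf("%.*e\n", DBL_DECIMAL_DIG + 2, DBL_MAX);                // largest finite double
  printf("%.*e\n", DBL_DECIMAL_DIG + 2, nextafter(DBL_MAX, 0));  // second largest
  printf("%.*e\n", DBL_DECIMAL_DIG + 2, nextafter(0, 1));        // smallest positive
}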

Output

1.7976931348623157081e+308
1.7976931348623155086e+308
4.9406564584124654418e-324

Consider using printf("%.*e", DBL_DECIMAL_DIG-1, a); rather than printf("%.20f",a); to get a clearer view of how many decimal digits of a double are significant. If you are ready for hexadecimal notation, try printf("%a",a);



Answer 2:

The commonest system for representing C doubles is IEEE 754 64-bit binary floating point. It is a base 2 system. That means the exactly representable numbers have relatively short binary representations - fitting in 64 bits - but not necessarily short decimal representations.

The exact value of the largest finite number is 2^1024 - 2^971 or 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368

That is not very useful information, because of the granularity of the numbers. Consecutive numbers differ by about one part in 10^16 (between one part in 2^52 and one part in 2^53). For example, the second-largest representable number is 179769313486231550856124328384506240234343437157459335924404872448581845754556114388470639943126220321960804027157371570809852884964511743044087662767600909594331927728237078876188760579532563768698654064825262115771015791463983014857704008123419459386245141723703148097529108423358883457665451722744025579520

You don't need all those digits to distinguish them. If you know you have one of those two numbers, seeing the 17-digit prefixes 17976931348623157 and 17976931348623155 is enough to tell which.

Library functions that convert double to decimal strings have a problem. How many digits should you print? Here are some of the options:

  • Print the exact value. Almost always the wrong thing to do, because you will print a lot of digits that do not convey any useful information.
  • Print enough digits to distinguish which double you started with. That is the choice made, for example, for Java's Double.toString. This allows exact recovery of the double from the printed value.
  • Print only a fixed number of digits. Makes for tidy reports.