Difference between Double.MIN_NORMAL and Double.MI

Posted 2019-01-22 17:21

Question:

May I know what the difference is between Double.MIN_NORMAL (introduced in 1.6) and Double.MIN_VALUE?

JavaDoc of Double.MIN_NORMAL:

A constant holding the smallest positive normal value of type double, 2^-1022

JavaDoc of Double.MIN_VALUE:

A constant holding the smallest positive nonzero value of type double, 2^-1074

Answer 1:

The answer can be found in the IEEE specification of floating point representation:

For the single format, the difference between a normal number and a subnormal number is that the leading bit of the significand (the bit to left of the binary point) of a normal number is 1, whereas the leading bit of the significand of a subnormal number is 0. Single-format subnormal numbers were called single-format denormalized numbers in IEEE Standard 754.

In other words, Double.MIN_NORMAL is the smallest positive number you can represent with a 1 in front of the binary point (the binary counterpart of the decimal point), whereas Double.MIN_VALUE is the smallest positive number you can represent without this constraint.
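
As a rough illustration of the leading-bit distinction, the sketch below prints both constants with Double.toHexString, which shows the digit in front of the binary point, and uses Math.getExponent to tell normal from subnormal values. The class name LeadingBitDemo and the helper hasLeadingOne are made up for this example.

```java
final class LeadingBitDemo {
    /** True for finite normal doubles, i.e. values whose significand starts with 1. */
    static boolean hasLeadingOne(double d) {
        int exp = Math.getExponent(d);
        // Math.getExponent returns MIN_EXPONENT - 1 for zero and subnormals,
        // and MAX_EXPONENT + 1 for infinities and NaN.
        return exp >= Double.MIN_EXPONENT && exp <= Double.MAX_EXPONENT;
    }

    public static void main(String[] args) {
        System.out.println(Double.toHexString(Double.MIN_NORMAL)); // 0x1.0p-1022
        System.out.println(Double.toHexString(Double.MIN_VALUE));  // 0x0.0000000000001p-1022
        System.out.println(hasLeadingOne(Double.MIN_NORMAL));      // true
        System.out.println(hasLeadingOne(Double.MIN_VALUE));       // false
    }
}
```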



Answer 2:

Tldr:

Double.MIN_NORMAL gives the smallest positive IEEE-754 binary64 "normal number" (also known as a normalized number). This equals 2^-1022, which is roughly 2.225 × 10^-308.

Double.MIN_VALUE gives the smallest positive IEEE-754 binary64 "subnormal number" (also known as a denormal or denormalized number). This equals 2^-1074, which is roughly 4.94 × 10^-324. (This is also the number given by .NET's Double.Epsilon.)

Tsdr:

To understand why these numbers are what they are and what's the difference between them, we'll have to look deeper. (Also read the answer by Bosonix.)

Consider the bit layout of the IEEE-754 binary64 format, where s is the sign bit, e is the 11-bit biased exponent and m is the 52-bit mantissa:

s_eee_eeee_eeee_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm

IEEE-754 binary64 values are decoded as follows:

  • If e is more than 0 and less than 2047 (2047 is 111_1111_1111 in binary),

    then the value equals (-1)^s × 2^(e-1023) × (1 + m × 2^-52). (These are the normal numbers.)

  • If e equals 0,

    then the value equals (-1)^s × 2^((e+1)-1023) × (0 + m × 2^-52). (Besides zero, these are the subnormal numbers.)

  • If e equals 2047 and m equals 0,

    then the value equals (-1)^s × infinity.

  • If e equals 2047 and m does not equal 0,

    then the value is NaN. (Unrelated fact: thus there are 2 × (2^52 - 1) different bit representations for NaN; cf. doubleToRawLongBits.)
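
The four cases above can be sketched in Java as follows. This is only an illustration, not how the JDK implements anything: the class Binary64Decoder and its decode method are hypothetical names, and Math.scalb is used to apply the power of two exactly.

```java
final class Binary64Decoder {
    /** Rebuilds a double from its raw bits by following the four cases. */
    static double decode(long bits) {
        int s = (int) (bits >>> 63);              // sign bit
        int e = (int) ((bits >>> 52) & 0x7FF);    // 11-bit biased exponent
        long m = bits & 0x000F_FFFF_FFFF_FFFFL;   // 52-bit mantissa
        double sign = (s == 0) ? 1.0 : -1.0;

        if (e == 2047) {
            // All exponent bits set: infinity when m == 0, otherwise NaN
            return (m == 0) ? sign * Double.POSITIVE_INFINITY : Double.NaN;
        }
        if (e == 0) {
            // Zero and subnormals: 2^(1-1023) × (m × 2^-52) = m × 2^-1074
            return sign * Math.scalb((double) m, -1074);
        }
        // Normals: 2^(e-1023) × (1 + m × 2^-52) = (2^52 + m) × 2^(e-1075)
        return sign * Math.scalb((double) ((1L << 52) + m), e - 1075);
    }

    public static void main(String[] args) {
        double[] samples = {1.0, -2.5, Double.MIN_VALUE, Double.MIN_NORMAL, Double.MAX_VALUE};
        for (double d : samples) {
            long bits = Double.doubleToRawLongBits(d);
            System.out.println(d + " -> " + decode(bits));
        }
    }
}
```

Every sample should print the same value on both sides of the arrow, since the decoder only reverses what doubleToRawLongBits exposes.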

Therefore, the smallest positive IEEE-754 binary64 normal number is equal to:

         (-1)^0 × 2^(1-1023) × (1 + 0 × 2^-52)

      = 2^-1022

And the smallest positive IEEE-754 binary64 subnormal number is equal to:

         (-1)^0 × 2^((0+1)-1023) × (0 + 1 × 2^-52)

      = 2^-1022 × 2^-52

      = 2^-1074
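
The two results can be checked directly against the bit patterns they come from. In the sketch below, the class and constant names are invented for the example; the pattern with e = 1, m = 0 should give Double.MIN_NORMAL and the pattern with e = 0, m = 1 should give Double.MIN_VALUE.

```java
final class SmallestDoubles {
    // e = 1, m = 0: only bit 52 (the lowest exponent bit) is set
    static final double SMALLEST_NORMAL = Double.longBitsToDouble(1L << 52);
    // e = 0, m = 1: only the lowest mantissa bit is set
    static final double SMALLEST_SUBNORMAL = Double.longBitsToDouble(1L);

    public static void main(String[] args) {
        System.out.println(SMALLEST_NORMAL == Double.MIN_NORMAL);                // true
        System.out.println(SMALLEST_SUBNORMAL == Double.MIN_VALUE);              // true
        // 2^-1022 × 2^-52 = 2^-1074, written with a hexadecimal literal
        System.out.println(Double.MIN_NORMAL * 0x1p-52 == Double.MIN_VALUE);     // true
        // The same powers of two built with Math.scalb
        System.out.println(Math.scalb(1.0, -1022) == Double.MIN_NORMAL);         // true
        System.out.println(Math.scalb(1.0, -1074) == Double.MIN_VALUE);          // true
    }
}
```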



Answer 3:

For simplicity, the explanation will consider just the positive numbers.

The spacing between two adjacent normalized floating point numbers x1 < x2 is at most 2 * epsilon * x1 (the normalized floating point numbers are not evenly spaced; they are roughly logarithmically spaced). This means that when a real number (i.e. the "mathematical" number) is rounded to the nearest floating point number, the relative error is at most epsilon. This constant is called the unit roundoff, and for double precision it has the value 2^-53 (approximately 1.11e-16). The name machine epsilon is used both for this value and for 2^-52 (approximately 2.22e-16), which is the gap between 1.0 and the next larger double.
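
The relative spacing of normal doubles can be observed with Math.ulp, which returns the distance from a value to the next double larger in magnitude. The following is a small sketch with a hypothetical helper relativeSpacing; for positive normal x the ratio should stay between 2^-53 and 2^-52, whatever the magnitude of x.

```java
final class RelativeSpacing {
    /** Gap to the next larger double, relative to x (x assumed positive and normal). */
    static double relativeSpacing(double x) {
        return Math.ulp(x) / x;
    }

    public static void main(String[] args) {
        double[] samples = {1.0, 1.5, Math.nextDown(2.0), 1e-300, 1e300};
        for (double x : samples) {
            double r = relativeSpacing(x);
            // The ratio is largest at a power of two and shrinks towards the next one
            System.out.println(x + ": " + r + " in range: " + (r >= 0x1p-53 && r <= 0x1p-52));
        }
        // The gap between 1.0 and the next double is exactly 2^-52
        System.out.println(Math.ulp(1.0) == 0x1p-52);
    }
}
```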

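The positive floating point numbers smaller than Double.MIN_NORMAL are called subnormals, and they fill the gap between 0 and Double.MIN_NORMAL evenly, with a constant spacing of Double.MIN_VALUE. Because the spacing no longer shrinks along with the value, the relative rounding error grows as a subnormal gets closer to 0, so computations involving subnormals can give less accurate results. The benefit is gradual underflow: a result smaller than Double.MIN_NORMAL loses precision bit by bit instead of being flushed straight to 0.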
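
A short sketch of both effects, with made-up class and helper names: the difference of two nearby tiny numbers stays nonzero thanks to subnormals, while a subnormal with only a few significant bits rounds coarsely.

```java
final class GradualUnderflowDemo {
    /** True when d is nonzero and smaller in magnitude than Double.MIN_NORMAL. */
    static boolean isSubnormal(double d) {
        return d != 0.0 && Math.abs(d) < Double.MIN_NORMAL;
    }

    public static void main(String[] args) {
        // Gradual underflow: a - b is below MIN_NORMAL but still not zero
        double a = 1.5 * Double.MIN_NORMAL;
        double b = Double.MIN_NORMAL;
        double diff = a - b; // 2^-1023
        System.out.println(diff != 0.0 && isSubnormal(diff));

        // Subnormals are evenly spaced: the gap is Double.MIN_VALUE everywhere
        System.out.println(Math.ulp(Double.MIN_VALUE) == Double.MIN_VALUE);
        System.out.println(Math.ulp(Math.nextDown(Double.MIN_NORMAL)) == Double.MIN_VALUE);

        // Precision loss: 1.25 units cannot be stored, so it rounds to 1 unit
        System.out.println((5 * Double.MIN_VALUE) / 4 == Double.MIN_VALUE);
        // Half a unit is a tie between 0 and 1 unit and rounds to the even side, 0
        System.out.println(Double.MIN_VALUE / 2 == 0.0);
    }
}
```

Each line is expected to print true under the default round-to-nearest-even arithmetic that Java uses for double.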