Is it 52 or 53 bits of floating point precision?

Posted 2019-07-04 19:00

I keep on seeing this nonsense about 53 bits of precision in 64-bit IEEE floating point representation. Would someone please explain to me how in the world a bit that is stuck with a 1 in it contributes ANYTHING to the numeric precision? If you had a floating point unit with bit0 stuck-on with 1, you would of course know that it produces 1 less bit of precision than normally. Where are those sensibilities on this?

Further, just the exponent, the scaling factor without the mantissa, completely specifies exactly where the leading bit of the number is, so no leading bit is ever used. The 53rd bit is about as real as the 19th hole. It is merely a (useful) crutch to aid the human mind and the logic for accessing such values in binary. To claim otherwise is double counting.

Either all the books and articles claiming this 53rd bit nonsense are wrong, or I am an idiot. But a stuck bit is a stuck bit. Let's hear the arguments to the contrary.

3 answers
Evening l夕情丶
#2 · 2019-07-04 19:51

It's not stuck. The exponent moves the "stuck" bit around, so it is not stuck at one position. Moreover, zero already has its own representation, so any value other than zero must have at least one 1 bit, and therefore you don't have to store that leading 1 bit. As a result, the implied 1 bit makes normalization easier and adds a bit of precision to the value.
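A quick way to check that the extra bit really carries precision is to look for an integer that needs all 53 significant bits and see whether it survives a round trip through a double. This is only a sketch, and it assumes Python's `float` is an IEEE 754 binary64 value, which holds on mainstream platforms; the variable names are made up for illustration.

```python
import sys

# Number of significand bits the platform reports for float.
print(sys.float_info.mant_dig)

largest_53_bit = 2**53 - 1      # fifty-three 1 bits in a row
first_odd_54_bit = 2**53 + 1    # needs 54 significant bits

# Python compares float and int exactly, so these tests are not fooled by rounding.
survives = float(largest_53_bit) == largest_53_bit
lost = float(first_odd_54_bit) != first_odd_54_bit

print(survives)  # the 53-bit integer is stored exactly
print(lost)      # the 54-bit integer is rounded to a neighbour
```

If only 52 bits of precision were available, `2**53 - 1` could not be stored exactly either.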

欢心
#3 · 2019-07-04 19:53

The mathematical significand[1] of an IEEE-754 64-bit binary floating-point object has 53 bits. It is encoded with the combination of a 52-bit field exclusively for the significand and some information from the exponent field that indicates whether the 53rd bit is 0 or 1.

Since the main significand field is 52 bits, some people refer to the significand as 52 bits, but this is sloppy terminology. The significand field does not contain all the information about the significand, and the complete significand is 53 bits.

It is not true that the leading bit of the significand is never used (as anything other than 1). When the encoding of the exponent is zero, the leading bit of the significand is 0 instead of the more frequent 1.
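To make the encoding concrete, here is a rough sketch that pulls the three fields out of a double and rebuilds the full 53-bit significand from the 52-bit field plus the exponent field. The function name `decode_binary64` is hypothetical, and the sketch only handles finite values.

```python
import struct

def decode_binary64(x: float):
    """Return (sign, exponent_field, fraction_field, significand) for a finite double.

    `significand` is the full 53-bit integer significand, including the
    leading bit that is implied by the exponent field rather than stored.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent_field = (bits >> 52) & 0x7FF
    fraction_field = bits & ((1 << 52) - 1)
    # An all-zeros exponent field means the leading bit is 0; otherwise it is 1.
    # (The all-ones exponent field is reserved for infinities and NaNs.)
    leading_bit = 0 if exponent_field == 0 else 1
    significand = (leading_bit << 52) | fraction_field
    return sign, exponent_field, fraction_field, significand

for value in (1.0, 1.5, 5e-324):
    sign, exp_field, frac, sig = decode_binary64(value)
    # Print the significand as 53 binary digits, leading bit first.
    print(value, exp_field, f"{sig:053b}")
```

For `1.0` and `1.5` the first printed digit is 1 even though it is not stored anywhere in the 52-bit field; for `5e-324` the exponent field is zero and the first digit is 0.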


[1] “Significand” is the preferred term, not “mantissa.” A significand is linear, a mantissa is logarithmic.

We Are One
#4 · 2019-07-04 20:02

The key concept here is "normalization". In general scientific notation, every value has many representations. That makes arithmetic, especially comparisons, more difficult than necessary. The common solution is to require the most significant digit of the significand to be non-zero. For example, the first floating point system I worked with was base 16, and the leading digit of the significand was in the range 1 through F.

That has a special effect for binary floating point. The most significant bit of the significand is a non-zero bit. There is no point wasting one of the limited number of bits in the physical representation on a bit that is known to be non-zero.

Normal numbers in IEEE 754 64-bit binary have a 53 bit significand whose implicit leading bit is known to be 1, and with the remaining 52 bits stored in the physical representation.
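One way to see the split between the implicit bit and the stored bits without any bit twiddling is Python's `float.hex()`, which writes the leading bit before the point and the stored field after it. This is just an illustration, assuming CPython's usual output format of 13 hex digits after the point.

```python
x = 0.1
text = x.hex()                   # looks like '0x1.<13 hex digits>p<exponent>'
head, rest = text.split(".")
hex_digits = rest.split("p")[0]

print(text)
print(head)                      # the leading bit, written out explicitly
print(len(hex_digits) * 4)       # bits that are physically stored in the field
```

The digit before the point is the one bit that is known and therefore not stored; the 13 hex digits after it are the 52 stored bits.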

There being no such thing as a free lunch, there is a cost to this. The cost is a limitation on how small a number can be stored with a given exponent. For most exponents that does not matter - the number just gets stored with a smaller exponent, and still with a leading one bit that does not need to be stored.

It would be a real limitation for the smallest exponent, because there is no smaller exponent to use. IEEE 754 binary floating point solves that by storing very small magnitude numbers, those with an all-zeros exponent field, differently. They have at most 52 significant bits, all stored, with leading zeros permitted. That allows very small magnitude numbers to be represented as non-zero numbers, at a cost of reduced precision.
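The loss of precision near the bottom of the range can be sketched by counting how many significant bits are left at a given magnitude. The helper `available_bits` is a made-up name, and it assumes a finite, non-zero binary64 input.

```python
import struct
import sys

def available_bits(x: float) -> int:
    """Significant bits available at x's magnitude (finite, non-zero x)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    exponent_field = (bits >> 52) & 0x7FF
    fraction_field = bits & ((1 << 52) - 1)
    if exponent_field != 0:
        return 53  # normal: implicit leading 1 plus 52 stored bits
    # Zero exponent field: no implicit 1, and leading zeros in the
    # stored field carry no information.
    return fraction_field.bit_length()

smallest_normal = sys.float_info.min            # 2**-1022
print(available_bits(smallest_normal))          # 53
print(available_bits(smallest_normal / 2))      # 52
print(available_bits(smallest_normal / 2**20))  # 33
print(available_bits(5e-324))                   # 1
```

Each halving below the smallest normal number costs one more bit, down to a single bit for the smallest positive value.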

Infinities and NaNs are stored differently, with the all ones exponent.
