The discussion started under my answer to another question. The following code determines machine epsilon:
float compute_eps() {
float eps = 1.0f;
while (1.0f + eps != 1.0f)
eps /= 2.0f;
return eps;
}
In the comments it was proposed that the 1.0f + eps != 1.0f
test might fail because C++ standard permits the use of extra precision. Although I'm aware that floating-point operations are actually performed in higher precision (than specified by the actual types used), I happen to disagree with this proposal.
I doubt that during the comparison operations, such as ==
or !=
, the operands are not truncated to the precision of their type. In other words, 1.0f + eps
can of course be evaluated with the precision higher than float
(for example, long double
), and the result will be stored in the register that can accommodate long double
. However, I think that before performing the !=
operation left operand will be truncated from long double
to float
, hence the code can never fail to determine eps
precisely (i.e. it can never do more iterations than intended).
I haven't found any clue on this particular case in C++ standard. Furthermore, the code works fine and I'm sure that the extra precision technique is used during its execution because I have no doubt that any modern desktop implementation in fact uses extra precision during calculations.
What do you think about it?
Sorry that this example is C and not C++. It should not be difficult to adapt:
gcc -msse2 -mfpmath=sse
is a strict IEEE 754 compiler. With that compiler, theif
is never taken. However,gcc -mno-sse2 -mfpmath=387
has to use the 387 unit with its higher precision. It does not reduce the precision before the!=
test. The test ends up comparing the extended-precision result of 3. / 7. on the right-hand side to the double-precision result of the same division on the left-hand side. This cause a behavior that may appear strange.Both
gcc -msse2 -mfpmath=sse
andgcc -mno-sse2 -mfpmath=387
are standard-compliant. It is only the case that the former has it easy, generating SSE2 instructions, and thus can provide a strict IEEE 754 implementation, whereas the latter has to do its best with an ancient instruction set.A loop such as:
with
eps1
declared of typefloat
should be more robust with respect to extended precision.The compiler that generates x87 code that does not truncate before comparison is this one:
Here is another: