I am trying to affect the translation of a 3D model using some UI buttons to shift the position by 0.1 or -0.1.
My model position is a three dimensional float so simply adding 0.1f to one of the values causes obvious rounding errors. While I can use something like BigDecimal to retain precision, I still have to convert it from a float and back to a float at the end and it always results in silly numbers that are making my UI look like a mess.
I could just pretty the displayed values but the rounding errors will only get worse with more editing and they make my save files rather hard to read.
So how do I actually avoid these errors when I need to use a float?
I would use a Rational
class. There are many out there - this one looks like it should work.
One significant cost will be when the Rational
is rendered into a float
and one when the denominator is reduced to the gcd
. The one I posted keeps the numerator and denominator in fully reduced state at all times which should be quite efficient if you are always adding or subtracting 1/10.
This implementation holds the values normalised (i.e. with consistent sign) but unreduced.
You should choose your implementation to best fit your usage.
The Kahan summation and pairwise summation algorithms help to reduce floating point errors. Here's some Java code for the Kahan algorithm.
A simple solution is to either use fixed precision. i.e. an integer 10x or 100x what you want.
float f = 10;
f += 0.1f;
becomes
int i = 100;
i += 1; // use an many times as you like
// use i / 10.0 as required.
I wouldn't use float
in any case as you get more rounding errors than double
for next to no benefit (unless you have millions of float values) double
gives you 8 more digits of precision and with sensible rounding would won't see those errors.
If you stick with floats:
The easiest way to avoid the error is using floats which are exact, but
near the desired value which is
round(2^n * value) * 1/2^n.
n is the number of bits, value the number to use (in your case 0.1)
In your case with increasing precision:
n = 4 => 0.125
n = 8 (byte) => 0.9765625
n = 16 (short)=> 0.100006103516....
The long number chains are artefacts of the binary conversion,
the real number has much less bits.
As the floats are exact, addition and subtraction will
not introduce offset errors, but will always be
predictable as long as the number of bits is
not longer than the float value holds.
If you fear that your display will be compromised by
using this solution (because they are odd floats), use
and store only integers (step increase -1/1).
The final value which is internally set is
x = value * step.
As the step increases or decreases by an amount of 1,
precision will be retained.