Floating point calculation gives different results

Posted 2019-06-24 03:08

Question:

I have the following line of code.

hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));
  • void onBeingHit(int decHP) method accepts integer number and updates health points.
  • float getDefensePercent() method is a getter method returning the defense percent of a hero.
  • ENEMY_ATTACK_POINT is a macro constant factor defined as #define ENEMY_ATTACK_POINT 20.

Let's say hero->getDefensePercent() returns 0.1. So the calculation is

20 * (1.0 - 0.1)  =  20 * (0.9)  =  18

Whenever I tried it with the following code (no f appending 1.0)

hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));

I got 17.

But for the following code (f appended after 1.0)

hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0f - hero->getDefensePercent()));

I got 18.

What's going on? Does the f suffix matter at all, given that hero->getDefensePercent() already returns a float?

Answer 1:

What's going on? Why isn't the integer result 18 in both cases?

The problem is that the result of the floating point expression is rounded towards zero when being converted to an integer value (in both cases).

0.1 can't be represented exactly as a binary floating point value (in both cases). The compiler converts the literal to a binary IEEE 754 floating point number, rounding it up or down to the nearest representable value. The processor then multiplies this value at run time, and the result is truncated (rounded towards zero) to get an integer value.
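The two effects described above can be seen in isolation. This is a minimal sketch assuming IEEE 754 float and double; the helper names are hypothetical and only serve the illustration.

```cpp
#include <cstdio>

// Widening a float to double is exact, so this exposes the value that is
// actually stored for the literal 0.1f.
double stored_value_of_float_tenth() {
    return static_cast<double>(0.1f);
}

// Converting a floating point value to int discards the fractional part,
// i.e. it rounds towards zero.
int truncate_to_int(double x) {
    return static_cast<int>(x);
}

// Prints both approximations of 0.1 with more digits than either type holds.
void print_tenths() {
    std::printf("0.1f = %.20f\n", stored_value_of_float_tenth());
    std::printf("0.1  = %.20f\n", 0.1);
}
```

Neither printed value is exactly 0.1, and truncate_to_int(17.999) yields 17, not 18.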

Ok, but since both double and float behave like that, why do I get 18 in one of the two cases, but 17 in the other case? I'm confused.

Your code takes the result of the function, 0.1f (a float), and then calculates 20 * (1.0 - 0.1f), which is a double expression, while 20 * (1.0f - 0.1f) is a float expression. The float version happens to round to exactly 18.0 at float precision and is converted to 18, while the double expression keeps enough digits to stay slightly below 18.0 (about 17.99999997) and is truncated to 17.
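To reproduce the difference outside the game code, here is a small sketch. The function names are hypothetical stand-ins for the two call sites, and the results assume IEEE 754 arithmetic with float expressions evaluated in float precision (FLT_EVAL_METHOD of 0, as on typical x86-64 and ARM64 builds).

```cpp
// Mirrors the call without the suffix: 1.0 is a double literal, so both
// defense and attack are converted to double before the arithmetic.
int damage_with_double_literal(int attack, float defense) {
    return static_cast<int>(attack * (1.0 - defense));
}

// Mirrors the call with the suffix: every operand is float (attack is
// converted to float), so each step is rounded to float precision.
int damage_with_float_literal(int attack, float defense) {
    return static_cast<int>(attack * (1.0f - defense));
}
```

Under those assumptions, damage_with_double_literal(20, 0.1f) gives 17 and damage_with_float_literal(20, 0.1f) gives 18, matching the observation in the question.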

If you don't know exactly how IEEE 754 binary floating point numbers are constructed from decimal numbers, it is hard to predict whether the stored value will be slightly less or slightly greater than the decimal number you've entered in your code. So you shouldn't count on this. Don't try to fix such an issue by appending f to one of the numbers and saying "now it works, so I'll leave this f there", because another value may behave differently again.

Why does the type of the expression depend on the presence of this f?

This is because a floating point literal in C and C++ is of type double by default. If you add the f, it's a float. The result of a floating point expression has the "greater" of its operand types. An expression combining a double and an int is still a double, just as one combining an int and a float is a float. So the result of your expression is either a float or a double.
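The conversion rules can be checked at compile time. The following is a small sketch using decltype and std::is_same from C++11; the literal values are taken from the question.

```cpp
#include <type_traits>

// int * (double - float): the float and the int are converted to double.
static_assert(std::is_same<decltype(20 * (1.0 - 0.1f)), double>::value,
              "mixing in a double literal makes the whole expression double");

// int * (float - float): the int is converted to float.
static_assert(std::is_same<decltype(20 * (1.0f - 0.1f)), float>::value,
              "with only float and int operands the expression is float");
```

If either assumption about the types were wrong, the file would fail to compile.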

Ok, but I don't want to round to zero. I want to round to the nearest number.

To fix this issue, add one half to the result before converting it to an integer:

hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()) + 0.5);

In C++11, there is std::round() for that. In previous versions of the standard, there was no such function to round to the nearest integer. (Please see comments for details.)
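As a sketch of the C++11 route, std::lround from <cmath> rounds to the nearest integer (halfway cases away from zero) and returns a long. The wrapper name below is hypothetical, not part of the original code.

```cpp
#include <cmath>

// Computes the damage and rounds it to the nearest integer instead of
// truncating. std::lround handles negative values correctly as well.
int rounded_damage(int attack, float defense) {
    return static_cast<int>(std::lround(attack * (1.0 - defense)));
}
```

With this helper, rounded_damage(20, 0.1f) yields 18 regardless of whether the literal is written as 1.0 or 1.0f.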

If you don't have std::round, you can write it yourself. Take care when dealing with negative numbers. When converting to an integer, the number will be truncated (rounded towards zero), which means that negative values will be rounded up, not down. So we have to subtract one half if the number is negative:

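int round_to_int(double x) {
    // Shift by one half away from zero, then let the conversion truncate.
    // Named round_to_int to avoid clashing with round() from <cmath>.
    return static_cast<int>((x < 0.0) ? (x - 0.5) : (x + 0.5));
}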


Answer 2:

1.0 is interpreted as a double, as opposed to 1.0f which is seen by the compiler as a float.

The f suffix simply tells the compiler that a literal is a float rather than a double.

As the name implies, a double uses twice the storage of a float (typically 64 bits versus 32). Its significand is more than twice as long, so in general a double has 15 to 16 decimal digits of precision, while a float only has about 7.

With the lower precision of float, rounding and truncation errors surface much more easily.
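The precision figures can be queried from the implementation through std::numeric_limits. This is a minimal sketch; the values in the comments assume IEEE 754 binary32 and binary64, which is what most platforms use.

```cpp
#include <limits>

// Number of binary digits in the significand (including the implicit bit).
constexpr int float_significand_bits = std::numeric_limits<float>::digits;    // 24
constexpr int double_significand_bits = std::numeric_limits<double>::digits;  // 53

// Number of decimal digits guaranteed to survive a round trip through the type.
constexpr int float_decimal_digits = std::numeric_limits<float>::digits10;    // 6
constexpr int double_decimal_digits = std::numeric_limits<double>::digits10;  // 15
```

The guaranteed figures (6 and 15) are slightly lower than the commonly quoted 7 and 15 to 16, because the latter describe typical rather than worst-case precision.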

See MSDN (C++)



Answer 3:

This happens because the calculation is carried out with more precision when a double literal, i.e. 1.0, is used: the result stays slightly below 18 instead of being rounded to exactly 18.

Try rounding your result, which gives the expected integer after conversion:

hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()) + 0.5);

Note that adding 0.5 and then truncating to int rounds the result to the nearest integer (for non-negative values), so when your result is 17.999..., it becomes 18.499..., which is truncated to 18.