I have the following line of code.
hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));
The method void onBeingHit(int decHP) accepts an integer and updates the hero's health points. float getDefensePercent() is a getter returning the hero's defense percentage. ENEMY_ATTACK_POINT is a macro constant defined as
#define ENEMY_ATTACK_POINT 20
Let's say hero->getDefensePercent() returns 0.1. So the calculation is
20 * (1.0 - 0.1) = 20 * (0.9) = 18
When I tried it with the following code (no f appended to 1.0)
hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));
I got 17.
But for the following code (f appended after 1.0)
hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0f - hero->getDefensePercent()));
I got 18.
What's going on? Is the f significant at all, given that hero->getDefensePercent() already returns a float?
What's going on? Why isn't the integer result 18 in both cases?

The problem is that the result of the floating point expression is rounded towards zero when it is converted to an integer value (in both cases).
0.1 can't be represented exactly as a floating point value (in both cases). The compiler converts it to a binary IEEE754 floating point number and decides whether to round up or down to a representable value. The processor then multiplies this value at runtime, and the result is rounded towards zero to get an integer value.

Ok, but since both
double and float behave like that, why do I get 18 in one of the two cases, but 17 in the other case? I'm confused.

Your code takes the result of the function,
0.1f (a float), and then calculates 20 * (1.0 - 0.1f), which is a double expression, while 20 * (1.0f - 0.1f) is a float expression. Now the float version happens to round up to 18.0 in single precision and is converted to 18, while the double expression is slightly less than 18.0 (about 17.99999997) and gets rounded down to 17.

If you don't know exactly how IEEE754 binary floating point numbers are constructed from decimal numbers, it's pretty much random whether the result will be slightly less or slightly greater than the decimal number you've entered in your code. So you shouldn't count on this. Don't try to fix such an issue by appending
f to one of the numbers and saying "now it works, so I'll leave this f there", because another value will behave differently again.

Why does the type of the expression depend on the presence of this
f?

This is because a floating point literal in C and C++ is of type double by default. If you add the f, it's a float. The result of a floating point expression is of the "greater" type. An expression combining a double and an integer is still a double, and one combining an int and a float is a float. So the result of your expression is either a float or a double.

Ok, but I don't want to round to zero. I want to round to the nearest number.
To fix this issue, add one half to the result before converting it to an integer:
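A minimal sketch of the add-one-half idea, applied to the damage formula from the question. The helper names roundHalfUp and damageAfterDefense are made up for illustration and are not part of the original code; the helper is only meant for non-negative values.

```cpp
#define ENEMY_ATTACK_POINT 20

// Hypothetical helper (not from the original post): round to the nearest
// integer by adding one half and letting the conversion to int truncate.
// Only intended for non-negative values.
int roundHalfUp(double value) {
    return static_cast<int>(value + 0.5);
}

// Hypothetical wrapper around the expression from the question:
// the damage is rounded to the nearest integer instead of truncated.
int damageAfterDefense(float defensePercent) {
    return roundHalfUp(ENEMY_ATTACK_POINT * (1.0 - defensePercent));
}
```

With a defense percent of 0.1f, the intermediate value of roughly 17.99999997 becomes roughly 18.49999997 before truncation, so the call would yield 18 with or without the f suffix.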
In C++11, there is std::round() for that. In previous versions of the standard, there was no such function to round to the nearest integer. (Please see comments for details.)

If you don't have std::round, you can write it yourself. Take care when dealing with negative numbers. When converting to an integer, the number will be truncated (rounded towards zero), which means that negative values will be rounded up, not down. So we have to subtract one half if the number is negative:

This happens because the calculation is carried out more precisely when double is used, i.e. with 1.0.

Try rounding your result, which will give a more accurate integer after conversion:
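One possible way to write this, shown as a sketch rather than the original answer's code. The function names roundToNearest and roundWithStd are hypothetical; std::lround is the standard C++11 function from <cmath>.

```cpp
#include <cmath>

// Hypothetical fallback for toolchains without std::round/std::lround:
// add one half for non-negative values, subtract it for negative ones,
// then let the conversion to int truncate towards zero.
int roundToNearest(double value) {
    return static_cast<int>(value < 0.0 ? value - 0.5 : value + 0.5);
}

// With C++11 or later, the standard library can do the rounding directly.
int roundWithStd(double value) {
    return static_cast<int>(std::lround(value));
}
```

Either function could wrap the expression passed to onBeingHit, for example roundWithStd(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent())).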
Note that adding 0.5 and truncating to int right afterwards rounds the result: a value of 17.999... becomes 18.499..., which is truncated to 18.
1.0 is interpreted as a double, as opposed to 1.0f which is seen by the compiler as a float.
The f suffix simply tells the compiler which is a float and which is a double.
As the name implies, a double has 2x the precision of float. In general a double has 15 to 16 decimal digits of precision, while float only has 7.
This loss of precision makes it much easier for truncation errors to surface.
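To make the effect of the suffix concrete, here is a small sketch that reproduces the two cases from the question. The free function getDefensePercent is a hypothetical stand-in for hero->getDefensePercent(), and the float result is stored in a float variable first so that the example does not depend on extended intermediate precision (an assumption about the platform, not something stated in the question).

```cpp
#include <type_traits>

// Hypothetical stand-in for hero->getDefensePercent().
float getDefensePercent() { return 0.1f; }

// The literal decides the type of the whole expression.
static_assert(std::is_same<decltype(20 * (1.0 - 0.1f)), double>::value,
              "1.0 is a double literal, so the expression is a double");
static_assert(std::is_same<decltype(20 * (1.0f - 0.1f)), float>::value,
              "1.0f is a float literal, so the expression stays a float");

// Double version: about 17.99999997, which truncates to 17.
int damageDouble() {
    double d = 20 * (1.0 - getDefensePercent());
    return static_cast<int>(d);
}

// Float version: the result rounds to 18.0f, which converts to 18.
int damageFloat() {
    float f = 20 * (1.0f - getDefensePercent());
    return static_cast<int>(f);
}
```

The difference comes from where the rounding happens, not from one type being "right": the double version keeps the small representation error of 0.1f visible, while the float version happens to round it away.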
See MSDN (C++)