Printing the integral part of a floating point number

Posted 2019-02-18 11:49

I am trying to figure out how to print floating point numbers without using library functions. Printing the decimal part of a floating point number turned out to be quite easy. Printing the integral part is harder:

#include <assert.h>
#include <math.h>
#include <stdio.h>

static const int base = 2;
static const char hex[] = "0123456789abcdef";

void print_integral_part(float value)
{
    assert(value >= 0);
    char a[129]; // worst case is 128 digits for base 2 plus NUL
    char * p = a + 128;
    *p = 0;
    do
    {
        int digit = fmod(value, base);
        value /= base;
        assert(p > a);
        *--p = hex[digit];
    } while (value >= 1);
    printf("%s", p);
}

Printing the integral part of FLT_MAX works flawlessly with base 2 and base 16:

11111111111111111111111100000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000 (base 2)

ffffff00000000000000000000000000 (base 16)

However, printing in base 10 results in errors after the first 7 digits:

340282368002860660002286082464244022240 (my own function)
340282346638528859811704183484516925440 (printf)

I assume this is a result of the division by 10. It gets better if I use double instead of float:

340282346638528986604286022844204804240 (my own function)
340282346638528859811704183484516925440 (printf)

(If you don't believe printf, enter 2^128-2^104 into Wolfram Alpha. It is correct.)

Now, how does printf manage to print the correct result? Does it use some bigint facilities internally? Or is there some floating point trick I am missing?

7 answers
对你真心纯属浪费
#2 · 2019-02-18 12:19

Let's explain this one more time. After the integer part has been printed (exactly), without any rounding other than chopping towards 0, it's time for the decimal bits.

Start with a string of bytes (say 100 for starters) containing binary zeros. If the first bit to the right of the decimal point in the fp value is set, that means that 0.5 (2^-1 or 1/(2^1)) is a component of the fraction. So add 5 to the first byte. If the next bit is set, 0.25 (2^-2 or 1/(2^2)) is part of the fraction, so add 5 to the second byte and add 2 to the first (oh, don't forget the carries, they happen; lower school math). The next bit set means 0.125, so add 5 to the third byte, 2 to the second and 1 to the first. And so on:

      bit weight     running sum (every bit so far set)
start 0              0000000000000000000 ...
bit 1 0.5            5000000000000000000 ...
bit 2 0.25           7500000000000000000 ...
bit 3 0.125          8750000000000000000 ...
bit 4 0.0625         9375000000000000000 ...
bit 5 0.03125        9687500000000000000 ...
bit 6 0.015625       9843750000000000000 ...
bit 7 0.0078125      9921875000000000000 ...
bit 8 0.00390625     9960937500000000000 ...
bit 9 0.001953125    9980468750000000000 ...
...

I did this by hand so I may have missed something but to implement this in code is trivial.
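
The steps above can be sketched in C as follows. This is a minimal sketch, not a reference implementation: the names fraction_digits and NDIGITS are made up for illustration, the input is assumed to be a float in [0, 1), and the bits are peeled off by repeated doubling (which is exact in binary floating point) rather than by inspecting the bit pattern.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical buffer size: a float's lowest possible bit is 2^-149,
   which needs 149 decimal digits, so 160 leaves some headroom. */
#define NDIGITS 160

/* Writes the exact decimal digits of frac (0 <= frac < 1) into out, without
   the leading "0." and with trailing zeros trimmed.
   out must have room for NDIGITS + 1 chars. */
void fraction_digits(float frac, char *out)
{
    unsigned char sum[NDIGITS];  /* decimal digits of the running total */
    unsigned char pw[NDIGITS];   /* decimal digits of the current 2^-k */
    int i, len;

    assert(frac >= 0 && frac < 1);
    memset(sum, 0, sizeof sum);
    memset(pw, 0, sizeof pw);
    pw[0] = 5;                   /* 2^-1 = 0.5 */

    while (frac != 0) {
        frac *= 2;               /* exact: moves the next bit in front of the point */
        if (frac >= 1) {         /* the bit is set: add 2^-k to the total */
            int carry = 0;
            frac -= 1;
            for (i = NDIGITS - 1; i >= 0; --i) {
                int d = sum[i] + pw[i] + carry;
                sum[i] = (unsigned char)(d % 10);
                carry = d / 10;
            }
        }
        /* halve pw with schoolbook long division to get 2^-(k+1) */
        {
            int rem = 0;
            for (i = 0; i < NDIGITS; ++i) {
                int cur = rem * 10 + pw[i];
                pw[i] = (unsigned char)(cur / 2);
                rem = cur % 2;
            }
        }
    }

    len = NDIGITS;
    while (len > 1 && sum[len - 1] == 0)
        --len;                   /* trim trailing zeros, keep at least one digit */
    for (i = 0; i < len; ++i)
        out[i] = (char)('0' + sum[i]);
    out[len] = '\0';
}
```

Calling fraction_digits(0.625f, buf) should leave "625" in buf, and a value with the first nine fraction bits set (0.998046875f) reproduces the last row of the table.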

So for all those SO "can't get an exact result using float" people who don't know what they're talking about, here is proof that floating point fraction values are perfectly exact. Excruciatingly exact. But binary.

For those who take the time to get their heads around how this works, better precision is well within reach. As for the others ... well, I guess they'll keep on not browsing the fora for the answer to a question which has been answered numerous times previously, honestly believe they have discovered "broken floating point" (or whatever they call it) and post a new variant of the same question every day.

"Close to magic," "dark incantation" - that's hilarious!

Emotional °昔
#3 · 2019-02-18 12:25

I believe the problem lies in value /= base;. Do not forget that 1/10 is not a finite fraction in the binary system, so the quotient usually has to be rounded and is rarely exact. I also assume that fmod then produces wrong digits for the same reason, since it is applied to the already rounded value.

printf will first compute the integral part exactly and only then convert it to decimal (if I understand correctly how you print the integral part).

霸刀☆藐视天下
#4 · 2019-02-18 12:27

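This program will work for you, as long as the value fits in an int.

#include <stdio.h>

int main(void)
{
    float num;
    int z;
    if (scanf("%f", &num) != 1)  /* bail out on malformed input */
        return 1;
    z = (int)num;  /* truncates toward zero; undefined if num is outside the range of int */
    printf("the integral part of the floating point number is %d\n", z);
    return 0;
}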
一夜七次
#5 · 2019-02-18 12:32

It appears that the workhorse for the float-to-string conversion is the dtoa() function. See dtoa.c in newlib for how they do it.

Now, how does printf manage to print the correct result?

I think it is close to magic. At least the source looks like some kind of dark incantation.

Does it use some bigint facilities internally?

Yes, search for _Bigint in the linked source file.

Or is there some floating point trick I am missing?

Likely.
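
To make the bigint idea concrete, here is a minimal sketch that prints the integral part exactly. It is not how dtoa.c does it; it is a much simpler stand-in that assumes float is IEEE 754 binary32, and the names div10 and integral_part_to_string are made up for illustration. The significand is placed into a 128-bit integer made of four 32-bit words, and the decimal digits come from repeated integer division by 10, so nothing is ever rounded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Divides the 128-bit number in limb[] (most significant word first) by 10
   in place and returns the remainder, i.e. the next decimal digit. */
static unsigned div10(uint32_t limb[4])
{
    uint64_t rem = 0;
    int i;
    for (i = 0; i < 4; ++i) {
        uint64_t cur = (rem << 32) | limb[i];
        limb[i] = (uint32_t)(cur / 10);
        rem = cur % 10;
    }
    return (unsigned)rem;
}

/* Writes the exact decimal integral part of a finite, non-negative float
   into out, which must hold at least 40 chars (39 digits plus NUL). */
void integral_part_to_string(float value, char *out)
{
    uint32_t bits, exp, mant;
    uint32_t limb[4] = {0, 0, 0, 0};
    uint64_t wide;
    int shift, word;
    char buf[40];
    char *p = buf + 39;

    memcpy(&bits, &value, sizeof bits);        /* assumes IEEE 754 binary32 */
    exp = (bits >> 23) & 0xFF;
    mant = bits & 0x7FFFFF;
    assert((bits >> 31) == 0 && exp != 0xFF);  /* non-negative and finite */

    /* value == mant * 2^shift, with mant a 24-bit integer */
    if (exp == 0) {
        shift = -149;                          /* subnormal: no implicit 1 bit */
    } else {
        mant |= 0x800000;                      /* restore the implicit 1 bit */
        shift = (int)exp - 150;
    }
    if (shift < 0) {                           /* chop the fraction bits */
        mant = (shift > -24) ? mant >> -shift : 0;
        shift = 0;
    }

    /* place mant at bit position shift (0..104) of the 128-bit integer */
    word = shift / 32;
    wide = (uint64_t)mant << (shift % 32);
    limb[3 - word] = (uint32_t)wide;
    if (word < 3)
        limb[2 - word] = (uint32_t)(wide >> 32);

    *p = '\0';
    do {
        *--p = (char)('0' + div10(limb));      /* least significant digit first */
    } while (limb[0] | limb[1] | limb[2] | limb[3]);
    strcpy(out, p);
}
```

For 3.40282347e+38F this should yield 340282346638528859811704183484516925440, the same digits printf produces. Four words are enough only because the largest float is below 2^128; a double would need a larger array.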

贪生不怕死
#6 · 2019-02-18 12:32

As Agent_L's answer says, you are seeing a wrong result caused by dividing the value by 10. A float, like any binary floating point type, cannot represent most decimal fractions exactly. After a division, the result usually does not fit into the binary format, so it is rounded. Hence the more often you divide, the more error accumulates.

If the number is not very large, a quick solution is to multiply it by 10, or by a power of 10, depending on how many digits after the decimal point you need.
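
A rough sketch of that scaling trick, under loud assumptions: the helper name format_scaled is made up, the value is non-negative, at most 9 decimals are requested, and the scaled value stays below 1e18 so it fits in an unsigned long long. The multiplication itself can still round, so this is a quick approximation and not an exact conversion.

```c
#include <assert.h>
#include <stdio.h>

/* Formats value with `decimals` digits after the point by scaling it to an
   integer first. Only meant for small, non-negative values. */
void format_scaled(double value, int decimals, char *out, size_t size)
{
    unsigned long long scale = 1;
    unsigned long long scaled;
    int i;

    assert(value >= 0 && decimals >= 0 && decimals <= 9);
    for (i = 0; i < decimals; ++i)
        scale *= 10;                            /* scale = 10^decimals */
    assert(value * (double)scale < 1e18);       /* must fit in unsigned long long */

    /* add 0.5 so the conversion rounds to nearest instead of truncating */
    scaled = (unsigned long long)(value * (double)scale + 0.5);

    if (decimals == 0)
        snprintf(out, size, "%llu", scaled);
    else
        snprintf(out, size, "%llu.%0*llu",
                 scaled / scale, decimals, scaled % scale);
}
```

With a 32-byte buffer, format_scaled(3.25, 2, buf, sizeof buf) should produce "3.25". From here on all digit extraction happens on an integer, which is why the repeated floating point divisions from the question are no longer needed.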

Another way was described here

成全新的幸福
#7 · 2019-02-18 12:37

/Edit: Read Unni's answer first. These results come from http://codepad.org/TLqQzLO3.

void print_integral_part(float value)
{
    printf("input : %f\n", value);
    char a[129]; // worst case is 128 digits for base 2 plus NUL
    char * p = a + 128;
    *p = 0;
    do
    {
        int digit = fmod(value, base);
        value /= base;
        printf("interm: %f\n", value);
        *--p = hex[digit];
    } while (value >= 1);
    printf("result: %s\n", p);
}

print_integral_part(3.40282347e+38F);

Run it to see how badly the value /= base operation distorts your value:

input : 340282346638528859811704183484516925440.000000
interm: 34028234663852885981170418348451692544.000000
interm: 3402823466385288480057879763104038912.000000
interm: 340282359315034876851393457419190272.000000
interm: 34028234346940236846450271659753472.000000
interm: 3402823335658820218996583884128256.000000
interm: 340282327376181848531187106054144.000000
interm: 34028232737618183051678859657216.000000
interm: 3402823225404785588136713388032.000000
interm: 340282334629736780292710989824.000000
interm: 34028231951816403862828351488.000000
interm: 3402823242405304929106264064.000000
interm: 340282336046446683592065024.000000
interm: 34028232866774907300610048.000000
interm: 3402823378911210969759744.000000
interm: 340282332126513595416576.000000
interm: 34028233212651357863936.000000
interm: 3402823276229139890176.000000
interm: 340282333252413489152.000000
interm: 34028234732616232960.000000
interm: 3402823561222553600.000000
interm: 340282356122255360.000000
interm: 34028235612225536.000000
interm: 3402823561222553.500000
interm: 340282366859673.625000
interm: 34028237357056.000000
interm: 3402823735705.600098
interm: 340282363084.799988
interm: 34028237619.200001
interm: 3402823680.000000
interm: 340282368.000000
interm: 34028236.800000
interm: 3402823.600000
interm: 340282.350000
interm: 34028.234375
interm: 3402.823438
interm: 340.282349
interm: 34.028235
interm: 3.402824
interm: 0.340282
result: 340282368002860660002286082464244022240

When in doubt, throw more printfs at it ;)
