Extract fractional part of double *efficiently* in

2019-01-17 11:31发布

I'm looking to take an IEEE double and remove any integer part of it in the most efficient manner possible.

I want

1035 ->0
1045.23->0.23
253e-23=253e-23

I do not care about properly handling denormals, infinities, or NaNs. I do not mind bit twiddling, as I know I am working with IEEE doubles, so it should work across machines.

Branchless code would be much preferred.

My first thought is (in pseudo code)

char exp=d.exponent;
(set the last bit of the exponent to 1)
d<<=exp*(exp>0);
(& mask the last 52 bits of d)
(shift d left until the last bit of the exponent is zero, decrementing exp each time)
d.exponent=exp;

But the problem is that I can't think of an efficient way to shift d left until the last bit of the exponent is zero, plus it seems it would need to output zero if all of the last bits weren't set. This seems to be related to the base 2 logarithm problem.

Help with this algorithm or any better ones would be much appreciated.

I should probably note that the reason I want branchless code is because I want it to efficiently vectorize.

6条回答
贼婆χ
2楼-- · 2019-01-17 12:01

How about something simple?

double fraction = whole - ((long)whole);

This just subtracts the integer portion of the double from the value itself, the remainder should be the fractional component. It's possible, of course, this could have some representation issues.

查看更多
祖国的老花朵
3楼-- · 2019-01-17 12:01

The optimal implementation depends entirely on the target architecture.

On recent Intel processors, this can be achieved with two instructions: roundsd and subsd, but that can't be expressed in portable C code.

On some processors, the fastest way to do this is with integer operations on the floating point representation. Early Atom and many ARM CPUs come to mind.

On some other processors, the fastest thing is to cast to integer and back, then subtract, branching to protect large values.

If you're going to be handling lots of values, you can set the rounding mode to round-to-zero, then add and subtract +/-2^52 to the number truncated to integer, then subtract from the original value to get the fraction. If you don't have SSE4.1, but do have an otherwise modern Intel CPU and want to vectorize, this is typically the best you can do. It only makes sense if you have many values to process, however, because changing the rounding mode is somewhat expensive.

On other architectures, other implementations are optimal. In general, it doesn't make sense to talk about "efficiency" of C programs; only the efficiency of a specific implementation on a specific architecture.

查看更多
不美不萌又怎样
4楼-- · 2019-01-17 12:01

Standard library function modf solves this problem quite neatly.

#include <math.h>
/*...*/
double somenumber;
double integralPart;
double fractionalPart = modf(somenumber, &integralPart);

This should do what you have asked, is portable, and reasonably efficient.

An undocumented detail is whether the second argument could be NULL and then avoid the integral part temporary, which would be desirable in uses such as that you have described.

Unfortunately it seams many implementations do not support NULL for the second argument, so will have to use a temporary whether or not you use this value.

查看更多
做个烂人
5楼-- · 2019-01-17 12:06

Some profiling and experimentation using C++ in Microsoft Visual Studio 2015 indicates that the best method for positive numbers is:

double n;
// ...
double fractional_part = n - floor(n);

It's faster than modf, and, as has already been mentioned, the remainder function rounds to the nearest integer, and therefore isn't of use.

查看更多
Deceive 欺骗
6楼-- · 2019-01-17 12:08
#include <math.h>
double fraction = fmod(d, 1.0);
查看更多
仙女界的扛把子
7楼-- · 2019-01-17 12:23

Proposal

The function remainder computes the remainder, but not the integer part like modf does:

#include <math.h>

double fracpart(double input)
{
    return remainder(input, 1.);
}

This is the most efficient (and portable) way, as it doesn't compute unnecessary values to do the job (cf. modf, (long), fmod, etc.)


Benchmark

As Mattew suggested in the comments, I wrote some benchmark code to compare this solution to all the other ones offered on this page.

Please find below the time measurements for 65536 computations (compiled with Clang with optimizations turned off):

method 1 took 0.002389 seconds (using remainder)
method 2 took 0.000193 seconds (casting to long)
method 3 took 0.000209 seconds (using floor)
method 4 took 0.000257 seconds (using modf)
method 5 took 0.010178 seconds (using fmod)

Again with Clang, this time using the -O3 flag:

method 1 took 0.002222 seconds (using remainder)
method 2 took 0.000000 seconds (casting to long)
method 3 took 0.000000 seconds (using floor)
method 4 took 0.000223 seconds (using modf)
method 5 took 0.010131 seconds (using fmod)

Turns out the simplest solution seems to give the best results on most platforms, and the specific methods to perform that task (fmod, modf, remainder) are actually super-slow!

查看更多
登录 后发表回答