How to efficiently compare the sign of two floatin

Posted 2019-02-08 22:40

Question:

Given two floating-point numbers, I'm looking for an efficient way to check if they have the same sign, given that if any of the two values is zero (+0.0 or -0.0), they should be considered to have the same sign.

For instance,

  • SameSign(1.0, 2.0) should return true
  • SameSign(-1.0, -2.0) should return true
  • SameSign(-1.0, 2.0) should return false
  • SameSign(0.0, 1.0) should return true
  • SameSign(0.0, -1.0) should return true
  • SameSign(-0.0, 1.0) should return true
  • SameSign(-0.0, -1.0) should return true

A naive but correct implementation of SameSign in C++ would be:

bool SameSign(float a, float b)
{
    if (fabs(a) == 0.0f || fabs(b) == 0.0f)
        return true;

    return (a >= 0.0f) == (b >= 0.0f);
}
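The example list above can be turned into a small table-driven check. This is only a sketch: `SameSignCase` and `CountSpecFailures` are hypothetical names introduced here, and the naive `SameSign` is repeated so that the snippet compiles on its own (it assumes a C++11 compiler).

```cpp
#include <cmath>

// The naive implementation from above, repeated so the sketch is self-contained.
bool SameSign(float a, float b)
{
    if (std::fabs(a) == 0.0f || std::fabs(b) == 0.0f)
        return true;

    return (a >= 0.0f) == (b >= 0.0f);
}

// Hypothetical helper type: one row of the example list.
struct SameSignCase
{
    float a;
    float b;
    bool  expected;
};

// Hypothetical helper: returns how many rows SameSign disagrees with.
int CountSpecFailures()
{
    static const SameSignCase cases[] = {
        {  1.0f,  2.0f, true  },
        { -1.0f, -2.0f, true  },
        { -1.0f,  2.0f, false },
        {  0.0f,  1.0f, true  },
        {  0.0f, -1.0f, true  },
        { -0.0f,  1.0f, true  },
        { -0.0f, -1.0f, true  },
    };

    int failures = 0;
    for (const SameSignCase& c : cases)
        if (SameSign(c.a, c.b) != c.expected)
            ++failures;
    return failures;
}
```

Any candidate replacement for `SameSign` can be dropped into the same table to confirm that it still honours the zero rule.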

Assuming the IEEE floating-point model, here's a variant of SameSign that compiles to branchless code (at least with Visual C++ 2008):

bool SameSign(float a, float b)
{
    int ia = binary_cast<int>(a);
    int ib = binary_cast<int>(b);

    int az = (ia & 0x7FFFFFFF) == 0;
    int bz = (ib & 0x7FFFFFFF) == 0;
    int ab = (ia ^ ib) >= 0;

    return (az | bz | ab) != 0;
}

with binary_cast defined as follows:

template <typename Target, typename Source>
inline Target binary_cast(Source s)
{
    union
    {
        Source  m_source;
        Target  m_target;
    } u;
    u.m_source = s;
    return u.m_target;
}
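A side note on the union trick: reading a union member other than the one last written is formally undefined behaviour in C++, even though the major compilers accept it. A hedged alternative is to copy the bytes with `std::memcpy`, which optimizing compilers typically reduce to a plain register move. In the sketch below, `FloatBits` and `SameSignBits` are hypothetical names, and a 32-bit IEEE-754 `float` is assumed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a float.
// Assumes float is a 32-bit IEEE-754 binary32 value.
inline std::uint32_t FloatBits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Same logic as the bit-magic variant above, on unsigned integers.
inline bool SameSignBits(float a, float b)
{
    const std::uint32_t ia = FloatBits(a);
    const std::uint32_t ib = FloatBits(b);

    const bool az = (ia & 0x7FFFFFFFu) == 0;          // a is +0.0 or -0.0
    const bool bz = (ib & 0x7FFFFFFFu) == 0;          // b is +0.0 or -0.0
    const bool ab = ((ia ^ ib) & 0x80000000u) == 0;   // sign bits are equal

    return az | bz | ab;   // bitwise OR keeps the expression branch-free
}
```

Using unsigned integers also avoids relying on the sign of `ia ^ ib` when stored in a signed `int`. With C++20, `std::bit_cast` from `<bit>` could replace the `memcpy` helper.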

I'm looking for two things:

  1. A faster, more efficient implementation of SameSign, using bit tricks, FPU tricks or even SSE intrinsics.

  2. An efficient extension of SameSign to three values.

Edit:

I've made some performance measurements on the three variants of SameSign (the two variants described in the original question, plus Stephen's one). Each function was run 200-400 times, on all consecutive pairs of values in an array of 101 floats filled at random with -1.0, -0.0, +0.0 and +1.0. Each measurement was repeated 2000 times and the minimum time was kept (to weed out cache effects and system-induced slowdowns). The code was compiled with Visual C++ 2008 SP1 with maximum optimization and SSE2 code generation enabled. The measurements were done on a Core 2 Duo P8600 at 2.4 GHz.

Here are the timings, not counting the overhead of fetching input values from the array, calling the function and retrieving the result (which amounts to 6-7 clock ticks):

  • Naive variant: 15 ticks
  • Bit magic variant: 13 ticks
  • Stephen's variant: 6 ticks

Answer 1:

If you don't need to support infinities, you can just use:

inline bool SameSign(float a, float b) {
    return a*b >= 0.0f;
}

which is actually pretty fast on most modern hardware, and is completely portable. It doesn't work properly in the (zero, infinity) case, however, because zero * infinity is NaN and the comparison then returns false regardless of the signs. It will also incur a denormal stall on some hardware when a and b are both tiny.
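To make the infinity caveat concrete, here is a minimal sketch of the product test together with one possible three-value extension. `SameSignMul`, `SameSignMul3` and `kInf` are hypothetical names, and the three-value semantics (a zero matches anything, so all non-zero inputs must share a sign) is an assumption about what the question intends. Default IEEE arithmetic is assumed as well, i.e. no fast-math style flags.

```cpp
#include <limits>

// The product test from the answer, under a distinct (hypothetical) name.
inline bool SameSignMul(float a, float b)
{
    return a * b >= 0.0f;
}

// Assumed three-value semantics: every pair must pass the two-value test.
inline bool SameSignMul3(float a, float b, float c)
{
    return a * b >= 0.0f && b * c >= 0.0f && a * c >= 0.0f;
}

// Positive infinity, used only to illustrate the failure case.
const float kInf = std::numeric_limits<float>::infinity();
```

With these definitions, `SameSignMul(0.0f, kInf)` is false because `0 * inf` is NaN, even though the question's rule says a zero should match anything. One further observation that is not in the original answer: on targets that evaluate the product in single precision, two tiny values of opposite sign can underflow to -0.0, which also compares `>= 0.0f`, so inputs of that size deserve separate testing.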



Answer 2:

Perhaps something like:

inline bool same_sign(float a, float b) {
    return copysignf(a,b) == a;
}

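See the man page for copysign for more details on what it does (you may also want to check how it treats -0 versus +0).

Or possibly this, if you have the C99 functions (note that C99 itself only defines the type-generic macro signbit; signbitf is a non-standard spelling and may not exist on your platform):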

inline bool same_sign(float a, float b) {
    return signbitf(a) == signbitf(b);
}

As a side note, on gcc at least, both copysign and signbit are builtin functions, so they should be fast. If you want to make sure the builtin version is being used, you can call __builtin_signbitf(a).
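For comparison with the zero rule in the question, here is a hedged sketch of both ideas written with the C++11 `<cmath>` overloads `std::copysign` and `std::signbit`. The function names are hypothetical, and `SameSignZeroTolerant` is an addition that is not part of this answer: it puts the "zero matches anything" rule in front of the sign-bit comparison.

```cpp
#include <cmath>

// copysign idea: true when giving a the sign of b leaves a unchanged.
inline bool SameSignCopysign(float a, float b)
{
    return std::copysign(a, b) == a;
}

// signbit idea: compares the raw sign bits, so -0.0 counts as negative.
inline bool SameSignSignbit(float a, float b)
{
    return std::signbit(a) == std::signbit(b);
}

// Hypothetical variant: zeros of either sign match anything.
inline bool SameSignZeroTolerant(float a, float b)
{
    return a == 0.0f || b == 0.0f || std::signbit(a) == std::signbit(b);
}
```

As written, `SameSignCopysign(1.0f, -0.0f)` and `SameSignSignbit(0.0f, -1.0f)` both return false, whereas the question wants true in both cases. The zero-tolerant variant follows the question's rule, at the cost of two extra comparisons that the compiler may or may not turn into branches.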

EDIT: This should also be easy to extend to the three-value case (actually, both of these should):

inline bool same_sign(float a, float b, float c) {
    return copysignf(a,b) == a && copysignf(a,c) == a;
}

// trust the compiler to do common sub-expression elimination
inline bool same_sign(float a, float b, float c) {
    return signbitf(a) == signbitf(b) && signbitf(a) == signbitf(c);
}

// the man pages do not guarantee that signbit returns exactly 1 for negative
// values; if it does, this version works and has no branches
inline bool same_sign(float a, float b, float c) {
    int s = signbitf(a) + signbitf(b) + signbitf(c);
    return !s || s==3;
}


Answer 3:

A small note on signbit: the macro returns an int, and the man page states that "It returns a nonzero value if the value of x has its sign bit set." This means that Spudd86's same_sign() is not guaranteed to work if signbit returns two different non-zero ints for two different negative values.

Casting to bool first ensures a correct return value:

inline bool same_sign(float a, float b) {
    return (bool)signbitf(a) == (bool)signbitf(b);
}
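The same normalization makes the three-value sum trick from the previous answer safe. In C++11 and later, `std::signbit` from `<cmath>` returns a `bool`, so each term is exactly 0 or 1. A minimal sketch, with `SameSign3` as a hypothetical name:

```cpp
#include <cmath>

// Counts how many of the three sign bits are set.
// All three inputs share a sign bit when the count is 0 or 3.
inline bool SameSign3(float a, float b, float c)
{
    const int s = std::signbit(a) + std::signbit(b) + std::signbit(c);
    return s == 0 || s == 3;
}
```

Note that this compares raw sign bits only, so -0.0 counts as negative and +0.0 as positive; it does not implement the question's rule that a zero matches any sign.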