Is there a fast C or C++ standard library function for double precision inverse square root?

Posted 2020-04-03 16:03

Question:

I find myself typing

double foo=1.0/sqrt(...);

a lot, and I've heard that modern processors have built-in inverse square root opcodes.

Is there a C or C++ standard library inverse square root function that

  1. uses double precision floating point?
  2. is as accurate as 1.0/sqrt(...)?
  3. is just as fast or faster than the result of 1.0/sqrt(...)?

Answer 1:

No. No, there isn't. Not in C++. Nope.



Answer 2:

You can use this function for a faster inverse square root computation. Note that it works in single precision and is only approximate.
There's an article on Wikipedia explaining how it works: https://en.wikipedia.org/wiki/Fast_inverse_square_root
Below is a C version of the algorithm.

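#include <stdint.h>

/* Approximates 1/sqrt(number) for positive, finite single-precision input. */
float invSqrt( float number ){
    /* Reinterpret the float's bits as an integer.
       Union punning is well-defined in C but undefined behavior in standard C++. */
    union {
        float f;
        uint32_t i;
    } conv;

    float x2;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    conv.f  = number;
    conv.i  = 0x5f3759df - ( conv.i >> 1 );                        /* initial guess from the bit pattern */
    conv.f  = conv.f * ( threehalfs - ( x2 * conv.f * conv.f ) );  /* one Newton-Raphson step */
    return conv.f;
}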
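
Since the question asks for double precision, here is a minimal C++ sketch of how the same bit trick might be carried over to double. The function name fast_rsqrt_double, the newton_steps parameter, and the choice of a 64-bit magic constant (one commonly quoted for this trick) are assumptions for illustration, not part of the answer above. It uses std::memcpy instead of a union so that it stays well-defined in C++.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a standard function): approximates 1/sqrt(x)
// for positive, finite, normal doubles.
inline double fast_rsqrt_double(double x, int newton_steps = 4) {
    assert(x > 0.0 && std::isfinite(x));

    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);          // read the bit pattern without aliasing UB
    bits = 0x5FE6EB50C7B537A9ULL - (bits >> 1);   // initial guess, a few percent off
    double y;
    std::memcpy(&y, &bits, sizeof y);

    const double half_x = 0.5 * x;
    for (int i = 0; i < newton_steps; ++i) {
        y = y * (1.5 - half_x * y * y);           // each step roughly squares the relative error
    }
    return y;
}
```

Each Newton-Raphson step roughly squares the relative error, so several steps are needed to get near double precision, and each step costs multiplications. Whether this ends up faster than 1.0 / sqrt(x) on a given machine would need measuring; on current hardware it may well not be.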


Answer 3:

This violates constraints 1 and 2 (and it is not standard either), but it might still help someone browsing through...

I used AsmJit to just-in-time compile the exact assembly instruction you're looking for: RSQRTSS (single precision, admittedly, but it should be similar with double).

My code is this (cf. also my answer in a different post):

   typedef float(*JITFunc)();

   JITFunc func;
   asmjit::JitRuntime jit_runtime;
   asmjit::CodeHolder code;
   code.init(jit_runtime.getCodeInfo());

   asmjit::X86Compiler cc(&code);
   cc.addFunc(asmjit::FuncSignature0<float>());

   float value = 2.71f; // Some example value.
   asmjit::X86Xmm x = cc.newXmm();
   uint32_t *i = reinterpret_cast<uint32_t*>(&value);
   cc.mov(asmjit::x86::eax, i[0]);
   cc.movd(x, asmjit::x86::eax);

   cc.rsqrtss(x, x);   // THE asm function.

   cc.ret(x);

   cc.endFunc();
   cc.finalize();

   jit_runtime.add(&func, &code);

   // Now, func() can be used as the result to rsqrt(value).

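If you do the JIT compilation part only once and call the result later with different values, this should be faster than 1.0/sqrt(...), though slightly less accurate; that loss of accuracy is inherent to the built-in instructions you're talking about. (The snippet above bakes value into the generated code, so for different values the function would have to take the value as an argument instead.)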
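
For comparison, the same instruction can usually be reached without a JIT through the SSE intrinsic _mm_rsqrt_ss. The following is a minimal sketch, assuming an x86 target with SSE; the names rsqrt_estimate, rsqrt_refined and the macro HAVE_SSE_RSQRT are hypothetical, and on other targets the sketch simply falls back to 1.0f / std::sqrt(x). Intel documents a relative error of at most 1.5 * 2^-12 for RSQRTSS, so one Newton-Raphson step is added as an optional refinement.

```cpp
#include <cassert>
#include <cmath>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   // _mm_set_ss, _mm_rsqrt_ss, _mm_cvtss_f32
#define HAVE_SSE_RSQRT 1 // hypothetical macro, used only in this sketch
#endif

// Hardware estimate of 1/sqrt(x) in single precision (about 12 bits of accuracy).
inline float rsqrt_estimate(float x) {
    assert(x > 0.0f);
#ifdef HAVE_SSE_RSQRT
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    return 1.0f / std::sqrt(x);  // portable fallback, no speed-up
#endif
}

// Estimate plus one Newton-Raphson step, which brings the result
// close to full single precision.
inline float rsqrt_refined(float x) {
    const float y = rsqrt_estimate(x);
    return y * (1.5f - 0.5f * x * y * y);
}
```

This still works in single precision only, so it does not satisfy constraints 1 and 2 of the question either.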



Answer 4:

If you're not afraid of using your own functions, try the following:

#include <cmath>

template <typename T>
T invsqrt(T x)
{
    // T(1) keeps the division in T, so a float argument is not promoted to double.
    return T(1) / std::sqrt(x);
}

It should be just as fast as the original 1.0 / std::sqrt(x) with any modern optimizing compiler. It also works with both doubles and floats.



Answer 5:

If you find yourself writing the same thing over and over, you should think to yourself "function!":

double invsqrt(const double x)
{
    return 1.0 / std::sqrt(x);
}

Now the code is more self-documenting: people don't have to deduce 1.0 / std::sqrt(x) is the inverse square root, they read it. Additionally, you now get to plug in whatever implementation you want and each call-site automatically uses the updated definition.

To answer your question: no, there is no C(++) function for it. But now that you've made one, you can substitute your own definition if you find its performance lacking.



Answer 6:

Why not try this? #define INSQRT(x) (1.0/sqrt(x))

It's just as fast, requires less typing (and makes you feel like it's a function), uses double precision, and is as accurate as 1/sqrt(..)



Tags: c++ c double sqrt