Speed up a short-to-float cast?

Posted 2019-01-24 14:21

I have a short to float cast in C++ that is bottlenecking my code.

The code translates from a hardware device buffer that is natively shorts; the buffer holds the input from a fancy photon counter.

float factor = 1.0f / value;
for (int i = 0; i < W * H; i++) // 25% of time is spent doing this
{
    int value = source[i];           // ushort -> int
    destination[i] = value * factor; // int * float -> float
}

A few details

  1. Value ranges from 0 to 2^16-1; it represents the pixel values of a highly sensitive camera.

  2. I'm on a multicore x86 machine with an i7 processor (an i7 960, which supports SSE 4.1 and 4.2).

  3. Source is aligned to an 8 bit boundary (a requirement of the hardware device)

  4. W*H is always divisible by 8, most of the time W and H are divisible by 8

This makes me sad. Is there anything I can do about it?

I am using Visual Studio 2012.

7 answers
Anthone
#2 · 2019-01-24 15:01

You could try to approximate the expression

float factor = 1.0f/value;

by a fraction numerator/denominator, where both numerator and denominator are ints. This can be done to whatever precision your application needs, for example:

int denominator = 10000;
int numerator = factor * denominator;

Then you can do your computation in integer arithmetic, like

int value = source[i];
destination[i] = (value * numerator) / denominator;

You have to watch out for overflow; you may need to switch to long (or even long long on 64-bit systems) for the calculation.
