division as multiply and LUT ? / fast float divisi

Is it possible to make a reciprocal of float division in form of look up table (such like 1/f -> 1*inv[f] ) ? How it could be done? I think some and mask and shift should be appled to float to make it a form of index? How would be it exectly?

标签： c++ c optimization fpu

3条回答

我只想做你的唯一

2楼-- · 2019-05-27 11:01

One method is:

Extract the sign, exponent and mantissa from the input
Use some of the most significant mantissa bits to look up its reciprocal in a table
Negate the exponent, and adjust for the change of scale of the mantissa
Recombine the sign, exponent and mantissa to form the output

In step 2, you'll need to choose the number of bits to use, trading between accuracy and table size. You could obtain more accuracy by using the less significant bits to interpolate between table entries.

In step 3, the adjustment is necessary because the input mantissa was in the range (0.5, 1.0], and so its reciprocal is in the range [1.0, 2.0), which needs renormalising to give the output mantissa.

I won't try to write the code for this, since there are probably some slightly fiddly edge cases that I'd miss.

You should also investigate methods involving numerical calculations, which might give better results if memory access is slow; on a modern PC architecture, a cache miss might be as expensive as dozens of arithmetic operations. Wikipedia looks like a good starting point. And of course, whatever you do, measure it to make sure it is actually faster than an FPU division operation.

0人赞添加讨论(0) 举报

\"骚年 ilove

3楼-- · 2019-05-27 11:02

If your minimum step is something like 0.01 then you can support inverse-f from a table. Each index is multiplied by 100 so you can have

table[1]----->1.0/0.01
table[3]----->1.0/0.03
table[105]--->1.0/1.05
...
table[10000]->1.0/100.0


10000 elements for a range of (0.00,100.00)

If you want better precision, you will need more ram.

Another example:

range................: 0.000 - 1000.000
minimum increments ..: 0.001
total element number.: 1 million

something like this: table[2343]=1.0/2.343

Another example:

range................: 0.000000 - 1.000000
minimum increments ..: 0.000001
total element number.: 1 million

something like this: table[999999]=1.0/0.999999

0人赞添加讨论(0) 举报

家丑人穷心不美

4楼-- · 2019-05-27 11:06

You can guess an approximate inverse like this:

int x = reinterpret_cast<int>(f);
x = 0x7EEEEEEE - x;
float inv = reinterpret_cast<float>(x);

In my tests, 0x7EF19D07 was slightly better (tested with the effects of 2 Newton-Raphson refinements included).

Which you can then improve with Newton-Raphson:

inv = inv * (2 - inv * f);

Iterate as often as you want. 2 or 3 iterations give nice results.

Better Initial Approximations

To minimize the relative error:

0x7EF311C2 (without refinement)
0x7EF311C3 (1 refinement)
0x7EF312AC (2 refinements)
0x7EEEEBB3 (3 refinements)

To minimize the absolute error for inputs between 1 and 2 (they work well enough outside that range, but they may not be the best):

0x7EF504F3 (without refinement)
0x7EF40D2F (1 refinement)
0x7EF39252 (2 refinements)

For three refinement steps, the initial approximation barely affects the maximum relative error. 0x7EEEEEEE works great, and I couldn't find anything better.

0人赞添加讨论(0) 举报

division as multiply and LUT ? / fast float divisi

Better Initial Approximations

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间