Why is it mandatory to use -ffast-math with g++ to achieve the vectorization of loops using doubles? I don't like -ffast-math because I don't want to lose precision.

Tags: gcc g++ double vectorization fast-math

3 answers

傲

#2 · 2019-04-08 22:02

Very likely because vectorization means that you may have different results, or may mean that you miss floating point signals/exceptions.

If you're compiling for 32-bit x86, gcc and g++ default to the x87 unit for floating-point math; on 64-bit they default to SSE. The x87 can and will produce different values for the same computation, so it's unlikely that g++ will consider vectorizing when it can't guarantee the same results, unless you use -ffast-math or some of the flags it turns on.

Basically, it comes down to this: the floating-point environment for vectorized code may not be the same as the one for non-vectorized code, sometimes in ways that are important. If the differences don't matter to you, you could try something like

-fno-math-errno -fno-trapping-math -fno-signaling-nans -fno-rounding-math

but first look up those options and make sure that they won't affect your program's correctness. -ffinite-math-only may also help.
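As a rough illustration of why one of these flags can matter, consider a loop that calls std::sqrt. By default GCC has to assume that sqrt may set errno for negative inputs, which can keep the loop scalar; -fno-math-errno removes that obstacle. The function name sqrt_all, the file name, and the compile commands in the comments are illustrative assumptions, not part of the answer above, and whether the loop is actually vectorized depends on the GCC version and target.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical example loop (the name sqrt_all is made up for illustration).
//
// Try compiling it both ways and compare the vectorizer report:
//   g++ -O3 -fopt-info-vec -c sqrt_all.cpp
//   g++ -O3 -fno-math-errno -fopt-info-vec -c sqrt_all.cpp
//
// Assumption: with errno handling enabled, the compiler may leave this loop
// scalar because each sqrt call might need to set errno.
void sqrt_all(const double* in, double* out, std::size_t n)
{
    assert(n == 0 || (in != nullptr && out != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::sqrt(in[i]);
    }
}
```

The numeric results are the same with or without the flag for non-negative inputs; only the errno side effect is given up.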


手持菜刀，她持情操

#3 · 2019-04-08 22:07

You don’t necessarily lose precision with -ffast-math. It mainly affects the handling of NaN, Inf, etc. and the order in which operations are performed, although a different order can change how intermediate results are rounded.
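To make the NaN point concrete: -ffast-math implies -ffinite-math-only, which lets GCC assume that NaN and Inf never occur. A check like the one below behaves as expected in a default build, but under that assumption the compiler is allowed to treat it as always false. The helper name contains_nan is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: reports whether any element is NaN.
// In a default build this works as written. With -ffinite-math-only (part of
// -ffast-math) the compiler may assume NaNs never occur, so this check is not
// reliable in such builds.
bool contains_nan(const double* data, std::size_t n)
{
    assert(n == 0 || data != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(data[i])) {
            return true;
        }
    }
    return false;
}
```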

If you have a specific piece of code where you do not want GCC to reorder or simplify computations, you can mark variables as being used using an asm statement.

For instance, the following code performs a rounding operation on f. However, with -ffast-math the f += g and f -= g operations are likely to be optimised away by gcc, because algebraically they cancel:

static double moo(double f, double g)
{
    g *= 4503599627370496.0; // 2 ** 52
    f += g;
    f -= g;
    return f;
}

On x86_64, you can use this asm statement to instruct GCC not to perform that optimisation:

static double moo(double f, double g)
{
    g *= 4503599627370496.0; // 2 ** 52
    f += g;
    __asm__("" : "+x" (f));
    f -= g;
    return f;
}

You will need to adapt this for each architecture, unfortunately. On PowerPC, use +f instead of +x.
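A sketch of how the barrier could be wrapped so that it builds on more than one target. The function name round_to_multiple and the volatile fallback are assumptions added for illustration; only the x86_64 asm line comes from the answer above.

```cpp
#include <cassert>

// Rounds f to a multiple of g by adding and then subtracting g * 2^52.
// Assumptions: g is a positive power of two, and f is non-negative and small
// compared with g * 2^52.
double round_to_multiple(double f, double g)
{
    assert(g > 0.0 && f >= 0.0);
    g *= 4503599627370496.0; // 2^52
    f += g;
#if defined(__GNUC__) && defined(__x86_64__)
    // Same barrier as in the answer: f is kept in an SSE register ("x") and
    // treated as modified, so the add and subtract cannot be cancelled.
    __asm__("" : "+x"(f));
#else
    // Fallback (not from the answer): a round trip through a volatile
    // variable forces the intermediate value to be materialised.
    volatile double barrier = f;
    f = barrier;
#endif
    f -= g;
    return f;
}
```

With the default round-to-nearest-even mode, round_to_multiple(2.5, 1.0) gives 2.0, round_to_multiple(3.7, 1.0) gives 4.0, and round_to_multiple(1.3, 0.25) gives 1.25.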


做自己的国王

#4 · 2019-04-08 22:07

Because `-ffast-math` enables operand reordering, which allows much more code to be vectorized.

For example to calculate this

sum = a[0] + a[1] + a[2] + a[3] + a[4] + a[5] + … a[99]

the compiler is required to do the additions sequentially without -ffast-math, because floating-point addition is not associative: grouping the operands differently can change the rounded result.

That's the same reason why compilers can't optimize a*a*a*a*a*a to (a*a*a)*(a*a*a) without -ffast-math.

That means no vectorization is available unless you have very efficient horizontal vector adds.
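The a*a*a*a*a*a case can be written out as two functions to make the regrouping explicit. This is only a sketch with made-up names (pow6_sequential, pow6_grouped, check_exact_inputs). The two forms agree whenever every intermediate product is exactly representable, but for general inputs they are rounded at different points and may differ in the last bits, which is why the compiler will not swap one for the other by default.

```cpp
#include <cassert>

// Six-fold product evaluated strictly left to right, as the expression
// a*a*a*a*a*a is parsed: ((((a*a)*a)*a)*a)*a, five multiplications.
double pow6_sequential(double a)
{
    return a * a * a * a * a * a;
}

// Regrouped form: compute a*a*a once and square it, three multiplications.
double pow6_grouped(double a)
{
    const double cube = a * a * a;
    return cube * cube;
}

// Sanity check for inputs where no rounding happens at any step.
void check_exact_inputs()
{
    assert(pow6_sequential(2.0) == pow6_grouped(2.0));
    assert(pow6_sequential(3.0) == 729.0);
}
```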

However, if -ffast-math is enabled, the expression can be calculated like this (see A7. Auto-Vectorization in the article cited below):

sum0 = a[0] + a[4] + a[ 8] + … a[96]
sum1 = a[1] + a[5] + a[ 9] + … a[97]
sum2 = a[2] + a[6] + a[10] + … a[98]
sum3 = a[3] + a[7] + a[11] + … a[99]
sum’ = sum0 + sum1 + sum2 + sum3

Now the compiler can vectorize it easily by adding each column in parallel and then doing a horizontal add at the end.

Does sum’ == sum? Only if (a[0]+a[4]+…) + (a[1]+a[5]+…) + (a[2]+a[6]+…) + (a[3]+a[7]+…) == a[0] + a[1] + a[2] + … This holds under associativity, which floats don’t adhere to, all of the time. Specifying /fp:fast lets the compiler transform your code to run faster – up to 4 times faster, for this simple calculation.

Do You Prefer Fast or Precise? - A7. Auto-Vectorization

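In GCC, this reordering can be enabled on its own with the -fassociative-math flag (which only takes effect together with -fno-signed-zeros and -fno-trapping-math), without switching on the rest of -ffast-math.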
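The sequential and four-column sums described above can be written out by hand to see that they are not interchangeable. This is a sketch with hypothetical function names; it mimics in scalar code what a vectorizer would do with four lanes, and assumes the array length is a multiple of four to stay short.

```cpp
#include <cassert>
#include <cstddef>

// Strictly sequential sum: ((a[0] + a[1]) + a[2]) + ...
double sum_sequential(const double* a, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i];
    }
    return sum;
}

// Four independent partial sums (sum0..sum3 in the text), combined at the end.
double sum_four_lanes(const double* a, std::size_t n)
{
    assert(n % 4 == 0); // keeps the sketch short: no tail handling
    double lane[4] = {0.0, 0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < n; i += 4) {
        for (std::size_t k = 0; k < 4; ++k) {
            lane[k] += a[i + k];
        }
    }
    return lane[0] + lane[1] + lane[2] + lane[3];
}
```

For the input {2^53, 1, 1, 1, -2^53, 1, 1, 1}, built without -ffast-math, the sequential version returns 3 because each 1 added to 2^53 is rounded away, while the four-lane version returns 6 because 2^53 and -2^53 cancel within one lane first. The reordered result happens to be the more accurate one here, but the point is that it is different, so the compiler may not make this change without permission.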

Auto vectorization on double and ffast-math
