Hi all :)
I'm trying to get the hang of a few concepts regarding floating point, SIMD/math intrinsics and the -ffast-math flag for gcc. More specifically, I'm using MinGW with gcc v4.5.0 on an x86 CPU.
I've searched around for a while now, and this is what I (think I) understand at the moment:
When I compile with no flags, any fp code will be standard x87 with no SIMD instructions, and the math.h functions will be linked from msvcrt.dll.
When I use -mfpmath, -msseN and/or -march so that MMX/SSE/AVX code gets enabled, gcc actually uses SIMD instructions only if I also specify some optimization flags, like -On or -ftree-vectorize. In that case the intrinsics are chosen automagically by gcc, and some math functions (I'm still talking about the standard math functions in math.h) will become intrinsics or be replaced by inline code, while others will still come from msvcrt.dll. If I don't specify optimization flags, does any of this change?
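To make this concrete, here is a minimal sketch of the kind of test I have in mind. The function name add_arrays and the file name test.c are just mine for illustration, and the flag combinations in the comment are only the ones I would try first; I'm assuming -S is the right way to look at what gcc generated.

```c
/* Hypothetical experiment (test.c is an assumed file name):
 *   gcc -S test.c                           -> no optimization, no SIMD flags
 *   gcc -O2 -S test.c                       -> optimization only
 *   gcc -O3 -msse2 -mfpmath=sse -S test.c   -> optimization plus SSE2
 * then compare the three test.s files, looking for x87 instructions
 * (fadd, fld, ...) versus SSE ones (addss, addps, ...). */
#include <stddef.h>

/* Element-wise add: a simple candidate for auto-vectorization.
   __restrict__ is the GCC spelling of C99 restrict and promises that
   the three arrays do not overlap. */
void add_arrays(float *__restrict__ dst, const float *__restrict__ a,
                const float *__restrict__ b, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

If my understanding above is right, only the last command should produce packed SSE instructions for the loop, but that is exactly the part I'd like confirmed.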
When I use specific SIMD data types (those available as gcc extensions, like v4si or v8qi), I have the option to call intrinsic functions directly, or again leave the automagic decision to gcc. Gcc can still choose standard x87 code if I don't enable SIMD instructions via the proper flags. Again, if I don't specify optimization flags, does any of this change?
Please correct me if any of my statements are wrong :p
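Here is a small sketch of the two options as I understand them. The typedef name v4sf and the function names add_ext and add_intrin are my own choices for illustration; the intrinsic version is guarded by __SSE__ because I'm assuming xmmintrin.h refuses to compile when SSE is not enabled.

```c
/* Option 1: GCC vector extension. v4sf is a typedef I define myself:
   four floats packed into one 16-byte vector. */
typedef float v4sf __attribute__((vector_size(16)));

/* The + operator works element-wise on vector types; gcc decides
   which instructions to emit for it. */
v4sf add_ext(v4sf a, v4sf b)
{
    return a + b;
}

/* Option 2: call the SSE intrinsic explicitly. Only compiled when SSE
   is enabled (e.g. via -msse), which gcc signals by defining __SSE__. */
#ifdef __SSE__
#include <xmmintrin.h>

__m128 add_intrin(__m128 a, __m128 b)
{
    return _mm_add_ps(a, b);
}
#endif
```

What I can't tell is whether add_ext ends up as the same addps instruction as add_intrin once SSE is enabled, and what it turns into when it isn't.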
Now the questions:
- Do I ever have to include x86intrin.h to use intrinsics?
- Do I ever have to link libm?
- What does fast-math have to do with anything? I understand it relaxes the IEEE standard, but, specifically, how? Are other standard functions used? Is some other lib linked? Or are just a couple of flags set somewhere so that the standard lib behaves differently?
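For the fast-math question, this is the sort of code I imagine could be affected. It's only a sketch: sum_in_order and built_with_fast_math are names I made up, and I'm going by the gcc docs when I assume that -ffast-math defines the __FAST_MATH__ macro and allows additions to be reordered.

```c
/* Sums the array strictly left to right. Float addition is not
   associative, so the result depends on the order: with strict IEEE
   semantics {1e20f, -1e20f, 1.0f} and {1.0f, 1e20f, -1e20f} give
   different sums even though they hold the same values. */
float sum_in_order(const float *x, int n)
{
    float s = 0.0f;
    int i;
    for (i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Reports whether this file was compiled with -ffast-math, assuming
   gcc defines __FAST_MATH__ in that case. */
int built_with_fast_math(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```

So is fast-math "just" permission for gcc to rearrange a loop like this (which I guess it would need in order to vectorize the sum), or does it also change which library functions get called?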
Thanks to anybody who is going to help :D