Can I temporarily enable FTZ and DAZ floating-poin

Posted 2019-02-18 23:57

Question:

I'd like to temporarily enable FTZ/DAZ modes to get a performance gain in some code where strict compliance with the IEEE 754 standard is not an issue, without changing the behaviour of other threads, which may be executing code where that compliance is important.

I've been reading about how to enable and disable these modes and about the performance impact of denormal handling, but unfortunately I have mixed code in a multithreaded environment, so I cannot enable these modes once and for all.

My understanding is that, since the flags in the MXCSR register determine the behaviour of the hardware and every thread has its own register context, setting these flags will only affect the behaviour of the current thread.

Is that correct?

Answer 1:

Yes, MXCSR is part of the per-thread architectural state saved/restored by context switches, along with the xmm/ymm/zmm and x87 stack registers (using xsave/xrstor). Different threads have their own FPU state.
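Since the flags live in the calling thread's own MXCSR, a save/set/restore pattern around the hot code should be enough. Here is a minimal sketch in C, assuming x86-64 (where float math uses SSE), a compiler that provides `<xmmintrin.h>`, a CPU that supports the DAZ bit, and a build without -ffast-math. The helper names `ftz_daz_enable`, `ftz_daz_restore`, `half_of_min_normal` and `scoped_ftz_daz_demo` are made up for illustration; only `_mm_getcsr` and `_mm_setcsr` are real intrinsics.

```c
#include <assert.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

/* MXCSR control bits. */
#define MXCSR_DAZ 0x0040u  /* bit 6:  denormals-are-zero (inputs)  */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush-to-zero (results)      */

/* Hypothetical helper: enable FTZ and DAZ for the calling thread only
   and return the previous MXCSR value so it can be restored later. */
static unsigned int ftz_daz_enable(void) {
    unsigned int saved = _mm_getcsr();
    _mm_setcsr(saved | MXCSR_FTZ | MXCSR_DAZ);
    return saved;
}

/* Hypothetical helper: put back the MXCSR value saved by ftz_daz_enable(). */
static void ftz_daz_restore(unsigned int saved) {
    _mm_setcsr(saved);
}

/* Halve the smallest normal float at run time. The exact result is
   subnormal; with FTZ enabled the hardware returns zero instead.
   volatile keeps the compiler from folding the multiply at compile time. */
static float half_of_min_normal(void) {
    volatile float x = 1.17549435e-38f;  /* FLT_MIN */
    volatile float half = 0.5f;
    return x * half;
}

/* Usage sketch: assumes FTZ is off on entry (the default state). */
static void scoped_ftz_daz_demo(void) {
    assert(half_of_min_normal() > 0.0f);   /* subnormal result kept */

    unsigned int saved = ftz_daz_enable();
    assert(half_of_min_normal() == 0.0f);  /* result flushed to zero */
    ftz_daz_restore(saved);

    assert(half_of_min_normal() > 0.0f);   /* IEEE behaviour is back */
}
```

Restoring the whole saved register also resets the sticky exception flags in MXCSR to their earlier values; if those flags matter to the surrounding code, toggle only the two control bits instead.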


Interesting idea. I'd always figured DAZ was only useful if you had denormal constants or something (or data from a file), but having other threads running without FTZ is another source of denormals.

You might also want to compile some files with -ffast-math, or a subset of those options. Note that passing -ffast-math to gcc at link time pulls in a CRT startup function that sets DAZ/FTZ before main(), so use it only when compiling those files, not when linking.

The optimizations enabled by fast-math are mostly orthogonal to whether denormals are flushed to zero. Even -fno-math-errno on its own lets more math functions inline (better, or at all), e.g. sqrtf, and is totally safe if you don't need errno to be set in addition to getting a NaN result.