So far I have managed to find out that:
- SSE and SSE2 are mandatory for Windows 8 and later (and of course for any 64-bit OS)
- AVX is only supported by Windows 7 SP1 or later
Are there any caveats regarding using SSE3, SSSE3, SSE4.1, SSE4.2, AVX2 and AVX-512 on Windows?
Some clarification: I need this to determine which OSes my program will run on if I use instructions from one of the SSE/AVX sets.
Extensions that introduce new architectural state require special OS support, because the OS has to save/restore more data on context switches. So from the OS's perspective, there's nothing extra it needs to do to let user-space code run SSSE3 instructions, if the OS already supports SSE.
SSE, AVX, and AVX512 are the extensions that introduced new architectural state.
You check for CPU support for SSE or AVX the usual way, with the CPUID instruction.
To prevent silent data corruption when using a new extension on a multi-tasking OS that doesn't save/restore the new architectural state on context switches, SSE instructions fault as illegal instructions if the OS hasn't set an OS-support bit in a control register. So vector extensions "don't work" on OSes that don't know about saving/restoring the necessary state for that extension.
For SSE, there may not be any clean OS-independent way to detect that the OS has promised to save/restore SSE state on context switches by setting the CR4.OSFXSR, CR4.OSXMMEXCPT, etc. bits, because even reading a control register is privileged, and there's no CPUID bit that reflects the setting. On Windows, you can use GetEnabledXStateFeatures to check that the OS knows how to save/restore the vector register state. I'm not sure what you need on other OSes; SSE support is so widespread that you'd have to be using a really ancient (or homebrew) OS for this to be a problem.
For AVX, we don't need an OS-specific API to detect that AVX is usable (supported by hardware and enabled by the OS): CPUID includes an OSXSAVE feature bit that is set when the OS has enabled XSAVE, which means user-space code can run XGETBV and check that the OS has enabled the AVX state bits in XCR0.
From Intel's intro to AVX:
It may be easier to call an OS-provided function to detect OS support, instead of using inline asm or a feature-detect library to do all this.
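As a rough sketch of the OS-provided route on Windows, the snippet below asks GetEnabledXStateFeatures whether the AVX (YMM) state component is enabled. The helper name os_saves_avx_state and the -1 "not a Windows build" return value are my own choices for illustration, and the code assumes an SDK that declares the function and XSTATE_MASK_AVX.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper (name chosen for this example).
 * Returns  1 if Windows reports that the AVX (YMM) state is enabled,
 *          0 if it reports that it is not,
 *         -1 if this is not a Windows build or the SDK lacks the mask. */
int os_saves_avx_state(void)
{
#if defined(_WIN32) && defined(XSTATE_MASK_AVX)
    /* Bitmask of XSAVE state components the OS has enabled. */
    DWORD64 enabled = GetEnabledXStateFeatures();
    return (enabled & XSTATE_MASK_AVX) != 0;
#else
    return -1;
#endif
}
```

Linking directly against GetEnabledXStateFeatures means the program will not load on Windows versions that lack it (before Windows 7 SP1), so code that must still start on older systems would probably need to look the function up with GetProcAddress instead.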
AVX-512 works the same way as AVX: check the CPUID feature bit for the instruction set, and check that the OS has promised to manage the new architectural state on context switches by enabling the right bits in XCR0 with XSETBV. (So you should check with XGETBV.)
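The CPUID plus XGETBV checks described above could look roughly like this. It is a sketch, not a complete feature-detection library: the function and macro names are made up for this example, the decision logic is kept in pure functions that take the register values as arguments, and the hardware probe assumes GCC or Clang on x86 (MSVC would use its own intrinsics instead).

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_PROBE 1
#endif

/* CPUID.1:ECX feature bits */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS enabled XSAVE; XGETBV is usable */
#define CPUID1_ECX_AVX     (1u << 28)
/* CPUID.(EAX=7,ECX=0):EBX feature bit */
#define CPUID7_EBX_AVX512F (1u << 16)
/* XCR0 state-component bits */
#define XCR0_SSE    (1u << 1)          /* XMM registers */
#define XCR0_AVX    (1u << 2)          /* upper halves of YMM registers */
#define XCR0_AVX512 (7u << 5)          /* opmask, ZMM_Hi256, Hi16_ZMM */

/* AVX is usable if the CPU has it and the OS saves XMM and YMM state. */
int avx_usable(uint32_t leaf1_ecx, uint64_t xcr0)
{
    const uint64_t need = XCR0_SSE | XCR0_AVX;
    return (leaf1_ecx & CPUID1_ECX_OSXSAVE) != 0 &&
           (leaf1_ecx & CPUID1_ECX_AVX) != 0 &&
           (xcr0 & need) == need;
}

/* AVX-512F additionally needs the three AVX-512 state components. */
int avx512f_usable(uint32_t leaf1_ecx, uint32_t leaf7_ebx, uint64_t xcr0)
{
    const uint64_t need = XCR0_SSE | XCR0_AVX | XCR0_AVX512;
    return (leaf1_ecx & CPUID1_ECX_OSXSAVE) != 0 &&
           (leaf7_ebx & CPUID7_EBX_AVX512F) != 0 &&
           (xcr0 & need) == need;
}

#ifdef HAVE_X86_PROBE
/* Read XCR0. Only call this after confirming the OSXSAVE bit,
 * otherwise XGETBV itself faults. */
uint64_t read_xcr0(void)
{
    uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}

/* Probe the machine this code is running on for usable AVX. */
int cpu_and_os_support_avx(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    if (!(ecx & CPUID1_ECX_OSXSAVE))
        return 0;
    return avx_usable(ecx, read_xcr0());
}
#endif
```

Keeping the bit tests separate from the code that executes CPUID and XGETBV makes the logic easy to check with made-up register values, for example a CPU that reports AVX while XCR0 has only the SSE bit set.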