The mulx instruction was introduced with the BMI2 instruction set, starting with the Haswell processor.
According to Intel's documentation there should be an intrinsic for mulx:
unsigned __int64 umul128(unsigned __int64 a, unsigned __int64 b, unsigned __int64 * hi);
However, I find no such intrinsic in Intel's online intrinsics guide, either under BMI2 or in general. I do, however, find the addcarry intrinsics from the ADX instruction set.
According to this link the intrinsic is mulx_u64, but I don't find that one either.
MSVC added a _umul128 intrinsic in MSVC 2005, but that only produces mul and not mulx (and I have no idea how to enable BMI2 in MSVC).
I can produce the mulx instruction indirectly using __int128 in GCC with -mbmi2 (or -march=haswell), but I would prefer to do this more directly using an intrinsic.
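For reference, the indirect __int128 route looks roughly like the following. This is only a minimal sketch: the helper name mul64x64 is made up for illustration, and it assumes a 64-bit target where GCC (or Clang) provides the unsigned __int128 extension.

```c
#include <stdint.h>

/* Hypothetical helper (name chosen for illustration only):
   full 64x64 -> 128-bit unsigned multiply.
   Assumes the GCC/Clang unsigned __int128 extension on a 64-bit target. */
static inline uint64_t mul64x64(uint64_t a, uint64_t b, uint64_t *hi)
{
    /* Widen one operand so the multiplication is done in 128 bits. */
    unsigned __int128 product = (unsigned __int128)a * b;

    *hi = (uint64_t)(product >> 64);  /* upper 64 bits of the product */
    return (uint64_t)product;         /* lower 64 bits of the product */
}
```

Built with -mbmi2 (or -march=haswell) and optimization enabled, GCC can pick mulx for the multiply here, but that is left to the compiler's instruction selection; nothing in the source requests it explicitly.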
Why do the ADX intrinsics exist but not one for mulx as defined in Intel's documentation?