I’m trying to understand the specific AVX512F instruction vcvtps2udq.
The signature of the instruction is VCVTPS2UDQ zmm1 {k1}{z}, zmm2/m512/m32bcst{er}. The manual info is below.
In an attempt to understand the new rounding modes, I assembled the following code snippet with NASM (2.12.02):
vcvtps2udq zmm0,zmm1
vcvtps2udq zmm0,zmm1,{rz-sae}
vcvtps2udq xmm0,xmm1
Disassembling the result with NDISASM (2.12.02) gives the following confusing output:
62F17C4879C1 vcvtps2udq zmm0,zmm1
62F17C7879C1 vcvtps2udq xmm0,xmm1
62F17C0879C1 vcvtps2udq xmm0,xmm1
Question: the second line is disassembled with xmm registers instead of the zmm registers I would have expected. Does the round-toward-zero mode (rz-sae) have something to do with it? Or is NDISASM simply wrong, unable to distinguish between the encodings 62F17C7879C1 and 62F17C0879C1?
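To make the difference between the three encodings concrete, here is a minimal Python sketch that splits the fourth byte of each encoding into bit fields. It assumes the EVEX prefix layout as I read it in the Intel manual (that byte is z, L'L, b, V', aaa from high bit to low), and that with b set on a register-register form the L'L bits select the rounding mode rather than the vector length. The function names decode_evex_p2 and describe are made up for illustration, and the sketch only handles six-byte encodings like the ones above.

```python
# Hypothetical helper names; field layout assumed from the Intel SDM's
# description of the EVEX prefix (bytes: 62, P0, P1, P2, opcode, ModRM).
ROUNDING = {0: "rn-sae", 1: "rd-sae", 2: "ru-sae", 3: "rz-sae"}
VECTOR_BITS = {0: 128, 1: 256, 2: 512}


def decode_evex_p2(encoding_hex):
    """Split the fourth EVEX byte (P2) into z, L'L, b, V', aaa."""
    raw = bytes.fromhex(encoding_hex)
    if len(raw) != 6 or raw[0] != 0x62:
        raise ValueError("expected a six-byte EVEX encoding starting with 62")
    p2 = raw[3]
    modrm = raw[5]
    return {
        "z": (p2 >> 7) & 1,      # zeroing-masking
        "LL": (p2 >> 5) & 3,     # vector length OR rounding control
        "b": (p2 >> 4) & 1,      # broadcast / embedded rounding / SAE
        "V": (p2 >> 3) & 1,      # inverted high bit of the vvvv register
        "aaa": p2 & 7,           # opmask register k0..k7
        "reg_form": (modrm >> 6) == 3,  # ModRM.mod == 11b: register operand
    }


def describe(encoding_hex):
    """Summarise how the L'L bits should be interpreted for this encoding."""
    f = decode_evex_p2(encoding_hex)
    if f["b"] and f["reg_form"]:
        # Embedded rounding: length is implied to be 512 bits.
        return "512-bit, embedded rounding {%s}" % ROUNDING[f["LL"]]
    bits = VECTOR_BITS.get(f["LL"])
    if bits is None:
        return "reserved vector length"
    return "%d-bit, rounding from MXCSR" % bits


for enc in ("62F17C4879C1", "62F17C7879C1", "62F17C0879C1"):
    print(enc, "->", describe(enc))
```

Under those assumptions the second and third encodings are distinguishable: they differ in the b bit and the L'L bits, so a disassembler that prints both as xmm would appear to be ignoring the b bit.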
The Intel instruction set reference manual has the following description:
Converts sixteen packed single-precision floating-point values in the source operand to sixteen unsigned doubleword integers in the destination operand.
When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits. If a converted result cannot be represented in the destination format, the floating-point invalid exception is raised, and if this exception is masked, the integer value 2^w – 1 is returned, where w represents the number of bits in the destination format.
The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.