Constant floats with SIMD

Published 2019-04-07 03:23

Question:

I've been trying my hand at optimising some code using Microsoft's SSE intrinsics. One of the biggest problems is the LHS (load-hit-store) that happens whenever I want to use a constant. There seems to be some info on generating certain constants (here and here - section 13.4), but it's all assembly (which I would rather avoid).

The problem is that when I try to implement the same thing with intrinsics, MSVC complains about incompatible types. Does anyone know of any equivalent tricks using intrinsics?

Example - Generate {1.0,1.0,1.0,1.0}

//pcmpeqw xmm0,xmm0 
__m128 t = _mm_cmpeq_epi16( t, t );

//pslld xmm0,25 
_mm_slli_epi32(t, 25);

//psrld xmm0,2
return _mm_srli_epi32(t, 2);

This generates a bunch of errors about incompatible types (__m128 vs __m128i). I'm pretty new to this, so I'm pretty sure I'm missing something obvious. Can anyone help?

TL;DR - How do I generate an __m128 vector filled with constant single-precision floats using MS intrinsics?

Thanks for reading :)

Answer 1:

Simply cast __m128i to __m128 using _mm_castsi128_ps. Also, the second line discards its result; it should be

t = _mm_slli_epi32(t, 25);
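
Putting both fixes together, a minimal sketch of the whole routine might look like the following. The names splat_one_ps and splat_one_selfcheck are made up for illustration, and starting from _mm_setzero_si128() is an assumption made here so that the C code never reads an uninitialised variable (the assembly version simply compares a register with itself).

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 integer intrinsics and the cast intrinsics */

/* Hypothetical helper: builds {1.0f, 1.0f, 1.0f, 1.0f} without a memory load. */
static __m128 splat_one_ps(void)
{
    /* Assumption: start from a defined value instead of an uninitialised t. */
    __m128i t = _mm_setzero_si128();

    t = _mm_cmpeq_epi16(t, t);   /* pcmpeqw: every bit set, 0xFFFFFFFF per lane */
    t = _mm_slli_epi32(t, 25);   /* pslld 25: 0xFE000000 per lane */
    t = _mm_srli_epi32(t, 2);    /* psrld 2:  0x3F800000, the bit pattern of 1.0f */

    /* Reinterpret the bits as floats; the cast itself generates no instruction. */
    return _mm_castsi128_ps(t);
}

/* Hypothetical self-check: all four lanes should compare equal to 1.0f. */
static void splat_one_selfcheck(void)
{
    float out[4];
    _mm_storeu_ps(out, splat_one_ps());
    for (int i = 0; i < 4; ++i)
        assert(out[i] == 1.0f);
}
```

The key point is that t stays an __m128i through all the integer operations and is only converted to __m128 at the very end.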


Answer 2:

Try _mm_set_ps, _mm_set_ps1 or _mm_set1_ps.
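
As a rough sketch of how these are used (the function name set_ps_demo is hypothetical): _mm_set1_ps and _mm_set_ps1 broadcast one value to all four lanes, while _mm_set_ps takes four values listed from the highest lane down to the lowest.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE: _mm_set1_ps, _mm_set_ps, _mm_storeu_ps */

/* Hypothetical demo of the set intrinsics mentioned above. */
static void set_ps_demo(void)
{
    float out[4];

    /* Broadcast one value to all four lanes (_mm_set_ps1 is the same operation). */
    __m128 ones = _mm_set1_ps(1.0f);
    _mm_storeu_ps(out, ones);
    assert(out[0] == 1.0f && out[1] == 1.0f && out[2] == 1.0f && out[3] == 1.0f);

    /* _mm_set_ps takes its arguments from the highest lane down to the lowest,
       so the last argument ends up in out[0]. */
    __m128 ramp = _mm_set_ps(3.0f, 2.0f, 1.0f, 0.0f);
    _mm_storeu_ps(out, ramp);
    assert(out[0] == 0.0f && out[3] == 3.0f);
}
```

How the compiler materialises such a constant is up to the compiler, so it is worth checking the generated assembly to see whether this avoids the stall described in the question.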