Constant floats with SIMD

Posted 2019-04-07 02:56

I've been trying my hand at optimising some code using Microsoft's SSE intrinsics. One of the biggest problems when optimising my code is the LHS (load-hit-store) that happens whenever I want to use a constant. There seems to be some info on generating certain constants (here and here - section 13.4), but it's all assembly (which I would rather avoid).

The problem is that when I try to implement the same thing with intrinsics, MSVC complains about incompatible types etc. Does anyone know of any equivalent tricks using intrinsics?

Example - Generate {1.0,1.0,1.0,1.0}

//pcmpeqw xmm0,xmm0 
__m128 t = _mm_cmpeq_epi16( t, t );

//pslld xmm0,25 
_mm_slli_epi32(t, 25);

//psrld xmm0,2
return _mm_srli_epi32(t, 2);

This generates a bunch of errors about incompatible types (__m128 vs __m128i). I'm pretty new to this, so I'm pretty sure I'm missing something obvious. Can anyone help?

tl;dr - How do I generate an __m128 vector filled with single-precision constant floats with MS intrinsics?

Thanks for reading :)

Tags: c++ optimization sse simd

2 Answers

戒情不戒烟

#2 · 2019-04-07 03:24

Try _mm_set_ps, _mm_set_ps1 or _mm_set1_ps.
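For what it's worth, here is a minimal sketch of how those set intrinsics might be used. The function names splat_one, ramp and lane are made up for illustration and are not part of any library. Whether the compiler turns the constant into a memory load or something cheaper depends on the compiler and its settings.

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_set1_ps, _mm_set_ps, _mm_storeu_ps

// Hypothetical helper: broadcast one float to all four lanes.
inline __m128 splat_one() {
    return _mm_set1_ps(1.0f);
}

// Hypothetical helper: set each lane individually.
// _mm_set_ps takes its arguments from the highest lane down to the lowest,
// so lane 0 ends up holding 1.0f here.
inline __m128 ramp() {
    return _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
}

// Hypothetical helper: read lane i by storing the vector to memory.
inline float lane(__m128 v, int i) {
    float out[4];
    _mm_storeu_ps(out, v);
    return out[i];
}
```

_mm_set_ps1 behaves the same as _mm_set1_ps, so either spelling should work for the broadcast case.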


SAY GOODBYE

#3 · 2019-04-07 03:33

Keep the intermediate value as an __m128i and cast it to __m128 at the end using _mm_castsi128_ps. Also, the second statement discards its result; it should be

t = _mm_slli_epi32(t, 25)
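Putting both fixes together, the example from the question might look roughly like the sketch below. The function name ones_from_bits is made up for illustration. It also assumes that t is zero-initialised first, since comparing an uninitialised variable with itself (as in the question) reads an indeterminate value; compilers typically fold this away, but that is not guaranteed.

```cpp
#include <emmintrin.h>  // SSE2: __m128i, integer compare/shift, _mm_castsi128_ps

// Hypothetical helper: build {1.0f, 1.0f, 1.0f, 1.0f} without loading a constant.
inline __m128 ones_from_bits() {
    __m128i t = _mm_setzero_si128();  // assumption: start from a defined value
    t = _mm_cmpeq_epi16(t, t);        // pcmpeqw: every bit set (0xFFFFFFFF per lane)
    t = _mm_slli_epi32(t, 25);        // pslld 25: 0xFE000000 per lane
    t = _mm_srli_epi32(t, 2);         // psrld 2:  0x3F800000, the bit pattern of 1.0f
    return _mm_castsi128_ps(t);       // reinterpret the bits as floats, no conversion
}
```

The cast only changes the type the compiler sees; it should not generate any instruction, so the three-instruction sequence from the assembly version is preserved.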
