I'm running some C# code that uses System.Numerics.Vector<T>, but as far as I can tell I'm not getting the full benefit of SIMD intrinsics. I'm using Visual Studio Community 2015 with Update 1, and my clrjit.dll is v4.6.1063.1.
I'm running on an Intel Core i5-3337U processor, which implements the AVX instruction set extensions. Therefore, I figure, I should be able to execute most SIMD instructions on a 256-bit register. For example, the disassembly should contain instructions like vmovups, vmovupd, vaddps, etc., and Vector<float>.Count should return 8, Vector<double>.Count should return 4, etc. But that's not what I'm seeing.
Instead, my disassembly contains instructions like movups, movupd, addps, etc., and the following code:
WriteLine($"{Vector<byte>.Count} bytes per operation");
WriteLine($"{Vector<float>.Count} floats per operation");
WriteLine($"{Vector<int>.Count} ints per operation");
WriteLine($"{Vector<double>.Count} doubles per operation");
produces:
16 bytes per operation
4 floats per operation
4 ints per operation
2 doubles per operation
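For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same check. It assumes a console project that references System.Numerics.Vectors. The class name VectorWidthCheck and the helper VectorBits are made up for illustration and are not part of my project. Besides the counts, it prints Vector.IsHardwareAccelerated and whether the process is 64-bit, since those seem like the first things worth ruling out.

```csharp
using System;
using System.Numerics;

// Hypothetical diagnostic class; the names are illustrative only.
static class VectorWidthCheck
{
    // Vector<byte>.Count is the number of bytes in one Vector<T>,
    // so multiplying by 8 gives the register width the JIT is using, in bits.
    public static int VectorBits() => Vector<byte>.Count * 8;

    static void Main()
    {
        // True only when the JIT maps Vector<T> operations onto SIMD instructions.
        Console.WriteLine($"Hardware accelerated: {Vector.IsHardwareAccelerated}");

        // Reports whether the process is running as 64-bit.
        Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}");

        Console.WriteLine($"Vector<T> width: {VectorBits()} bits");
        Console.WriteLine($"{Vector<byte>.Count} bytes per operation");
        Console.WriteLine($"{Vector<float>.Count} floats per operation");
        Console.WriteLine($"{Vector<int>.Count} ints per operation");
        Console.WriteLine($"{Vector<double>.Count} doubles per operation");
    }
}
```

A 128-bit result here corresponds to the SSE-style counts shown above. 256 bits would correspond to the 8 floats and 4 doubles I expected.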
Where am I going wrong? The project, including all of its settings, is available here.