In the last couple of years, I've been doing a lot of SIMD programming and most of the time I've been relying on compiler intrinsic functions (such as the ones for SSE programming) or on programming assembly to get to the really nifty stuff. However, up until now I've hardly been able to find any programming language with built-in support for SIMD.
Now obviously there are the shader languages such as HLSL, Cg and GLSL that have native support for this kind of stuff. However, I'm looking for something that can at least compile to SSE, not through autovectorization but with built-in support for vector operations. Does such a language exist?
This is an example of (part of) a Cg shader that does a spotlight; in terms of syntax, this is probably the closest to what I'm looking for.
float4 pixelfunction(
    output_vs IN,
    uniform sampler2D texture : TEX0,
    uniform sampler2D normals : TEX1,
    uniform float3 light,
    uniform float3 eye ) : COLOR
{
    float4 color = tex2D( texture, IN.uv );
    // Unpack the normal map sample from [0, 1] to [-1, 1].
    float4 normal = tex2D( normals, IN.uv ) * 2 - 1;
    float3 T = normalize(IN.T);
    float3 B = normalize(IN.B);
    // Combine the sampled normal with the tangent, bitangent and normal basis.
    float3 N =
        normal.b * normalize(IN.normal) +
        normal.r * T +
        normal.g * B;
    float3 V = normalize(eye - IN.pos.xyz);    // direction to the eye
    float3 L = normalize(light - IN.pos.xyz);  // direction to the light
    float3 H = normalize(L + V);               // half vector
    float4 diffuse = color * saturate( dot(N, L) );
    float4 specular = color * pow(saturate(dot(N, H)), 15);
    float falloff = dot(L, normalize(light));
    return pow(falloff, 5) * (diffuse + specular);
}
Stuff that would be a real must in this language is:
- Built-in swizzle operators
- Vector operations (dot, cross, normalize, saturate, reflect et cetera)
- Support for custom data types (structs)
- Dynamic branching would be nice (for loops, if statements)
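To make the wish list concrete, here is a rough scalar C++ sketch of what these operations compute. The `Vec3` struct and the free functions are hypothetical stand-ins written for illustration only (they are not SIMD-backed and not taken from any library); the point is that the language should provide this kind of syntax natively and map it to SSE.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical 3-component vector; illustration only, no SIMD involved.
struct Vec3 {
    float x, y, z;
};

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Scale to unit length (assumes v is not the zero vector).
inline Vec3 normalize(Vec3 v) { return (1.0f / std::sqrt(dot(v, v))) * v; }

// Clamp a scalar to [0, 1], as saturate() does in Cg/HLSL.
inline float saturate(float s) { return std::min(std::max(s, 0.0f), 1.0f); }

// Reflect incident vector i about unit normal n: i - 2 * dot(n, i) * n.
inline Vec3 reflect(Vec3 i, Vec3 n) { return i - (2.0f * dot(n, i)) * n; }

// A swizzle written as a plain function; ideally the language offers v.zyx directly.
inline Vec3 swizzle_zyx(Vec3 v) { return {v.z, v.y, v.x}; }
```

With a shader-like language all of the above would be built in, and something like `swizzle_zyx(v)` would simply be written `v.zyx`.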
Your best bet is probably OpenCL. I know it has mostly been hyped as a way to run code on GPUs, but OpenCL kernels can also be compiled and run on CPUs. OpenCL is basically C with a few restrictions:
- No function pointers
- No recursion
and a bunch of additions. In particular vector types:
float4 x = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
float4 y = (float4)(10.0f, 10.0f, 10.0f, 10.0f);
float4 z = y + x.s3210; // add the vector y to a swizzle of x that reverses the element order
One big caveat is that the code has to be cleanly separable: OpenCL can't call out to arbitrary libraries, etc. But if your compute kernels are reasonably independent, then you basically get a vector-enhanced C where you don't need to use intrinsics.
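To spell out what the vector snippet above computes, here is a plain C++ sketch of the same arithmetic. `Float4`, `reversed`, `add` and `example` are hypothetical names made up for illustration, not OpenCL API; in OpenCL the `.s3210` swizzle and the element-wise `+` are part of the language itself.

```cpp
// Hypothetical stand-in for OpenCL's float4; the members mirror the .s0 to .s3 names.
struct Float4 {
    float s0, s1, s2, s3;
};

// Emulates the x.s3210 swizzle: same elements, reversed order.
Float4 reversed(const Float4& v) { return {v.s3, v.s2, v.s1, v.s0}; }

// Element-wise addition, which OpenCL provides through plain '+'.
Float4 add(const Float4& a, const Float4& b) {
    return {a.s0 + b.s0, a.s1 + b.s1, a.s2 + b.s2, a.s3 + b.s3};
}

// z = y + x.s3210 for the values used in the snippet above.
Float4 example() {
    const Float4 x = {1.0f, 2.0f, 3.0f, 4.0f};
    const Float4 y = {10.0f, 10.0f, 10.0f, 10.0f};
    return add(y, reversed(x));  // (14, 13, 12, 11)
}
```

In the OpenCL version none of these helpers are needed, and on a CPU device the compiler can map the swizzle and the add to a shuffle and a packed add.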
Here is a quick reference/cheatsheet with all of the extensions.
It's not really the language itself, but there is a library for Mono (Mono.Simd) that will expose the vectors to you and optimise the operations on them into SSE whenever possible.
So recently Intel released ISPC, which is exactly what I was looking for when asking this question. It's a language that can link with normal C code, has an implicit execution model, and supports all the features mentioned in the start post (swizzle operators, branching, data structs, vector ops, shader-like syntax). It compiles to SSE2, SSE4, AVX, AVX2, and Xeon Phi vector instructions.
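For anyone unfamiliar with the implicit execution model: you write what looks like scalar code for a single element, and the compiler runs it for a whole group of SIMD lanes at once, with branches turned into per-lane masks. The following is only a rough C++ emulation of that idea, not ISPC code. `kLanes`, `Gang` and `clamp_or_scale` are names I made up for this sketch, and the loop stands in for what would be a handful of vector instructions.

```cpp
#include <array>
#include <cstddef>

// Hypothetical group width; 4 matches the number of floats in one SSE register.
constexpr std::size_t kLanes = 4;
using Gang = std::array<float, kLanes>;

// Conceptually the programmer writes, per element:
//     if (v < 0) v = 0; else v = v * scale;
// Roughly what happens per group: both sides of the branch are computed
// and a per-lane mask selects which result each lane keeps.
Gang clamp_or_scale(const Gang& in, float scale) {
    Gang out{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        const bool mask = in[lane] < 0.0f;         // per-lane condition
        const float if_true = 0.0f;                // "then" side
        const float if_false = in[lane] * scale;   // "else" side
        out[lane] = mask ? if_true : if_false;     // masked select (blend)
    }
    return out;
}
```

The appeal of the language is that you only write the per-element version and never see the mask or the loop.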
It's a library for C++, rather than built into the language, but Eigen is pretty invisible once your variables are declared.
Currently the best solution is to do it myself by creating a back-end for the open-source Cg frontend that Nvidia released, but I'd like to save myself the effort so I'm curious if it's been done before. Preferably I'd start using it right away.
The D programming language also provides access to SIMD, in a similar way to Mono.Simd.
That would be Fortran that you are looking for. If memory serves even the open-source compilers (g95, gfortran) will take advantage of SSE if it's implemented on your hardware.