I am evaluating OpenCL for my purposes. It occurred to me that you can't assume it works out of the box on either Windows or Mac because:
- Windows needs an OpenCL driver (which, of course, can be installed)
- Mac OS X supports OpenCL only from version 10.6 onward
So I'd have to write the FPU/SSE/AVX code and the OpenCL code separately and produce two binaries: one without and one with OpenCL support.
It would be much better if I could compile the OpenCL code into SSE/AVX at compile time and then ship a binary with no OpenCL dependency in it. That way I wouldn't need a separate FPU/SSE/AVX implementation!
Is there a way to do it?
Cheers,
- clemens
One way to achieve this would be to install the Intel OpenCL SDK on your development machine and use the OpenCL Offline Compiler tool it contains to generate assembly code for your OpenCL kernel, then compile and link that assembly code directly into your application. However, it might take some work to figure out the correct calling conventions etc. needed to call the generated assembly code manually, and you would then be responsible yourself for splitting your work up across multiple threads.
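To give an idea of what "splitting up the work yourself" means, here is a minimal sketch using POSIX threads. Note that scale_kernel_range is only a hypothetical plain-C stand-in for the offline-compiled kernel (the real symbol name and calling convention of the generated assembly are exactly the part you would have to figure out), and run_in_threads and MAX_THREADS are names made up for this example.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

/* Hypothetical stand-in for the offline-compiled kernel:
 * processes the work items in [begin, end). */
static void scale_kernel_range(float *data, size_t begin, size_t end)
{
    for (size_t i = begin; i < end; ++i)
        data[i] *= 2.0f;
}

typedef struct {
    float *data;
    size_t begin;
    size_t end;
} chunk_t;

static void *run_chunk(void *arg)
{
    chunk_t *c = (chunk_t *)arg;
    scale_kernel_range(c->data, c->begin, c->end);
    return NULL;
}

/* Splits n work items into contiguous chunks, one per thread.
 * Returns the number of worker threads that were actually started. */
int run_in_threads(float *data, size_t n, int num_threads)
{
    pthread_t threads[MAX_THREADS];
    chunk_t chunks[MAX_THREADS];
    int started = 0;

    if (num_threads < 1) num_threads = 1;
    if (num_threads > MAX_THREADS) num_threads = MAX_THREADS;

    /* Chunk size, rounded up so that all items are covered. */
    size_t per = (n + (size_t)num_threads - 1) / (size_t)num_threads;

    for (int t = 0; t < num_threads; ++t) {
        size_t begin = (size_t)t * per;
        if (begin > n) begin = n;
        size_t end = begin + per;
        if (end > n) end = n;

        chunks[t].data = data;
        chunks[t].begin = begin;
        chunks[t].end = end;

        if (pthread_create(&threads[t], NULL, run_chunk, &chunks[t]) != 0) {
            /* Could not start a thread: do the rest on the calling thread. */
            scale_kernel_range(data, begin, n);
            break;
        }
        ++started;
    }

    for (int t = 0; t < started; ++t)
        pthread_join(threads[t], NULL);

    return started;
}
```

On Windows you would use the native thread API instead of pthreads; the chunking logic stays the same. This is the bookkeeping an OpenCL runtime normally does for you.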
Therefore, although that's not what you asked for, at least on Windows it would probably be easier to ship your application with intelocl.dll and its dependencies (if the license permits that; I have not checked), dynamically load the OpenCL function entry points from there, compile your OpenCL kernel to assembly on the fly, and execute it using the OpenCL runtime.
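A rough sketch of the dynamic-loading part, under some assumptions: the function pointer type is declared by hand to match clGetPlatformIDs from the OpenCL specification, so the binary needs neither the OpenCL headers nor an import library. ocl_platform_count and the lib_* helpers are names invented for this example, and the library name is whatever you actually ship or find on the system (I have not verified which entry points intelocl.dll exports directly).

```c
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#define OCL_CALL __stdcall   /* OpenCL entry points use stdcall on 32-bit Windows */
typedef HMODULE lib_handle;
static lib_handle lib_open(const char *name) { return LoadLibraryA(name); }
static void *lib_symbol(lib_handle h, const char *sym) { return (void *)GetProcAddress(h, sym); }
static void lib_close(lib_handle h) { FreeLibrary(h); }
#else
#include <dlfcn.h>
#define OCL_CALL
typedef void *lib_handle;
static lib_handle lib_open(const char *name) { return dlopen(name, RTLD_NOW | RTLD_LOCAL); }
static void *lib_symbol(lib_handle h, const char *sym) { return dlsym(h, sym); }
static void lib_close(lib_handle h) { dlclose(h); }
#endif

/* Hand-written equivalent of:
 *   cl_int clGetPlatformIDs(cl_uint, cl_platform_id *, cl_uint *);
 * cl_int/cl_uint are 32-bit integers, cl_platform_id is an opaque pointer. */
typedef int32_t (OCL_CALL *get_platform_ids_fn)(uint32_t num_entries,
                                                void **platforms,
                                                uint32_t *num_platforms);

/* Returns the number of OpenCL platforms reported by the given library,
 * or -1 if the library or the entry point is missing or the call fails. */
int ocl_platform_count(const char *library_name)
{
    lib_handle lib = lib_open(library_name);
    if (!lib)
        return -1;

    int count = -1;
    get_platform_ids_fn get_ids =
        (get_platform_ids_fn)lib_symbol(lib, "clGetPlatformIDs");
    if (get_ids) {
        uint32_t n = 0;
        if (get_ids(0, NULL, &n) == 0)   /* 0 is CL_SUCCESS */
            count = (int)n;
    }

    lib_close(lib);
    return count;
}
```

If this returns a value below 1, the application can fall back to a plain CPU code path instead of failing at startup. A real application would keep the library loaded and resolve the remaining entry points (clCreateContext, clBuildProgram, and so on) the same way.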