We have a native C/asm application that uses the GPU (OpenCL) to encrypt/decrypt large amounts of data with a specific method, and it works perfectly. Part of the project (web and distribution) is being developed in JEE, and we just need to call the native application/library.
We have tried calling it as a separate external process using the Process class. The problem is that we cannot control the application (events, handlers, threads, etc.). We also tried porting the C code to Java, but the performance collapsed. Apart from running the native code as a process, I'm considering JNA and JNI, but I have some questions.
Questions:
- For a better (faster) read/write solution, is it possible to exchange data through direct (unmanaged) memory (in Java, ByteBuffer#allocateDirect()) in both JNI and JNA?
- Is it possible to manage and handle the process in native code, and access the GPU (shared) memory from Java code (OpenCL lib)?
- What about performance? Is JNA faster than JNI?
We have two clustered AMD W7000 devices on Red Hat Linux 6 x64.
From JNA's official FAQ:
I developed a simple DLL containing an empty function that does nothing. Then I called that function through both JNA and JNI to measure the cost of the call itself. After many calls, JNI was 30-40 times faster than JNA.
The heavy number crunching is done in C/GPU; all your Java <--> C interface does is shuffle data in and out. I'd be surprised if this is a bottleneck.
In any case, write the simplest, clearest code that does the job. If it turns out performance isn't enough, measure where the bottlenecks are, and tackle them one by one until performance is OK. Programmer time is much more valuable than computer time, except for very special circumstances.
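One rough way to do that measuring from the Java side is a small timing harness. This is only a sketch: the class name CallTimer and the array copy used as the workload are placeholders of mine, not part of the original project. Swap in the actual JNI or JNA call to see what the boundary really costs.

```java
public class CallTimer {
    /** Runs the task repeatedly after a warm-up and returns the mean nanoseconds per call. */
    public static double meanNanos(Runnable task, int warmup, int iterations) {
        // Warm-up so the JIT has a chance to compile the hot path before timing starts.
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        byte[] src = new byte[1 << 20];
        byte[] dst = new byte[1 << 20];
        // Placeholder workload standing in for the Java-side data shuffling.
        Runnable copy = () -> System.arraycopy(src, 0, dst, 0, src.length);
        System.out.printf("1 MiB copy: %.0f ns per call%n", meanNanos(copy, 1_000, 10_000));
    }
}
```

For anything beyond a quick sanity check, a proper benchmarking tool will give more trustworthy numbers than a hand-rolled loop like this.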
JNA is much slower than JNI, but much easier. If performance is not an issue use JNA.
Using direct buffers has the advantage that the most critical operations don't go through JNI or JNA and are thus faster. They use intrinsics, which means they get turned into single machine code instructions.
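A minimal sketch of the direct buffer side, under some assumptions: the class name and the native method encrypt are hypothetical and only declared for illustration (it is never called, so no library needs to be loaded). On the C side, JNI provides GetDirectBufferAddress to obtain the raw pointer behind such a buffer, so no copy is needed.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class DirectBufferSketch {
    // Hypothetical native entry point; its name and signature are assumptions.
    // The C implementation would call GetDirectBufferAddress(env, data) to get the pointer.
    static native int encrypt(ByteBuffer data, int length);

    /** Allocates off-heap memory using the platform byte order, which native code expects. */
    public static ByteBuffer newNativeBuffer(int capacity) {
        return ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        ByteBuffer buf = newNativeBuffer(1024);
        // Absolute puts and gets work on the off-heap memory without crossing into JNI.
        for (int i = 0; i < 128; i++) {
            buf.putLong(i * 8, i);
        }
        long sum = 0;
        for (int i = 0; i < 128; i++) {
            sum += buf.getLong(i * 8);
        }
        System.out.println("direct=" + buf.isDirect() + " sum=" + sum);
    }
}
```

Setting the byte order once at allocation matters because a ByteBuffer defaults to big-endian, while the native code on x64 reads little-endian values.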
If Java code is significantly slower than C it is likely the code hasn't been optimised enough. Generally the GPU should be doing all the work so if Java is a bit slower this shouldn't make much difference.
E.g. if 99% of the time is spent in the GPU and the remaining 1% in Java, and the Java part takes twice as long, the total becomes 99% + 2% = 101%, i.e. only 1% slower.
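The same arithmetic as a small helper, in case you want to plug in your own measured shares. The class and method names are made up for this illustration.

```java
public class SlowdownEstimate {
    /**
     * Relative total run time when the Java share of the work becomes `factor` times
     * slower and the remaining (GPU) share is unchanged. A result of 1.0 means no change.
     */
    public static double relativeTotal(double javaShare, double factor) {
        return (1.0 - javaShare) + javaShare * factor;
    }

    public static void main(String[] args) {
        // 1% of the time in Java, and that part running twice as slow.
        System.out.println(relativeTotal(0.01, 2.0));
    }
}
```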