I have a basic understanding of both JSOR and jVerbs.
Both work around the limitations of JNI and use a fast path to reduce latency. Both use the user-space RDMA Verbs interface to avoid context switches and provide fast-path access, and both offer zero-copy transfer options.
The difference is that JSOR still uses the Java socket interface, while jVerbs provides a new one. jVerbs also has something called Stateful Verbs Calls (SVCs) to avoid repeated serialization of RDMA requests, which the authors say reduces latency. jVerbs exposes a more native interface that applications can use directly. I read the jVerbs SoCC 2013 paper, where they build jverbsRPC on top of jVerbs and show that it significantly reduces the latency of Zookeeper and memcached operations.
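As far as I understand, the SVC pattern looks roughly like this (a minimal sketch based on DiSNI, the open-source successor of jVerbs; the package names and the omitted endpoint/work-request setup are my assumptions, not code from either project):

import java.util.List;

import com.ibm.disni.RdmaEndpoint;      // older DiSNI releases used com.ibm.disni.rdma.*
import com.ibm.disni.verbs.IbvSendWR;
import com.ibm.disni.verbs.SVCPostSend;

// Sketch of a Stateful Verbs Call (SVC): the work-request list is
// serialized into its native representation once, then re-executed many
// times without re-serialization, which is where the latency saving comes from.
public class SvcSketch {
    static void sendMany(RdmaEndpoint endpoint, List<IbvSendWR> wrList, int n)
            throws Exception {
        SVCPostSend postSend = endpoint.postSend(wrList); // serialize once
        for (int i = 0; i < n; i++) {
            postSend.execute(); // reuse the cached native call, no re-marshalling
        }
        postSend.free(); // release the cached native resources
    }
}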
The documentation for both shows that they perform better than plain Java sockets over TCP/IP, SDP, and IPoIB.
However, I don't have any performance comparison between JSOR and jVerbs. I suspect jVerbs may perform better, but with JSOR I don't have to change my existing code, because it still uses the same Java socket interface. My question is: what performance gain can I expect from jVerbs relative to JSOR? Does anyone know, or have experience with both? Any comparison data would be great; I could not find any.
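(For context, the reason JSOR needs no code changes is that it is enabled entirely through JVM configuration; if I recall the IBM documentation correctly, you point the JVM at an RDMA configuration file and the application keeps using plain java.net sockets. The file path and class name here are placeholders:

java -Dcom.ibm.net.rdma.conf=/path/to/rdma.conf MyExistingSocketApp
)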
Here are some numbers using DiSNI, the newly open-sourced successor of IBM's jVerbs, and DaRPC, the low-latency RPC library built on DiSNI.
- DiSNI RDMA read latencies for 64 bytes are below 2 microseconds
- DaRPC RDMA send/recv latencies for 64 bytes (request and response) are around 5 microseconds
- The differences between Java/DiSNI and native C RDMA are negligible for one-sided operations
These benchmarks have been executed on two hosts connected using a Mellanox ConnectX-3 network interface.
Here are the commands to execute the benchmarks:
(1) Read benchmark
Server:
java -cp disni-1.0-jar-with-dependencies.jar:disni-1.0-tests.jar com.ibm.disni.examples.benchmarks.AppLauncher -t java-rdma-server -a <address> -o read -s 64 -k 100000 -p
Client:
java -cp disni-1.0-jar-with-dependencies.jar:disni-1.0-tests.jar com.ibm.disni.examples.benchmarks.AppLauncher -t java-rdma-client -a <address> -o read -s 64 -k 100000 -p
(2) Send/recv benchmark
Server:
java -cp darpc-1.0-jar-with-dependencies.jar:darpc-1.0-tests.jar com.ibm.darpc.examples.server.DaRPCServer -a <address> -d -l 64 -r 64
Client:
java -cp darpc-1.0-jar-with-dependencies.jar:darpc-1.0-tests.jar com.ibm.darpc.examples.client.DaRPCClient -a <address> -k 1000000 -l 64 -r 64 -b 1
It is a bit hard to compare the performance of jVerbs and JSOR: the former is a message-oriented API, while the latter hides RDMA behind the stream-based API of Java sockets.
Here are some stats. My tests used a pair of old ConnectX-2 cards and Dell PowerEdge 2970 servers, running CentOS 7.1 and Mellanox OFED version 3.1. I was only interested in latency.
jVerbs
The test is a variation of the RPing sample (I can post it on GitHub if anybody is interested). It measured the latency of 5,000,000 cycles of the following sequence of calls over a Reliable Connection, with a message size of 256 bytes; a sketch of the timing loop follows the call sequence.
PostSendMethod.execute()
PollCQMethod.execute()
CompletionChannel.ackCQEvents()
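For reference, the measurement loop is essentially the following, with the three jVerbs calls above hidden behind a Runnable because the full jVerbs connection setup is too long to reproduce here (a sketch, not the exact test code):

import java.util.Arrays;

// Time N round trips of the three jVerbs calls listed above, then report
// the median and a tail percentile from the sorted samples.
public class LatencyHarness {
    static void measure(Runnable pingPong, int cycles) {
        long[] samples = new long[cycles];
        for (int i = 0; i < cycles; i++) {
            long start = System.nanoTime();
            pingPong.run(); // postSend.execute(); pollCq.execute(); channel.ackCQEvents(...)
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        System.out.printf("median: %.3f us%n", samples[cycles / 2] / 1000.0);
        System.out.printf("99.9%%: %.3f us%n", samples[(int) (cycles * 0.999)] / 1000.0);
    }
}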
Results (microseconds):
- Median: 10.885
- 99.0% percentile: 11.663
- 99.9% percentile: 17.471
- 99.99% percentile: 27.791
JSOR
A similar test over a JSOR socket. The test was a textbook client/server socket sample, again with a 256-byte message size; the client side is sketched below.
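The client side looked roughly like this (a from-memory sketch of the textbook pattern, not the exact code; note that JSOR leaves this source untouched and selects the RDMA path through JVM configuration):

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.Socket;

// 256-byte ping-pong client. The same code runs over plain TCP or over
// JSOR; the transport is chosen by JVM configuration, not by the source.
public class PingPongClient {
    public static void main(String[] args) throws Exception {
        byte[] msg = new byte[256];
        try (Socket socket = new Socket(args[0], 6789)) { // port is arbitrary
            socket.setTcpNoDelay(true);
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            DataInputStream in = new DataInputStream(socket.getInputStream());
            for (int i = 0; i < 5_000_000; i++) {
                long start = System.nanoTime();
                out.write(msg);    // send 256 bytes
                out.flush();
                in.readFully(msg); // wait for the 256-byte echo
                long sample = System.nanoTime() - start;
                // collect 'sample' into a histogram for the percentiles below
            }
        }
    }
}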
Results (microseconds):
- Median: 43
- 99.0% percentile: 55
- 99.9% percentile: 61
- 99.99% percentile: 217
Both results are still far from the native OFED latency test. On the same hardware and OS, the standard ib_send_lat benchmark produced a median latency of 2.77 microseconds and a maximum of 23.25 microseconds.
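For completeness, the native numbers come from the perftest suite; the invocation was along these lines (from memory; exact flags may vary by perftest version, and -s sets the message size to match the Java tests):

Server:

ib_send_lat -s 256

Client:

ib_send_lat -s 256 <server-address>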