Has anyone out there done RDMA programming using the RDMA_CM library?
I'm having a hard time finding even simple examples to study. There's an rdma_client & rdma_server example in librdmacm, but it doesn't run in a loop (rping does loop, but it's written using IB verbs directly instead of rdma_cm functions).
I've put together a trivial ping-pong program, but it locks up after anywhere from 1 to 100 bounces. Adding a sleep inside the client makes it run longer before hanging, which suggests a race condition.
The client gets stuck in rdma_get_send_comp() and the server gets stuck in rdma_get_recv_comp().
My limited understanding is that before every rdma_post_send(), you need to issue an rdma_post_recv() for the message that will come back after the send. Also, before every send (except for the client's 1st send), you need to wait for a message (rdma_get_recv_comp()) indicating that the other side is ready to receive.
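To make that ordering rule concrete, here is a toy model in plain C++. It does not use librdmacm at all: ToyQueuePair, post_recv(), deliver_send(), and bounces_before_stall() are hypothetical names made up for illustration, and the model only counts receive buffers. It assumes strict ping-pong alternation and treats a message that arrives with no receive posted as the end of the run (on a real reliable connection that situation leads to receiver-not-ready retries and possibly an error, which the sketch does not model).

```cpp
// Toy model only. None of these names exist in librdmacm or libibverbs.
struct ToyQueuePair {
    int posted_recvs = 0;  // receive buffers currently waiting for a message

    // Stand-in for posting one receive buffer.
    void post_recv() { ++posted_recvs; }

    // Stand-in for the peer's message arriving on this side.
    // Returns true if a posted receive buffer was available to take it.
    bool deliver_send() {
        if (posted_recvs == 0)
            return false;  // nothing posted: the message has nowhere to land
        --posted_recvs;
        return true;
    }
};

// Runs up to `rounds` ping-pong bounces and returns how many round trips
// completed before a message arrived with no receive buffer posted.
// `server_reposts` selects whether the server posts a fresh receive
// before sending each reply.
int bounces_before_stall(int rounds, bool server_reposts)
{
    ToyQueuePair client, server;
    int completed = 0;

    server.post_recv();                // server: posted once before accepting
    for (int i = 0; i < rounds; ++i) {
        client.post_recv();            // client: buffer for the coming pong
        if (!server.deliver_send())    // ping arrives at the server
            break;
        if (server_reposts)
            server.post_recv();        // server: buffer for the next ping
        if (!client.deliver_send())    // pong arrives at the client
            break;
        ++completed;
    }
    return completed;
}
```

In this model, bounces_before_stall(100, true) completes all 100 round trips, while bounces_before_stall(100, false) stops after the first one because the second ping finds no buffer posted on the server.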
What could be wrong?
void Server(rdma_cm_id *id)
{
    ibv_wc wc;
    int ret;
    uint8_t recvBuffer[MESSAGE_BUFFER_SIZE],
            sendBuffer[MESSAGE_BUFFER_SIZE];

    // Register both message buffers with the connection.
    ibv_mr *recvMemRegion = rdma_reg_msgs(id, recvBuffer, MESSAGE_BUFFER_SIZE);
    if (!recvMemRegion)
        throw 0;
    ibv_mr *sendMemRegion = rdma_reg_msgs(id, sendBuffer, MESSAGE_BUFFER_SIZE);
    if (!sendMemRegion)
        throw 0;

    // Post the first receive before accepting, so a buffer is ready
    // for the client's first message.
    if (ret = rdma_post_recv(id, NULL, recvBuffer, 1, recvMemRegion))
        throw 0;
    if (ret = rdma_accept(id, NULL))
        throw 0;

    do
    {
        // Wait for the client's message.
        if ((ret = rdma_get_recv_comp(id, &wc)) <= 0)
            throw 0;
        // Post the next receive before replying.
        if (ret = rdma_post_recv(id, NULL, recvBuffer, 1, recvMemRegion))
            throw 0;
        if (ret = rdma_post_send(id, NULL, sendBuffer, 1, sendMemRegion, 0))
            throw 0;
        printf(".");
        fflush(stdout);
        // Wait for the reply's send completion.
        if ((ret = rdma_get_send_comp(id, &wc)) <= 0)
            throw 0;
    }
    while (true);
}
void Client() // sends the 1st message
{
    // queue-pair parameters are:
    attr.cap.max_send_wr = attr.cap.max_recv_wr = 4;
    attr.cap.max_send_sge = attr.cap.max_recv_sge = 2;
    attr.cap.max_inline_data = 16;
    attr.qp_context = id;
    attr.sq_sig_all = 1;
    attr.qp_type = IBV_QPT_RC;

    <create connection boilerplate>

    uint8_t recvBuffer[MESSAGE_BUFFER_SIZE],
            sendBuffer[MESSAGE_BUFFER_SIZE];

    // Register both message buffers with the connection.
    recvMemRegion = rdma_reg_msgs(id, recvBuffer, MESSAGE_BUFFER_SIZE);
    if (!recvMemRegion)
        throw 0;
    sendMemRegion = rdma_reg_msgs(id, sendBuffer, MESSAGE_BUFFER_SIZE);
    if (!sendMemRegion)
        throw 0;

    if (ret = rdma_connect(id, NULL))
        throw 0;

    do
    {
        // Post the receive for the server's reply before sending.
        if (ret = rdma_post_recv(id, NULL, recvBuffer, 1, recvMemRegion))
            throw 0;
        //usleep(5000);
        if (ret = rdma_post_send(id, NULL, sendBuffer, 1, sendMemRegion, 0))
            throw 0;
        // Wait for the send completion, then for the server's reply.
        if ((ret = rdma_get_send_comp(id, &wc)) <= 0)
            throw 0;
        if ((ret = rdma_get_recv_comp(id, &wc)) <= 0)
            throw 0;
        printf(".");
        fflush(stdout);
    }
    while (true);
}
Curses! I was bitten by a bug in librdmacm-1.0.15-1 (from 2012), which came with SUSE 11. I knew there was nothing wrong with my send/recv sequencing.
I first tried comparing my code with other examples. In one example I saw a different call used in place of rdma_get_send_comp(), and likewise for rdma_get_recv_comp(). I tried making the same replacements in my example and, miraculously, the hang was gone!
Hmm, maybe rdma_get_send_comp() isn't doing what I'm expecting. I'd better take a look at the code. I got the code for both 1.0.15 and 1.0.18 and what do I see in rdma_verbs.h?
Two very different IB verbs sequences.
Can anyone explain why 1.0.18 works while 1.0.15 randomly hangs?