I have a service that transfers messages at a quite high rate.
Currently it is served by akka-tcp and it makes 3.5M messages per minute. I decided to give grpc a try. Unfortunately it resulted in much smaller throughput: ~500k messages per minute an even less.
Could you please recommend how to optimize it?
My setup
Hardware: 32 cores, 24Gb heap.
grpc version: 1.25.0
Message format and endpoint
Message is basically a binary blob. Client streams 100K - 1M and more messages into the same request (asynchronously), server doesn't respond with anything, client uses a no-op observer
service MyService {
rpc send (stream MyMessage) returns (stream DummyResponse);
}
message MyMessage {
int64 someField = 1;
bytes payload = 2; //not huge
}
message DummyResponse {
}
Problems:
Message rate is low compared to akka implementation.
I observe low CPU usage so I suspect that grpc call is actually blocking internally despite it says otherwise. Calling onNext()
indeed doesn't return immediately but there is also GC on the table.
I tried to spawn more senders to mitigate this issue but didn't get much of improvement.
My findings Grpc actually allocates a 8KB byte buffer on each message when serializes it. See the stacktrace:
java.lang.Thread.State: BLOCKED (on object monitor) at com.google.common.io.ByteStreams.createBuffer(ByteStreams.java:58) at com.google.common.io.ByteStreams.copy(ByteStreams.java:105) at io.grpc.internal.MessageFramer.writeToOutputStream(MessageFramer.java:274) at io.grpc.internal.MessageFramer.writeKnownLengthUncompressed(MessageFramer.java:230) at io.grpc.internal.MessageFramer.writeUncompressed(MessageFramer.java:168) at io.grpc.internal.MessageFramer.writePayload(MessageFramer.java:141) at io.grpc.internal.AbstractStream.writeMessage(AbstractStream.java:53) at io.grpc.internal.ForwardingClientStream.writeMessage(ForwardingClientStream.java:37) at io.grpc.internal.DelayedStream.writeMessage(DelayedStream.java:252) at io.grpc.internal.ClientCallImpl.sendMessageInternal(ClientCallImpl.java:473) at io.grpc.internal.ClientCallImpl.sendMessage(ClientCallImpl.java:457) at io.grpc.ForwardingClientCall.sendMessage(ForwardingClientCall.java:37) at io.grpc.ForwardingClientCall.sendMessage(ForwardingClientCall.java:37) at io.grpc.stub.ClientCalls$CallToStreamObserverAdapter.onNext(ClientCalls.java:346)
Any help with best practices on building high-throughput grpc clients appreciated.