Observable to batch like Lmax Disruptor

Those who are familiar with lmax ring buffer (disruptor) know that one of the biggest advanatages of that data structure is that it batches incomming events and when we have a consumer that can take advantage of batching that makes the system automatically adjustable to the load, the more events you throw at it the better.

I wonder couldnt we achieve the same effect with an Observable (targeting the batching feature). I've tried out Observable.buffer but this is very different, buffer will wait and not emit the batch while the expected number of events didnt arrive. what we want is quite different.

given the subsriber is waiting for a batch from Observable<Collection<Event>>, when a single item arrives at stream it emits a single element batch which gets processed by subscriber, while it is processing other elements are arriving and getting collected into next batch, as soon as subscriber finishes with the execution it gets the next batch with as many events as had arrived since it started last processing...

So as a result if our subscriber is fast enough to process one event at a time it will do so, if load gets higher it will still have the same frequency of processing but more events each time (thus solving backpressure problem)... unlike buffer which will stick and wait for batch to fill up.

Any suggestions? or shall i go with ring buffer?

标签： java reactive-programming rx-java observable disruptor-pattern

1条回答

Animai°情兽

2楼-- · 2019-07-20 17:37

RxJava and Disruptor represent two different programming approaches.

I'm not experienced with Disruptor but based on video talks, it is basically a large buffer where producer emit data like a firehose and consumers spin/yield/block until data is available.

RxJava, on the other hand, aims at non-blocking event delivery. We too have ringbuffers, notably in observeOn which acts as the async-boundary between producers and consumers, but these are much smaller and we avoid buffer overflows and buffer bloat by applying the co-routines approach. Co-routines boil down to callbacks sent to your callbacks so yo can callback our callbacks to send you some data at your pace. The frequency of such requests determines the pacing.

There are data sources that don't support such co-op streaming and require one of the onBackpressureXXX operators that will buffer/drop values if the downstream doesn't request fast enough.

If you think you can process data in batches more efficiently than one-by-one, you can use the buffer operator which has overloads to specify time duration for the buffers: you can have, for example, 10 ms worth of data, independent of how many values arrive in this duration.

Controlling the batch-size via request frequency is tricky and may have unforseen consequences. The problem, generally, is that if you request(n) from a batching source, you indicate you can process n elements but the source now has to create n buffers of size 1 (because the type is Observable<List<T>>). In contrast, if no request is called, the operator buffers the data resulting in longer buffers. These behaviors introduce extra overhead in the processing if you really could keep up and also has to turn the cold source into a firehose (because otherwise what you have is essentially buffer(1)) which itself can now lead to buffer bloat.

0人赞添加讨论(0) 举报

Observable to batch like Lmax Disruptor

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间