What kind of “EventBus” to use in Spring? Built-in, Reactor, or Akka?

Posted 2019-03-07 23:57

Question:

We're going to start a new Spring 4 application in a few weeks, and we'd like to use an event-driven architecture. This year I've read here and there about "Reactor", and while looking into it on the web I stumbled upon "Akka".

So for now we have 3 choices:

  • Spring's ApplicationEvent: http://docs.spring.io/spring/docs/4.0.0.RELEASE/javadoc-api/org/springframework/context/ApplicationEvent.html
  • Reactor: https://github.com/reactor/reactor#reactor
  • Akka: http://akka.io/

I couldn't find a real comparison of those.


For now we just need something like:

  • X registers to listen for Event E
  • Y registers to listen for Event E
  • Z sends an Event E

And then X and Y will receive and handle the event.
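
For reference, the plain-Spring variant of this pattern would look roughly like the sketch below (MyEvent, X, Y and Z are just placeholder names):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationEvent;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.ApplicationListener;
import org.springframework.stereotype.Component;

// The event E: any subclass of ApplicationEvent.
class MyEvent extends ApplicationEvent {
  MyEvent(Object source) {
    super(source);
  }
}

// X registers to listen for Event E simply by being a bean that implements
// ApplicationListener for that event type (Y would look the same).
@Component
class X implements ApplicationListener<MyEvent> {
  public void onApplicationEvent(MyEvent event) {
    // handle the event
  }
}

// Z sends an Event E.
@Component
class Z {
  @Autowired
  private ApplicationEventPublisher publisher;

  public void send() {
    publisher.publishEvent(new MyEvent(this));
  }
}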

We will most likely use this in an async way, but there will certainly also be some synchronous scenarios. And we will most likely always send an object as the event. (The Reactor samples mostly make use of Strings and String patterns, but it also supports arbitrary Objects.)


As far as I understand, ApplicationEvent works synchronously by default and Reactor works asynchronously. Reactor also allows you to use the await() method to make it somewhat synchronous. Akka provides more or less the same as Reactor, but additionally supports remoting.

Concerning Reactor's await() method: Can it wait for multiple threads to complete? Or maybe even a partial set of those threads? If we take the example from above:

  • X registers to listen for Event E
  • Y registers to listen for Event E
  • Z sends an Event E

Is it possible to make it synchronous by saying: wait for X and Y to complete? And is it possible to make it wait just for X, but not for Y?


Maybe there are also some alternatives? What about for example JMS?

A lot of questions, but hopefully you can provide some answers!

Thank you!


EDIT: Example use cases

  1. When a specific event gets fired, I'd like to create 10,000 emails. Every email has to be generated with user-specific content. So I'd create a lot of threads (max = number of CPU cores) which generate the mails without blocking the caller thread, because this can take several minutes.

  2. When a specific event gets fired, I'd like to collect information from an unknown number of services. Each fetch takes about 100 ms. Here I could imagine using Reactor's await, because I need that information to continue my work in the main thread.

  3. When a specific event gets fired, I'd like to perform some operations based on application configuration. So the application must be able to dynamically (un)register consumers/event handlers. They'll do their own stuff with the event and I don't care. So I would create a thread for each of those handlers and just continue doing my work in the main thread.

  4. Simple decoupling: I basically know all receivers, but I just don't want to call every receiver in my code. This should mostly get done synchronously.

Sounds like I need a thread pool or a RingBuffer. Do those frameworks have dynamic RingBuffers, which grow in size if needed?

Answer 1:

I'm not sure I can adequately answer your question in this small space. But I'll give it a shot! :)

Spring's ApplicationEvent system and Reactor are really quite distinct as far as functionality goes. ApplicationEvent routing is based on the type handled by the ApplicationListener. Anything more complicated than that and you'll have to implement the logic yourself (that's not necessarily a bad thing, though).

Reactor, however, provides a comprehensive routing layer that is also very lightweight and completely extensible. Any similarity in function between the two ends at their ability to subscribe and publish events, which is really a feature of any event-driven system.

Also don't forget the new spring-messaging module that ships with Spring 4. It's a subset of the tools available in Spring Integration and also provides abstractions for building an event-driven architecture.
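
If you do end up on ApplicationEvent, note that the default synchronous delivery can be switched to a thread pool by declaring the multicaster bean yourself. A rough sketch (the @Configuration class name is just for illustration):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.event.ApplicationEventMulticaster;
import org.springframework.context.event.SimpleApplicationEventMulticaster;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

@Configuration
class AsyncEventConfig {

  // The bean must be named "applicationEventMulticaster" so the context picks it up.
  @Bean(name = "applicationEventMulticaster")
  ApplicationEventMulticaster applicationEventMulticaster() {
    SimpleApplicationEventMulticaster multicaster = new SimpleApplicationEventMulticaster();
    // With a task executor set, listeners are invoked on its threads
    // instead of on the publisher's thread.
    multicaster.setTaskExecutor(new SimpleAsyncTaskExecutor());
    return multicaster;
  }
}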

Reactor will help you solve a couple key problems that you would otherwise have to manage yourself:

Selector matching: Reactor does Selector matching, which encompasses a range of matches, from a simple .equals(Object other) call to a more complex URI-template match that allows for placeholder extraction. You can also extend the built-in Selectors with your own custom logic, so you can use rich objects as notification keys (like domain objects, for instance).
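
As a rough idea of what registering and notifying with Selectors looks like (this is based on the Reactor 1.x milestones available at the time; treat the exact class and method names as assumptions):

// Sketch only, against the Reactor 1.x API (Reactors, Selectors, Event).
Environment env = new Environment();
Reactor reactor = Reactors.reactor().env(env).dispatcher(Environment.THREAD_POOL).get();

// X and Y both register for the key "event.E".
reactor.on(Selectors.$("event.E"), new Consumer<Event<DomainObject>>() {
  public void accept(Event<DomainObject> ev) {
    // X handles the event
  }
});
reactor.on(Selectors.$("event.E"), new Consumer<Event<DomainObject>>() {
  public void accept(Event<DomainObject> ev) {
    // Y handles the event
  }
});

// Z publishes Event E.
reactor.notify("event.E", Event.wrap(new DomainObject()));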

Stream and Promise APIs: You mentioned the Promise API already with reference to the .await() method, which is really meant for existing code that expects blocking behavior. When writing new code with Reactor, it can't be stressed enough that you should use compositions and callbacks, which make effective use of system resources by not blocking threads. Blocking the caller is almost never a good idea in an architecture that depends on a small number of threads to execute a large volume of tasks. Blocking on Futures is simply not cloud-scalable, which is why modern applications are moving to non-blocking alternatives.

Your application could be architected with either Streams or Promises, though honestly I think you'll find the Stream more flexible. The key benefit is the composability of the API, which allows you to wire actions together in a dependency chain without blocking. As a completely off-the-cuff example based on the email use case you describe:

@Autowired
Environment env;
@Autowired
SmtpClient client;

// Use the thread-pool Dispatcher so that mail generation runs off the caller's thread
Deferred<DomainObject, Stream<DomainObject>> input = Streams.defer(env, Environment.THREAD_POOL);

input.compose()
  .map(new Function<DomainObject, EmailTemplate>() {
    public EmailTemplate apply(DomainObject in) {
      // generate the email
      return new EmailTemplate(in);
    }
  })
  .consume(new Consumer<EmailTemplate>() {
    public void accept(EmailTemplate email) {
      // send the email
      client.send(email);
    }
  });

// Publish input into the Deferred (DomainObject, EmailTemplate, SmtpClient and
// reader are application-specific placeholders, not Reactor types)
DomainObject obj = reader.readNext();
if(null != obj) {
  input.accept(obj);
}

Reactor also provides the Boundary, which is basically a CountDownLatch for blocking on arbitrary Consumers (so you don't have to construct a Promise if all you want to do is block until a Consumer completes). You could use a raw Reactor in that case and use the on() and notify() methods to trigger the service status checking.

For some things, however, it seems like what you want is a Future returned from an ExecutorService, no? Why not just keep things simple? Reactor will only be of real benefit in situations where throughput and low overhead are important. If you're blocking the calling thread, then you're likely going to wipe away the efficiency gains that Reactor would give you anyway, so you might be better off in that case using a more traditional toolset.
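
To make that concrete for the "wait for X and Y, or only for X" part of the question, a plain ExecutorService already gives you exactly that control (a JDK-only sketch; X and Y are modeled as Callables):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WaitForSomeHandlers {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors());

    // X and Y handle the "event" concurrently.
    Future<Void> x = pool.submit(new Callable<Void>() {
      public Void call() {
        // X's handling of the event
        return null;
      }
    });
    Future<Void> y = pool.submit(new Callable<Void>() {
      public Void call() {
        // Y's handling of the event
        return null;
      }
    });

    // Block only on X; Y keeps running in the background.
    x.get();

    // Or block on both if you need full synchronisation.
    y.get();

    pool.shutdown();
  }
}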

The nice thing about the openness of Reactor is that there's nothing stopping the two from interacting. You can freely mix Futures with Consumers without friction. In that case, just keep in mind that you're only ever going to be as fast as your slowest component.



Answer 2:

Let's ignore Spring's ApplicationEvent, as it really is not designed for what you're asking (it's more about bean lifecycle management).

What you need to figure out is whether you want to do it

  1. the object-oriented way (i.e. actors, dynamic consumers, registered on the fly), or
  2. the service way (static consumers, registered at startup).

Using your example of X and Y, are they:

  1. ephemeral instances (1), or
  2. long-lived singletons/service objects (2)?

If you need to register consumers on the fly, then Akka is a good choice (I'm not sure about Reactor, as I have never used it). If you don't want to do your consuming in ephemeral objects, then you can use JMS or AMQP.

You also need to understand that these kinds of libraries are trying to solve two problems:

  1. Concurrency (i.e. doing things in parallel on the same machine)
  2. Distribution (i.e. doing things in parallel on multiple machines)

Reactor and Akka are mainly focused on #1. Akka just recently added cluster support and the actor abstraction makes it easier to do #2. Message Queues (JMS, AMQP) are focused on #2.

For my own work I take the service route and use a heavily modified Guava EventBus and RabbitMQ. I use annotations similar to the Guava EventBus's, but also have annotations for the objects sent on the bus. However, you can just use Guava's EventBus in async mode as a POC and then build your own like I did.
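
For the POC route, the async flavour of Guava's EventBus really is just a few lines (EmailRequested and MailHandler are made-up names for illustration):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import com.google.common.eventbus.AsyncEventBus;
import com.google.common.eventbus.Subscribe;

public class EventBusPoc {

  // Any object sent on the bus; a plain value class is enough.
  static class EmailRequested {
    final String address;
    EmailRequested(String address) { this.address = address; }
  }

  // X (and similarly Y) just declares a @Subscribe method for the event type.
  static class MailHandler {
    @Subscribe
    public void on(EmailRequested event) {
      System.out.println("sending mail to " + event.address);
    }
  }

  public static void main(String[] args) {
    // Async mode: handlers run on the supplied executor, not on the caller's thread.
    ExecutorService executor = Executors.newFixedThreadPool(4);
    AsyncEventBus bus = new AsyncEventBus(executor);
    bus.register(new MailHandler());

    // Z publishes the event.
    bus.post(new EmailRequested("user@example.com"));

    executor.shutdown(); // already-dispatched handlers still complete
  }
}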

You might think that you need dynamic consumers (1), but most problems can be solved with simple pub/sub. Also, managing dynamic consumers can be tricky (hence Akka is a good choice, because the actor model has all sorts of management built in for this).



Answer 3:

Carefully define what you want from the framework. A framework that has more features than you need is not necessarily a good thing: more features mean more bugs, more code to learn, and lower performance.

Some features to consider are:

  • the nature of actors (threads or lightweight objects)
  • ability to work on a machine cluster (Akka)
  • persistent message queues (JMS)
  • specific features like signals (events without information) and transitions (objects that combine messages from different ports into a complex event; see Petri nets), etc.

Be careful with synchronous features like await: they block the whole thread, which is dangerous when actors are executed on a thread pool (thread starvation).

More frameworks to look at:

  • Fork-Join Pool: in some cases it allows await without thread starvation (see the sketch after this list)
  • Scientific workflow systems
  • Dataflow framework for Java: signals, transitions
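
On the Fork-Join Pool point: the mechanism that lets a pooled task block without starving the pool is ForkJoinPool.managedBlock, which can add a compensating thread while the blocker waits. A small JDK-only illustration:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinPool.ManagedBlocker;

public class ManagedBlockExample {
  public static void main(String[] args) throws Exception {
    ForkJoinPool pool = new ForkJoinPool();
    final CountDownLatch done = new CountDownLatch(1);

    pool.submit(new Runnable() {
      public void run() {
        try {
          // Tell the pool we are about to block; it may add a spare thread
          // instead of letting this worker starve the pool.
          ForkJoinPool.managedBlock(new ManagedBlocker() {
            public boolean block() throws InterruptedException {
              done.await();
              return true;
            }
            public boolean isReleasable() {
              return done.getCount() == 0;
            }
          });
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    });

    done.countDown();   // unblock the waiting task
    pool.shutdown();
  }
}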

ADD-ON: Two kinds of actors.

Generally, a parallel system can be represented as a graph where active nodes send messages to each other. In Java, as in most other mainstream languages, active nodes (actors) can be implemented either as threads or as tasks (Runnable or Callable) executed by a thread pool. Normally, some of the actors are threads and some are tasks. Both approaches have their advantages and disadvantages, so it is vital to choose the most appropriate implementation for each actor in the system. Briefly: threads can block (and wait for events) but consume a lot of memory for their stacks; tasks must not block, but they share the stacks of the threads in the pool.

If a task calls a blocking operation, it takes a pooled thread out of service. If many tasks block, they can take out all of the threads, causing a deadlock: the tasks that could unblock the blocked tasks cannot run. This kind of deadlock is called thread starvation. If, in an attempt to prevent thread starvation, we configure the thread pool as unbounded, we simply turn tasks back into threads, losing the advantages of tasks.

To eliminate calls to blocking operations in tasks, the task should be split in two (or more): the first task starts the blocking operation and exits, and the rest is packaged as an asynchronous task that starts when the blocking operation finishes. Of course, the blocking operation has to have an alternative asynchronous interface. So, for example, instead of reading a socket synchronously, the NIO or NIO2 libraries should be used.
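
For the socket example, NIO2's asynchronous channels provide exactly this kind of split: the read is started, the current task exits, and a completion handler runs as a new task when the data arrives. A sketch (host and port are made up):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

public class AsyncReadSketch {
  public static void main(String[] args) throws Exception {
    final AsynchronousSocketChannel channel = AsynchronousSocketChannel.open();
    // Connect first (blocking here only to keep the sketch short).
    channel.connect(new InetSocketAddress("example.org", 80)).get();

    final ByteBuffer buffer = ByteBuffer.allocate(4096);
    // The "second half" of the split task runs in this handler when data arrives;
    // the current thread is free to return to the pool immediately.
    channel.read(buffer, null, new CompletionHandler<Integer, Void>() {
      public void completed(Integer bytesRead, Void attachment) {
        // continue processing the bytes in 'buffer'
      }
      public void failed(Throwable exc, Void attachment) {
        // handle the error asynchronously as well
      }
    });

    // ... the caller does other work here instead of blocking on the read
  }
}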

Unfortunately, the standard Java library lacks asynchronous counterparts to popular synchronization facilities like queues and semaphores. Fortunately, they are easy to implement from scratch (see the Dataflow framework for Java for examples).

So, building computations purely from non-blocking tasks is possible, but it increases the size of the code. The evident advice is to use threads where possible and tasks only for simple, massive computations.