This is a general question regarding Unit Testing Bolts and Spouts in a Storm Topology written in Java.
What is the recommended practice and guideline for unit-testing (JUnit?) Bolts and Spouts?
For instance, I could write a JUnit test for a Bolt, but without fully understanding the framework (like the lifecycle of a Bolt) and the serialization implications, I could easily make the mistake of creating non-serializable member variables in the constructor. In JUnit, this test would pass, but in a topology, it wouldn't work. I imagine there are many test points one needs to consider (such as this example with serialization and lifecycle).
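To make the pitfall concrete, here is a minimal sketch that uses only the JDK. Storm ships bolt and spout instances to the workers using Java serialization, so a round trip through ObjectOutputStream inside a test approximates what happens on submission. Connection, EagerBolt, LazyBolt and SerializationCheck are hypothetical stand-ins, not Storm types; a real bolt would do the lazy initialization in its prepare() method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical client object that does not implement Serializable.
class Connection {
    String send(String msg) { return "sent:" + msg; }
}

// Pitfall: the non-serializable member is created at construction time.
class EagerBolt implements Serializable {
    private final Connection connection = new Connection();
    String execute(String input) { return connection.send(input); }
}

// Fix: the member is transient and created in prepare(), after deserialization.
class LazyBolt implements Serializable {
    private transient Connection connection;
    void prepare() { connection = new Connection(); }
    String execute(String input) { return connection.send(input); }
}

class SerializationCheck {
    // Serializes and deserializes an object with plain Java serialization.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("object did not survive serialization", e);
        }
    }

    // Returns true if the object survives the round trip, false otherwise.
    static boolean survivesRoundTrip(Serializable obj) {
        try {
            roundTrip(obj);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

A plain JUnit test that only calls new EagerBolt().execute(...) would pass, while a test asserting survivesRoundTrip(new EagerBolt()) would fail and expose the problem.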
Therefore, is it recommended that if you use JUnit-based unit tests, you run a small mock topology (LocalMode?) and test the implied contract for the Bolt (or Spout) under that topology? Or is it OK to use JUnit, with the implication that we have to carefully simulate the lifecycle of a Bolt (creating it, calling prepare(), mocking a Config, etc.)? In this case, what are some general test points for the class under test (Bolt/Spout) to consider?
What have other developers done, with respect to creating proper unit tests?
I noticed there is a Topology testing API (see: https://github.com/xumingming/storm-lib/blob/master/src/jvm/storm/TestingApiDemo.java). Is it better to use some of that API and stand up "test topologies" for each individual Bolt and Spout (verifying the implicit contract that the Bolt has to provide for, e.g. its declared outputs)?
Thanks
It turns out to be fairly easy to mock Storm objects like OutputCollector, Tuple and OutputFieldsDeclarer. Of those, only OutputCollector ever sees any side effects, so code the OutputCollector mock class to record any tuples and anchors emitted and to report them when asked. Your test class can then use instances of those mock classes to configure a bolt/spout instance, invoke it and validate the expected side effects.
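As a rough sketch of that idea, the hand-written mock below records every emitted tuple together with its anchor so that a test can assert on them afterwards. To stay runnable with only the JDK it uses simplified stand-in types (Collector, UppercaseBolt, and plain lists in place of tuples) rather than Storm's real interfaces; against Storm you would mock or subclass the actual collector type instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Simplified stand-in for Storm's collector; NOT the real interface.
interface Collector {
    void emit(List<Object> anchor, List<Object> values);
    void ack(List<Object> input);
}

// Hypothetical bolt under test: upper-cases the first field of each input.
class UppercaseBolt {
    private Collector collector;

    void prepare(Collector collector) { this.collector = collector; }

    void execute(List<Object> input) {
        String word = (String) input.get(0);
        collector.emit(input, Arrays.<Object>asList(word.toUpperCase(Locale.ROOT)));
        collector.ack(input);
    }
}

// Mock that records side effects instead of sending anything downstream.
class RecordingCollector implements Collector {
    final List<List<Object>> emitted = new ArrayList<>();
    final List<List<Object>> anchors = new ArrayList<>();
    final List<List<Object>> acked = new ArrayList<>();

    public void emit(List<Object> anchor, List<Object> values) {
        anchors.add(anchor);
        emitted.add(values);
    }

    public void ack(List<Object> input) { acked.add(input); }
}
```

A test then creates the bolt, calls prepare() with a RecordingCollector, feeds it an input and checks the emitted, anchors and acked lists.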
One approach we have taken is to move most of the application logic out of bolts and spouts and into objects that do the heavy lifting, which the bolts and spouts instantiate and use via minimal interfaces. We then unit test those objects and integration test the topology, although this does leave a gap.
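A minimal sketch of that split, with hypothetical names (Splitter, WordSplitter, SplitBoltShell): the logic lives in a plain class behind a small interface, and the bolt-like shell only delegates to it. The shell is a stand-in rather than a real Storm bolt, so the example runs with only the JDK.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Consumer;

// Minimal interface the bolt depends on.
interface Splitter extends Serializable {
    List<String> split(String sentence);
}

// All real logic lives here and can be unit tested without Storm.
class WordSplitter implements Splitter {
    public List<String> split(String sentence) {
        List<String> words = new ArrayList<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                words.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }
}

// Stand-in for a thin bolt: nothing here beyond delegating and emitting.
class SplitBoltShell implements Serializable {
    private final Splitter splitter;

    SplitBoltShell(Splitter splitter) { this.splitter = splitter; }

    void execute(String sentence, Consumer<String> emit) {
        for (String word : splitter.split(sentence)) {
            emit.accept(word);
        }
    }
}
```

WordSplitter can be covered by ordinary JUnit tests, while the shell is thin enough that an integration test of the topology exercises most of what is left.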
Since version 0.8.1 Storm's unit testing facilities have been exposed via Java:
For an example of how to use this API, have a look here:
Our approach is to use constructor injection of a serializable factory into the spout/bolt. The spout/bolt then consults the factory in its open/prepare method. The factory's single responsibility is to encapsulate obtaining the spout/bolt's dependencies in a serializable fashion. This design allows our unit tests to inject fake/test/mock factories which, when consulted, return mock services. In this way we can narrowly unit test the spouts/bolts using mocks, e.g. with Mockito.
Below is a generic example of a bolt and a test for it. I have omitted the implementation of the factory UserNotificationFactory because it depends on your application. You might use service locators to obtain the services, serialized configuration, HDFS-accessible configuration, or really any way at all to get the correct services, so long as the factory can do it after a serde cycle. You should cover serialization of that class.
Bolt
Test
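A minimal sketch of how such a bolt and its test might fit together, under several assumptions: only the name UserNotificationFactory comes from the description above, while NotificationService, UserNotificationBolt, FakeFactory and the method names are hypothetical. The bolt is a stand-in with a prepare/execute shape instead of a real Storm class, and the test uses a hand-written fake instead of Mockito, so the whole thing runs with only the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical service the bolt needs; usually not serializable.
interface NotificationService {
    void notifyUser(String userId, String message);
}

// Serializable factory that the bolt consults in prepare().
interface UserNotificationFactory extends Serializable {
    NotificationService createNotificationService();
}

// Stand-in for a Storm bolt; a real one would extend a Storm base class.
class UserNotificationBolt implements Serializable {
    private final UserNotificationFactory factory;
    private transient NotificationService service; // created after deserialization

    UserNotificationBolt(UserNotificationFactory factory) { this.factory = factory; }

    void prepare() { service = factory.createNotificationService(); }

    void execute(String userId, String message) { service.notifyUser(userId, message); }
}

// Test double. It records into a static list because the factory instance
// itself is copied by the serde cycle, so instance state would be lost.
class FakeFactory implements UserNotificationFactory {
    static final List<String> SENT = new ArrayList<>();

    public NotificationService createNotificationService() {
        return (userId, message) -> SENT.add(userId + ":" + message);
    }
}

class UserNotificationBoltTest {
    // Simulates shipping the bolt to a worker via Java serialization.
    static UserNotificationBolt serde(UserNotificationBolt bolt) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bolt);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (UserNotificationBolt) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("bolt did not survive serde", e);
        }
    }

    // Follows the lifecycle: construct, serde, prepare, execute, verify.
    static void notifiesUserAfterSerde() {
        FakeFactory.SENT.clear();
        UserNotificationBolt bolt = serde(new UserNotificationBolt(new FakeFactory()));
        bolt.prepare();
        bolt.execute("u1", "hello");
        if (!FakeFactory.SENT.equals(Collections.singletonList("u1:hello"))) {
            throw new AssertionError("unexpected notifications: " + FakeFactory.SENT);
        }
    }
}
```

With Mockito available, FakeFactory could be replaced by a factory that returns a mocked NotificationService, with the check done via verify; the serde step is worth keeping either way, since it is what catches non-serializable members.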