What are some strategies for testing large state m

2019-03-09 10:23发布

问题:

I inherited a large and fairly complex state machine. It has 31 possible states, all are really needed (big business process). It has the following inputs:

  • Enum: Current State (so 0 -> 30)
  • Enum: source (currently only 2 entries)
  • Boolean: Request
  • Boolean: Type
  • Enum: Status (3 states)
  • Enum: Handling (3 states)
  • Boolean: Completed

Breaking it into separate state machines doesn't seem feasible, as each state is distinct. I wrote tests for the most common inputs, with one test per input, all inputs constant, except for the State.

[Subject("Application Process States")]
public class When_state_is_meeting2Requested : AppProcessBase
{
    Establish context = () =>
    {
        //Setup....
    };

    Because of = () => process.Load(jas, vac);

    It Current_node_should_be_meeting2Requested = () => process.CurrentNode.ShouldBeOfType<meetingRequestedNode>();
    It Can_move_to_clientDeclined = () => Check(process, process.clientDeclined);
    It Can_move_to_meeting1Arranged = () => Check(process, process.meeting1Arranged);
    It Can_move_to_meeting2Arranged = () => Check(process, process.meeting2Arranged);
    It Can_move_to_Reject = () => Check(process, process.Reject);
    It Cannot_move_to_any_other_state = () => AllOthersFalse(process);
}

No one is entirely sure what the output should be for each state and set of inputs. I have started to write tests for it. However, I'll need to write something like 4320 tests (30 * 2 * 2 * 2 * 3 * 3 * 2).

What suggestions do you have for testing state machines?


Edit: I am playing with all of the suggestions, and will mark an answer when I find one that works best.

回答1:

I see the problem, but I'd definitely try splitting the logic out.

The big problem area in my eyes is:

  • It has 31 possible states to be in.
  • It has the following inputs:
    • Enum: Current State (so 0 -> 30)
    • Enum: source (currently only 2 entries)
    • Boolean: Request
    • Boolean: type
    • Enum: Status (3 states)
    • Enum: Handling (3 states)
    • Boolean: Completed

There is just far too much going on. The input is making the code hard to test. You've said it would be painful to split this up into more manageable areas, but it's equally if not more painful to test this much logic in on go. In your case, each unit test covers far too much ground.

This question I asked about testing large methods is similar in nature, I found my units were simply too big. You'll still end up with many tests, but they'll be smaller and more manageable, covering less ground. This can only be a good thing though.

Testing Legacy Code

Check out Pex. You claim you inherited this code, so this is not actually Test-Driven-Development. You simply want unit tests to cover each aspect. This is a good thing, as any further work will be validated. I've personally not used Pex properly yet, however I was wowed by the video I saw. Essentially it will generate unit tests based on the input, which in this case would be the finite state machine itself. It will generate test cases you will not have enough thought of. Granted this is not TDD, but in this scenario, testing legacy code, it should be ideal.

Once you have your test coverage, you can begin refactoring, or adding new features with the safety of good test coverage to ensure you don't break any existing functionality.



回答2:

All-Pair Testing

To constraint the amount of combinations to test and to be reasonable assured you have most important combinations covered, you should take a look at all-pair testing.

the reasoning behind all-pairs testing is this: the simplest bugs in a program are generally triggered by a single input parameter. The next simplest category of bugs consists of those dependent on interactions between pairs of parameters, which can be caught with all-pairs testing.1 Bugs involving interactions between three or more parameters are progressively less common2, whilst at the same time being progressively more expensive to find by exhaustive testing, which has as its limit the exhaustive testing of all possible inputs.

Also take a look at a previous answer here (shameless plug) for additional information and links to both all-pair & pict as tool.

Example Pict model file

Given model generates 93 testcases, covering all pairs of input parameters.

#
# This is a PICT  model for testing a complex state machine at work 
#

CurrentState  :0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
Source        :1,2
Request       :True, False
Type          :True, False
Status        :State1, State2, State3
Handling      :State1, State2, State3
Completed     :True,False

#
# One can add constraints to the model to exclude impossible 
# combinations if needed.
#
# For example:
# IF [Completed]="True" THEN CurrentState>15;
#

#
# This is the PICT output of "pict ComplexStateMachine.pict /s /r1"
#
# Combinations:    515
# Generated tests: 93


回答3:

I can't think of any easy way to do test an FSM like this with out getting really pedantic and employing proofs, using machine learning techniques, or brute force.

Brute force: Write a something that will generate all the 4320 test cases in some declarative manner with mostly incorrect data. I would recommend putting this in a CSV file and then use something like NUnits parameteric testing to load all the test cases. Now most of these test cases will fail so you will have to update the declarative file manually to be correct and take just a sample of the test cases randomly to fix.

Machine Learning technique: You could employ some Vector machines or MDA algorithms/heuristics to try to learn on the sample you took from what we mentioned above and teach your ML program your FSM. Then run the algorithm on all the 4320 inputs and see where the two disagree.



回答4:

How many test do you think is needed to "completely" test function sum(int a, int b)? In c# it would be something like 18446744056529682436 tests... Much worse than in your case.

I would suggest following:

  1. Test most possible situations, boundary conditions.
  2. Test some critical parts of your SUT separately.
  3. Add test cases when bugs found by QA or in production.

In this particular case the best way is to test how system switches from one state to onother. Create DSL to test state machine and implement most frequent use cases using it. For Example:

Start
  .UploadDocument("name1")
  .SendDocumentOnReviewTo("user1")
  .Review()
  .RejectWithComment("Not enough details")
  .AssertIsCompleted()

The example of creating simple tests for flows is here: http://slmoloch.blogspot.com/2009/12/design-of-selenium-tests-for-aspnet_09.html



回答5:

Use SpecExplorer or NModel.



回答6:

I had constructed a finite state machine for a piece of medical equipment. The FSM was configurable through an XML format I had defined.

To define a state-machine, one has to rely on experience on digital circuit designs of using state maps,

You have to use what I term as a turnpike transition map. In the United States East Coast, most highways are nicknamed turnpikes. Turnpike authorities issue a turnpike toll pricing map. If a toll section had 50 exits, the pricing map would have a 50rows x 50cols table, listing the exits exhaustively as both rows and columns. To find out the toll charge for entering exit 20 and exiting exit 30, you simply look for the intersect of row 20 and column 30.

For a state machine of 30 states, the turnpike transition map would be a 30 x 30 matrix listing all the 30 possible states row and column wise. Let us decide the rows to be CURRENT states and columns to be NEXT states.

Each intersecting cell would list the "price" of transitioning from a CURRENT state(row) to a NEXT state(col). However instead of a single $ value, the cell would refer to a row in the Inputs table, which we could term as the transition id.

In the medical equipment FSM I developed, there were inputs that are String, enums, int, etc. The Inputs table listed these input stimulus column-wise.

To construct the Inputs table, you would write a simple routine to list all possible combinations of inputs. But the table would be huge. In your case, the table would have 4320 rows and hence 4320 transition ids. But its not a tedious table because you generated the table programmatically. In my case, I wrote a simple JSP to list the transitions input table (and the turnpike table) on browser or download as csv to be displayed in MS Excel.

There are two directions in constructing these two tables.

  1. the design direction, where you construct the turnpike table all possible transitions, graying out non-reachable transitions. Then consturct the Inputs table of all expected inputs for each reachable transition only, with the row number as transition id. Each transition id is transcribed onto the respective cell of the turnpike transition map. However, since the FSM is a sparse matrix, not all transition ids will be used in the cells of the turnpike transition map. Also, a transition id can be used many multiple times because the same transition conditions can apply to more than one pair of state change.

  2. the test direction is reverse, where you construct the Inputs table. You have to write a general routine for the exhaustive transition test.
    The routine would first read a transition sequencing table to bring the state-machine it to an entrypoint state to start a test cycle. At each CURRENT state, it is poised to run through all 4320 transition ids. On each row of CURRENT states in the Turnpike transition map, there would be a limited number of columns valid NEXT states.

You would want the routine to cycle thro all 4320 rows of inputs that it reads from the Inputs table, to ensure unused transition ids have no effect on a CURRENT state. You want to test that all effectual transition ids are valid transitions.

But you cannot - because once an effectual transition is pumped in, it would change the state of the machine into a NEXT state and prevent you from completing testing the rest of the transition ids on that previous CURRENT state. Once the machine changes state, you have to start testing from transition id 0 again.

Transition paths can be cyclical or irreversible or having combination of cyclical and irreversible sections along the path.

Within your test routine, you need a register for each state to memorise the last transition id pumped into that state. Everytime the test reaches an effectual transition id, that transition id is left in that register. So that when you complete a cycle and return to an already traversed state, you start iterating on next transition id greater than the one stored in the register.

Your routine would have to take care of the irreversible sections of a transition path, wheb a machine is brought to a final state, it restarts the test from the entry point state, reiterating the 4320 inputs from the next transition id greater than the one stored for a state. In this way, you would be able to exhaustively discover all the possible transition paths of the machine.

Fortunately, FSMs are sparse matrices of effectual transitions because exhaustive testing would not consume the complete combination of number of transition ids x number of possible states squared. However, the difficulty occurs if you are dealing with a legacy FSM where visual or temperature states cannot be fed back into the test system, where you have to monitor each state visually. That would be ugly, but still we spent two weeks additionally testing the equipment visually going through only the effectual transitions.

You may not need a transition sequencing table (for each entry point state for the test routine to read to bring the machine to a desired entrypoint) if your FSM allows you to reach an entrypoint with a simple reset and applying a transition id would simply it to an entrypoint state. But having your routine capable of reading a transition sequencing table is useful because frequently, you would need to go into the midst of the state network and start your testing from there.

You should acquaint yourself with the use of transition and state maps because it is very useful to detect all the possible and undocumented states of a machine and interview users if they actually wanted them grayed out (transitions made ineffectual and states made unreachable).

The advantage I had was that it was a new piece of equipment and I had the choice to design the state-machine controller to read xml files which means I could change the behaviour of the state machine anyway I wanted, actually anyway the customer wanted and I was able to assure that unused transition ids were really ineffectual.

For the java listing of the finite state machine controller http://code.google.com/p/synthfuljava/source/browse/#svn/trunk/xml/org/synthful. Test routines not included.



回答7:

Test based on the requirements. If a certain state is required to move to a certain other state whenever Completed is true, then write a test that automatically cycles through all the combinations of the other inputs (this should just be a couple for loops) to prove that the other inputs are correctly ignored. You should end up with one test for each transition arc, which I'd estimate would be somewhere on the order of 100 or 150 tests, not 4000.



回答8:

You might consider investigating Model Based Testing. There are a few tools available to help with test generation in situations like this. I usually recommend MBT.



回答9:

Brute force with coverage tests seems to be a very beginning.