This question may sound cliched, but I am in a situation here.
I am trying to implement a finite state automaton to parse a certain string in C. As I started writing the code, I realised the code may be more readable if I used labels to mark the different states and use goto to jump from one state to another as the case comes.
Using the standard breaks and flag variables is quite cumbersome in this case and hard to keep track of the state.
What approach is better? More than anything else I am worried it may leave a bad impression on my boss, as I am on an internship.
I would recommend you the "Dragon book": Compilers, Principles-Techniques-Tools from Aho, Sethi and Ullman. (It is rather expensive to buy but you for sure will find it in a library). There you will find anything you will need to parse strings and build finite automatons. There is no place I could find with a
goto
. Usually the states are a data table and transitions are functions likeaccept_space()
There is nothing inherently wrong with
goto
. The reason they are often considered "taboo" is because of the way that some programmers (often coming from the assembly world) use them to create "spaghetti" code that is nearly impossible to understand. If you can usegoto
statements while keeping your code clean, readable, and bug-free, then more power to you.Using
goto
statements and a section of code for each state is definitely one way of writing a state machine. The other method is to create a variable that will hold the current state and to use a switch statement (or similar) to select which code block to execute based on the value of the state variable. See Aidan Cully's answer for a good template using this second method.In reality, the two methods are very similar. If you write a state machine using the state variable method and compile it, the generated assembly may very well resemble code written using the
goto
method (depending on your compiler's level of optimization). Thegoto
method can be seen as optimizing out the extra variable and loop from the state variable method. Which method you use is a matter of personal choice, and as long as you are producing working, readable code I would hope that your boss wouldn't think any different of you for using one method over the other.If you are adding this code to an existing code base which already contains state machines, I would recommend that you follow whichever convention is already in use.
Avoid
goto
unless the complexity added (to avoid) is more confusing.In practical engineering problems, there's room for goto used very sparingly. Academics and non-engineers wring their fingers needlessly over using
goto
. That said, if you paint yourself into an implementation corner where a lot ofgoto
is the only way out, rethink the solution.A correctly working solution is usually the primary objective. Making it correct and maintainable (by minimizing complexity) has many life cycle benefits. Make it work first, and then clean it up gradually, preferably by simplifying and removing ugliness.
I don't know your specific code, but is there a reason something like this:
wouldn't work for you? It doesn't use
goto
, and is relatively easy to follow.Edit: All those
State =
fragments violate DRY, so I might instead do something like:which is, of course, much longer than the first attempt (funny thing about DRY). But it's also more robust - failure to return the state from one of the state functions will result in a compiler warning, rather than silently ignore a missing
State =
in the earlier code.I can't see much of a difference between goto and switch. I might prefer switch/while because it gives you a place guaranteed to execute after the switch (where you could throw in logging and reason about your program). With GOTO you just keep jumping from label to label, so to throw in logging you'd have to put it at every label.
But aside from that there shouldn't be much difference. Either way, if you didn't break it up into functions and not every state uses/initializes all local variables you may end up with a mess of almost spaghetti code not knowing which states changed which variables and making it very difficult to debug/reason about.
As an aside, can you maybe parse the string using a regular expression? Most programming languages have libraries that allow using them. The regular expressions often create an FSM as part of their implementation. Generally regular expressions work for non arbitrarily nested items and for everything else there is a parser generator(ANTLR/YACC/LEX). It is generally much easier to maintain a grammar/regex than the underlying state machine. Also you said you were on an internship, and generally they might give you easier work than say a senior developer, so there is a strong chance that a regex may work on the string. Also regular expressions generally aren't emphasized in college so try using Google to read up on them.
Using a
goto
for implementing a state machine often makes good sense. If you're really concerned about using a goto, a reasonable alternative is often to have astate
variable that you modify, and aswitch
statement based on that: