I'm creating a Brainfuck parser (in a BASIC dialect) ultimately to create an interpreter but i've realise it's not as straight forward as i first thought. My problem is that i need a way to accurately parse the matching loop operators within a Brainfuck program. This is an example program:
,>,>++++++++[<------<------>>-]
<<[>[>+>+<<-]>>[<<+>>-]<<<-]
>>>++++++[<++++++++>-],<.>.
'[' = start of loop
']' = end of loop
I need to record the start and end point of each matching loop operator so i can jump around the source as needed. Some loops are alone, some are nested.
What would be the best way to parse this? I was thinking maybe move through the source file creating a 2D array (or such like) recording the start and end positions of each matching operator, but this seems like a lot of 'to'ing and fro'ing' through the source. Is this the best way to do it?
More info: Brainfuck homepage
EDIT: Sample code in any language greatly appreciated.
Have you considered using a Stack data structure to record "jump points" (i.e. the location of the instruction pointer).
So basically, every time you encounter a "[" you push the current location of the instruction pointer on this stack. Whenever you encounter a "]" you reset the instruction pointer to the value that's currently on the top of the stack. When a loop is complete, you pop it off the stack.
Here is an example in C++ with 100 memory cells. The code handles nested loops recursively and although it is not refined it should illustrate the concepts..
EDIT I just went over the code again and I realized there was a bug in the while loop that would 'skip' parsed loops if the value of the pointer is 0. This is where I made the change:
Below is an implementation of the same parser but without using recursion:
The canonical method for parsing a context-free grammar is to use a stack. Anything else and you're working too hard and risking correctness.
You may want to use a parser generator like cup or yacc, as a lot of the dirty work is done for you, but with a language as simple as BF, it may be overkill.
And here's the same code I gave as an example earlier in C++, but ported to VB.NET. I decided to post it here since Gary mentioned he was trying to write his parser in a BASIC dialect.
This question is a bit old, but I wanted to say that the answers here helped me decide the route to take when writing my own Brainf**k interpreter. Here's the final product: