How can I push the current execution state into a

2019-05-07 03:46发布

Imagine a simple grammar:

(a|ab)c

Which reads (a or ab) followed by c. The parse tree would look like this:

   and
   / \
  or  c
 /  \
a   ab

Now given it this input:

abc

We would traverse first down the left side of the tree, and match "a", then go back up a level. Since "a" matched, the "or" is also true, so move on to the "c". "c" does not match, and we hit the end of the road.

But there was an alternate path it could have taken; had we traversed down to "ab", we would have found a match.

So what I want to do for "or" nodes is essentially this:

  1. Evaluate left branch
  2. If left branch doesn't match, try right branch
  3. If left match does match, push current state on to stack so that we can continue from this point later if necessary

Then whenever the parser hits a dead end, I want to pop an item off the stack and continue from there again.

That's the part I can't figure out...how do I essentially save the current call stack? I can save the "ab" node in a stack so that I know I have to execute that one next, but then it still needs to know it needs to fall back up to the "or" afterwards.


I think Chris was on to something. We have to find a way to translate the tree such that it isn't necessary to jump across branches like that. For example, this equivalent parse tree doesn't have that problem:

     or
    /  \
 and   and
 / \    / \
a   c  ab  c

This time we parse down the left, hit "a", it passes, so we try the "c" node beside it, that fails, "and" fails, "or" has to try the right branch, ... "ab" passes, the other "c" passes, and then the whole expression passes.

4条回答
爱情/是我丢掉的垃圾
2楼-- · 2019-05-07 04:29

You have the answer to your question in the way you posed it.

You need to save the state. The tricky part is identifying the state. Saving it is easy.

Your problem is that the parser "has a state" when it starts parsing some grammar rule. (This gets messier if you use an LALR parser, which merges the parsing of many rules into a single state). That state consists of:

  • the state of the input (e.g., where is the input stream?).
  • the state of the parse stack (what is the left context seen so far?)
  • where the parser should continue for success, and where to continue for failure

When you are parsing and face a choice alternative as you have described, you need to "save the state", run a trial parse on the first term. If successful, you can throw away the saved state and continue. If failure, restore the state, and try the 2nd (and nth alternatives). (Its easier to be brainless and just the save state regardless of whether you face an alternative, but that's up to you).

How can you save the state? Push it into a stack. (You typically have a parse stack, that's a pretty convenient place! If you don't like that, add another stack but you'll discover it and the parse stack in general move synchronously; I just make the parse stack contain a record with all the stuff I need, including space for the input. And you'll find the "call stack" convenient for parts of the state; see below).

The first thing is to save the input location; that is likely a file source position and for optimizing reasons likely a buffer index. That's just a scalar so it is pretty easy to save. Restoring the input stream may be harder; there's no gaurantee that the parser input scanner is anywhere near where it was. So you need to reposition the file, re-read any buffer, and reposition any input buffer pointer. Some simple checks can make this statistically cheap: store the file position of the first character of any buffer; then deteimining if you have to re-read the buffer is a matter of comparing the saved file position with the buffer start file position. The rest should be obvious.

You'll backtrack through fewer buffers (e.g, your parser runs faster) if you rearrange your grammar to minimize that. In your specific grammar, you have "(a | ab ) c", which could be re-written by hand to "a b? c". The latter will at least not backtrack across whatever a represents.

The odd part is saving the parse stack. Well, you don't have to, because your trial parse is only going to extend the parse stack you have, and restore it to the parse state you have whether your subparse succeeds or fails.

"where the parser goes on fail" and "where it goes on success" are just two more scalars. You can represent them as indexes of your parsing code blocks, and program counters (e.g., continuations) or as a return address on your call stack (see? another parallel stack!) followed by a conditional test to success/failure.

If you want some details on the latter, check out my SO answer on hand-coded recursive descent parsers.

If you start building trees, or doing something else as a side effect of the parse, you'll have to figure how to capture/save the state of the side-effected entity, and restore it. But whatever it is, you'll end up pushing it on a stack.

查看更多
贪生不怕死
3楼-- · 2019-05-07 04:35

What you should do is is call a method for each possibility. If you hit a dead end, you can return, and you will be right back where you were, and ready to try the next option.

You can indicate whether you successfully parsed the branch by returning a value from the parsing method. For example, you could return true for success and false for failure. In this case, if the parse returns false, you try the next option.

查看更多
我命由我不由天
4楼-- · 2019-05-07 04:38

you need to use recursion.

Something like:

in or statement for each statement bool ret = eval(statement) if (ret) bool recVal = call the recursion if (recVal) than you find a path stop the recursion. else we continue in another loop and try next statement.

查看更多
5楼-- · 2019-05-07 04:48

Just return your state in addition to your result. Lets take a simple example where you can have an index for each element:

Grammer: (a|ab)c
Translated: AND(OR(a,ab),c)
Input: abc

Call AND
Call OR
a matches, return true,1
c does not match, start over
Call OR giving 1
ab matches, return true,2
c matches, return true

You will need a more complex structure to handle harder cases (whether it be a queue or stack is up to how you build and destruct in recreating your state)

查看更多
登录 后发表回答