Earley cannot handle epsilon-states already contai

Posted 2019-09-08 04:40

I have implemented the Earley parser using a queue to process states. The queue is seeded with the top-level rule. For each state in the queue, one of the operations (prediction, scanning, completion) is performed by adding new states to the queue. Duplicate states are not added.

The problem I am having is best described with the following grammar:

A -> B B; B -> epsilon

When parsing A, the following happens:

[image not preserved]

As you can tell, A will not be fully resolved. This is because the completion of the epsilon state happens only once: when the same state is produced a second time, it is treated as a duplicate and is not added to the queue again.

How can I adapt my algorithm to support these epsilon-states?

Edit: Note that this is not an issue when using terminals as a new chart set will be created to insert the scanned state. As the state does not exist there already, it will be processed.

1 answer
地球回转人心会变
#2 · 2019-09-08 04:59

The paper "Practical Earley Parsing" by John Aycock and R. Nigel Horspool contains, tucked away in section 4, a statement on how to handle nullable rules:

If [A → ... • B ..., j] is in S_i, add [B → • α, i]
to S_i for all rules B → α.
If B is nullable, also add [A → ... B • ..., j] to S_i.

So in your example, predicting from A → • B B would eventually produce the following states:

(1) B → •
(2) A → B • B
(3) A → B B •

The key is that this happens in the prediction phase. During prediction, if the symbol after the dot is nullable (either directly or transitively, through other nullable symbols), move the dot one position to the right and add that state as well.
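To apply that rule you first need to know which nonterminals are nullable, including the transitive cases. Here is a minimal sketch of the usual fixed-point computation. The grammar representation (a dict mapping each nonterminal to a list of right-hand-side tuples) and the name `nullable_symbols` are my own assumptions for illustration, not something from the paper or from your code.

```python
def nullable_symbols(grammar):
    """Return the set of nonterminals that can derive the empty string.

    Assumed representation: grammar maps each nonterminal to a list of
    right-hand sides, each a tuple of symbols; () stands for epsilon.
    """
    nullable = set()
    changed = True
    while changed:  # repeat until no new nullable symbol is found
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # lhs is nullable if some alternative consists only of
            # nullable symbols (the empty alternative qualifies trivially).
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable


# The grammar from the question: A -> B B; B -> epsilon
GRAMMAR = {"A": [("B", "B")], "B": [()]}
```

For the grammar in the question, B is nullable directly and A is nullable transitively, because both symbols on its right-hand side are nullable.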

So basically:

A → • B B produces (B → • and A → B • B) each being queued and processed
A → B • B produces (A → B B •) which is queued and processed
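Putting it together, a queue-based recognizer with the modified prediction step could look roughly like the sketch below. This is only an illustration under my own assumptions: states are `(lhs, rhs, dot, origin)` tuples, any symbol that is not a key of the grammar dict is treated as a terminal, and the names `nullable_set` and `recognize` are hypothetical. It only recognizes input; it does not build parse trees.

```python
from collections import deque


def nullable_set(grammar):
    """Fixed-point computation of the nullable nonterminals."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for lhs, alts in grammar.items():
            if lhs not in nullable and any(
                all(s in nullable for s in rhs) for rhs in alts
            ):
                nullable.add(lhs)
                changed = True
    return nullable


def recognize(grammar, start, tokens):
    """Return True if tokens can be derived from start."""
    nullable = nullable_set(grammar)
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        queue = deque(chart[i])
        while queue:
            lhs, rhs, dot, origin = queue.popleft()
            new_items = []
            if dot < len(rhs) and rhs[dot] in grammar:
                # Prediction: add every rule for the symbol after the dot.
                sym = rhs[dot]
                new_items += [(sym, alt, 0, i) for alt in grammar[sym]]
                # The fix: if that symbol is nullable, also advance the dot.
                if sym in nullable:
                    new_items.append((lhs, rhs, dot + 1, origin))
            elif dot < len(rhs):
                # Scanning: a matching terminal moves the state to the next set.
                if i < len(tokens) and tokens[i] == rhs[dot]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Completion: advance every state in the origin set
                # that was waiting for lhs.
                for p_lhs, p_rhs, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        new_items.append((p_lhs, p_rhs, p_dot + 1, p_origin))
            for item in new_items:
                if item not in chart[i]:  # duplicates are still skipped
                    chart[i].add(item)
                    queue.append(item)

    return any((start, rhs, len(rhs), 0) in chart[-1] for rhs in grammar[start])


# The grammar from the question: A -> B B; B -> epsilon
GRAMMAR = {"A": [("B", "B")], "B": [()]}
```

With this change, `recognize(GRAMMAR, "A", [])` accepts the empty input even though duplicate states are never re-queued, because the dot is advanced over a nullable symbol at prediction time instead of relying on a second completion.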
