I've been reading: "Tips for creating Context free grammar" post for learning purposes and I nearly understand the concept, but I don't quite understand the following.
If we have:
L = {am bn | m >= n}.
I understand this:
S --> B
B --> aBb
A --> aA
But what I don't understand is the concept of adding to the end of these particular values, such as:
S --> B | ^
B --> aBb | A
A --> aA | a
Why do we add ^ (null), A, and a to the end of these lines? What do they do and why do we need them?
All help is greatly appreciated.
You need them to be able to construct the strings that are in the language L
.
B --> aBb | A
means that if you have a nonterminal B
it can be replaced by either aBb
or A
. (Capital letters represent nonterminals and the lower case letters represent terminals).
Let's take a look at the grammar:
S --> B | ^
B --> aBb | A
A --> aA | a
Why do we have a | ^
? You need it to be able to generate the empty string. The empty string is clearly part of the language L
since it contain an equal amount of a
s and b
s.
Why do we have | A
? To be able to use the rule for A
. Now you can replace B
with A
so you can insert either aA
or a
. You need this rule to be able to generate strings that have more a
s than b
s.
Why do we have | a
? To be able to replace A
with a
without adding a new nonterminal that need to be replaced.
When I look at this grammar I would say that you need to change A --> aA | a
to A --> aA | a | ^
to be able to generate strings with an equal amount of a
s and b
s. (So you can replace A
with the empty string (or null) instead of having to add an extra a
)
Let's say you want to generate the string aaabb
:
S //You start with S
B //Rule: S --> B
aBb //Rule: B --> aBb
aaBbb //Rule: B --> aBb
aaAbb //Rule: B --> A
aaabb //Rule: A --> a
The need for the extra ^,A and a is to complete the grammar set.
Thus A --> aA takes care of [aa, aaa, aaaa, ...]
A --> aA | a takes care of [a, aa, aaa, aaaa, ...]
B --> aBb takes care of [ab, aabb, aaabbb, ...]
So, B --> aBb | A should take care of [a, ab, aa, aab, aaa, aaab, aabb, ...]
The only one missing is ^
So now complete the grammar with S --> B | ^ [^, a, aa, ab, aaa, aab, aaaa, aaab, aabb, ...]
As you can see, it is not possible to generate the 2nd half from the first half, but it is also a subset of the grammar. So, in order to complete the grammar, that 2nd half is crucial as well. Look at the wikipedia article for clearer examples.
http://en.wikipedia.org/wiki/Context-free_grammar