L = {1i - 1j = 1i-j: i-j >= 0, i,j>=0}
I'm confused about how to construct a grammar that tracks subtraction of a string element. I have no clue on how to get started on this and attempted to work with an equivaent construction of the form
L = {1i = 1i-j + 1j}
Any hints or suggrstions are appreciated.
Whenever I see these kinds of problems, I try to think about the smallest strings in the language and then rules to get bigger strings. The smallest string in this language may be
- =
if we allowi - j, i, j = 0
; I might dispute whether this can be seriously thought of as a proper encoding of0 - 0 = 0
in unary, but I digress. The following derivation works equally well if you requirei - j, i, j > 0
and consider11 - 1 = 1
to be the shortest string in your language. How do we get longer strings in the language?First, note that we can add
1
to the beginning and end of the string and get another string in the language:- =
is a string, and so are1 - = 1
,11 - = 11
, ...,1^n - = 1^n
. This suggests that we will have rules likeS -> 1S1
andS -> T
.Second, note that the difference
1^i - 1^j
remains the same if we increasei
andj
by one each. So, if1^i - 1^j = 1^(i+j)
is in our language, then so is1^(i+1) - 1^(j+1) = 1^(i+j)
. Clearly, we need a rule that allows us to put some1
s in between the-
and the=
, since we don't have one already. This observation says that when we do that, we have to put one before the-
. We might guess a good rule isT -> 1T1 | -
.Considering our grammar so far, we have:
This gives us
1^s 1^t - 1^t 1^s
. This is actually pretty close, but we are missing=
. We see it needs to come afterT
, and in fact, we can just add it to the productionS -> T
to getThis gives us
1^s 1^t - 1^t = 1^s
. Sincei = s + t
,j = t
andi - j = s + t - t = s
as we have on the right of=
, it looks right in that the grammar should only generate strings in the language. But does it generate all strings in the language? We can proceed by mathematical induction on the number of1
s in the strings. Before we begin, note that the number of1
s must always be even, since the equation never holds if just one, or all three, of the strings of1
s have odd length, and those are the only cases where the total number of1
s could be odd.Base Case: there is only one string in
L
with no1
s, namely- =
, and it is derived byS -> T = -> - =
.Induction hypothesis: assume that all strings in
L
with no more thank
instances of1
are generated by the grammar.Induction step: we need to show any string with
k + 2
(remember, the number of1
s in any string inL
must be even by our earlier observation) instances of1
can be generated by the grammar. our string can be written1^a - 1^b = 1^c
wherea + b + c = k + 2
,a - b = c
c <= a
andb <= a
must be true. ifa = b
andc = 0
, then there is a string of lengthk
witha' = a - 1
andb' = b - 1
also inL
; by the induction hypothesis, it is generated by the grammar, and by inspection we see that we can insert another application ofT -> 1T1
in the derivation before concluding it withT -> -
in order to get our string. If, on the other hand,a > b
, then there is a shorter string inL
witha' = a - 1
andc' = c - 1
. By the induction hypothesis, this is generated by our grammar, and by inspection we see we can insert an extra application ofS -> 1S1
into the derivation beforeS -> T =
in order to get our string. In both cases the grammar generates the string of lengthk + 2
, so all strings inL
are generated by the grammar.Here's a tip: always try to think of context-free languages in terms of parenthetic balancing.
Consider the following two languages:
Most of us "see" the first language as a repetition (some a's followed by the same number of b's) and the second one as a nesting. But the context-free grammars for these two languages are identical:
Because the CFG is being parsed with a stack automaton, the nesting is natural. The first language is actually some number of a's followed by the same number of b's, reversed. Of course, some number of b's and some number of b's reversed look identical... until you draw the derivation tree.
So consider the unary addition language:
{ 1i+j = 1i + 1j }
.Clearly that's the same as
{ 1j+i = 1i + 1j }
(switching the order of addition makes no difference). Or, writing out the addition as a simple concatenation:{ 1j 1i = 1i + 1j }
. Now, we can just group the symmetry: (here the parentheses are metasymbols to show grouping, not part of the language):{ 1j ( ( 1i = 1i ) + ) 1j }
.Which leads to
In the CFG, the addition has been obscured a bit, but it's clear what is happening: we're allowing a sentence with the same number of 1s on both sides of an
=
, with a single+
being inserted anywhere on the right-hand side. (With a very simple change to the grammar, we could allow multiple+
signs, making out grammar accept addition equations with more than one addend.)