I followed the explanation given in the "Precedence climbing" section on this webpage to implement an arithmetic evaluator using the precedence climbing algorithm with various unary prefix and binary infix operators. I would also like to include ternary operators (namely the ternary conditional operator ?:
).
The algorithm given on the webpage uses the following grammar:
E --> Exp(0)
Exp(p) --> P {B Exp(q)}
P --> U Exp(q) | "(" E ")" | v
B --> "+" | "-" | "*" |"/" | "^" | "||" | "&&" | "="
U --> "-"
How can I incorporate ternary operators into this grammar?
To be specific, I'll use C/C++/Java's ?:
as an example.
It appears that in those languages the ?:
operator allows any valid expression between ?
and :
, including expressions formed with ?:
itself and those formed with operators, whose precedence is lower than the precedence of ?:
, e.g. =
and ,
(examples: a ? b = c : d
, a ? b , c : d
, a ? b ? c : d : e
).
This suggests that you should treat ?
and :
in exactly the same manner as (
and )
while parsing expressions. When you have parsed out ? expr :
, it, as a whole, is a binary operator. So, you parse ( expr )
and ? expr :
in the same way, but the former is an expression while the latter is a binary operator (with an expression embedded in it).
Now that ? expr :
is a binary operator (a ?-expr-: b
being no different than a * b
in terms of "binaryness"), you should be able to support it pretty much like any other binary operator that you already support.
Personally, I'd not take the trouble to split ?:
into separate binary operators of their own. In the end, it's still a ternary operator and it must be linked to 3 operands and considered as a whole during expression evaluation. If you are following the approach in the article that you mentioned in the question and are creating an AST, then there you go, ?:
has a left child node, a right child node (as any other binary operator) and, additionally, a middle child node.
The precedence of ?-expr-:
as a whole should be a low one. In C/C++ (and in Java?) it's not the lowest, though. It's up to you to decide what you want it to be.
So far we haven't covered the associativity of ?:
. In C/C++ and Java, ?-expr-:
is right-associative just like the assignment operator =
. Again, it's up to you to make it left-associative or keep it right-associative.
And this:
E --> P {B P}
P --> v | "(" E ")" | U P
B --> "+" | "-" | "*" | "/" | "^"
U --> "-"
should become something like this:
E --> P {B P}
P --> v | "(" E ")" | U P
B --> "+" | "-" | "*" | "/" | "^" | "?" E ":"
U --> "-"
I've come across this question searching for information on transforming ternary operators to Reverse Polish Notations (RPN), so while I do not have a solid implementation for implementing the ?:
operator with Precedence Climbing, I do think I may be able to give some clue to transforming the ?:
operator to RPN using two stacks ... which is similar to the Shunting Yard algorithm in the webpage you have given.
(Edit) I have to point out what I'm doing here is not very efficient (both branches of the ternary operator must be evaluated), but to demonstrate how easy it is to incorporate a new operator (?:
) in an existing framework (the RPN transformation code) (by declaring ?
and :
with two lowest precedence levels).
I want to start with some examples of how expressions with ?:
are transformed to RPN based on my parser...
I am adding two operators instead of just one, the ?
and :
, but I don't think it makes a big difference since :
and ?
will always be put together (It's just more convenient that the RPN and the original expression have the same number of tokens). In the examples you can see how it works.
Example 1: 1 + ((0+1) ? 2 : 3 + 8) + 4
RPN: 1
0
1
+
2
3
8
+
:
?
+
4
+
Now let's evaluate the RPN.
1 - Pushing first elements into the stack and we come across the first binary operator:
1
0
1
and operator +
, we add the top 2 elements, turning the stack into 1
1
2 - Then pushing three more elements and we come across the second binary operator:
1
1
2
3
8
+
------> 1
1
2
11
3 - Now we have :
and ?
. This actually tells us to select either the consequent branch (the 2nd topmost element on top of the stack, 2
) or the alternative branch (the topmost element on the stack, 11
), based on the predicate (the 3rd topmost element on the stack, 1
). Since the predicate is 1 (true), we choose 2 and discard 11. The parser pops the 3 elements (predicate/consequent/alternative) and pushes back the chosen one (in this case, the consequent branch), so the stack becomes
1
2
4 - Consume the remaining elements:
1
2
+
------> 3
3
4
+
------> 7
Example 2: 1 + ((0+0+0?0:0) ? 2 : (3 + 8)) + 4
RPN: 1
0
0
+
0
+
0
0
:
?
2
3
8
+
:
?
+
4
+
Evaluation:
Start:
1
0
0
+
0
+
0
0
:
?
2
3
8
+
:
?
+
4
+
After adding 0 to 0:
1
0
0
+
0
0
:
?
2
3
8
+
:
?
+
4
+
After adding 0 to 0:
1
0
0
0
:
?
2
3
8
+
:
?
+
4
+
After the first :?
chooses 0
:
1
0
2
3
8
+
:
?
+
4
+
After adding 3 and 8:
1
0
2
11
:
?
+
4
+
After the ?:
chooses 11:
1
11
+
4
+
After adding 1 and 11:
12
4
+
Finally:
16
This may suggest that it's possible to transform an expression with ?:
into reverse polish notation. As the webpage says RPN and AST are equivalent in that they could be transformed into and from each other, the ternary operator should be able to be implemented with Precedence Climbing in a similar fashion.
One thing that needs be done seems to be the "priority" (or precedence) of the ?:
operator. And I have actually come across it when attempting the RPN transform. I have given question marks and colons the lowest precedence:
As you can see from the example above, whenever we are about to execute ?:
, the precedence branch and the alternative branch and the predicate should have already been evaluated, resulting in a single number. This is guaranteed by precedence.
Following is the priority (precedence) table.
!
~
> *
/
%
> +
-
> &
> ^
> |
> &&
> ||
> ?
> :
The ?
and :
are of the least priority, meaning in expressions like 1?2+3:4+5, ?
and :
will never take the operands around them.
?
precedes :
in order to make :
appear before ?
in the RPN. In my understanding this is only a design choice because :
and ?
do not even necessarily have to split into 2 operators in the first place.
Hope this helps..
Reference: http://en.cppreference.com/w/cpp/language/operator_precedence
Define the colon to have lower precedence than the question mark. In other words, a ? b : c would be parsed as (a ? b) : c. This lets the parser construct an (if/then/empty-else) abstract syntax node, which would then be operated on by the ": operator" to provide the desired else component.
The precedence relationships also preserves the operator's composability. E.g., a ? b : c ? d : e parses as (a ? b) : (c ? d) : e, as one would expect.
Hope this helps.