Hey guys I'm having trouble with some bison code... After compiling this chunk of code, i get 58 shift/reduce and 40 reduce/reduce conflicts... Any tips on how to constrain them, or point me to a good guide on how to do it? Thanks in advance!! (Everything that starts with T_, like T_get are tokens defined above this code)
start:T_get G
|T_head G
|T_post G
;
G:url canon
;
url: T_http T_dslash T_host T_abs_path
;
canon:GH canon|RH canon|EH canon|MB
;
GH:T_con T_akt T_seq GH|T_date T_akt D GH|T_trans T GH|
;
D:T_day T_slash T_month T_slash T_year T_hour T_akt T_min
;
T:T_chunked |T_gzip |T_deflate
;
RH:T_acc T_akt T_seq RH|T_ref T_akt T_seq RH|T_user T_akt T_seq RH|
;
EH:T_content T_akt T_seq EH|T_exp T_akt T_seq EH|
;
MB:T_seq|
;
%%
int main(){
yyparse();
}
Even if you dont have time to study the logic in the code, i would appreciate a general help in these conflicts,like how they are generated etc
Use bison's -v
option to get a verbose output giving all the states and conflicts in the generated parser. If you do that you'll see a state something like:
state 7
4 G: url . canon
T_acc shift, and go to state 12
:
T_trans shift, and go to state 19
T_user shift, and go to state 20
$end reduce using rule 13 (GH)
$end [reduce using rule 21 (RH)]
$end [reduce using rule 24 (EH)]
$end [reduce using rule 26 (MB)]
T_acc [reduce using rule 13 (GH)]
:
This is telling you that when the parser has seen a url
and is looking to recognize a canon
, it can't tell what to do -- should it recognize a empty GH
, RH
, EH
, or MB
, or should it shift a token to recognize a non-empty one of those?
These conflicts all come from the basic ambiguity in your grammar -- you have a canon
rule that consists of 0 or more repeats of GH
, RH
, and/or EH
, and each of those rules consist of 0 or more repeats of something. So it has no way of telling how many empty things to insert into the parse tree (as they consume no input, an arbitrary number can be added), and it has no idea whether to group similar things into a single GH
(or RH
or EH
) or into multiple distinct ones.
So to fix this, you need to decide what you want. Most probably, you don't care about the grouping of GH
/RH
/EH
structures, nor do you care about empty ones, so you should just get rid of the recursion for those rules:
GH:T_con T_akt T_seq|T_date T_akt D|T_trans T;
RH:T_acc T_akt T_seq|T_ref T_akt T_seq|T_user T_akt T_seq;
EH:T_content T_akt T_seq|T_exp T_akt T_seq;
Now each of those rules will match one construct, and if you have multiple, they'll be grouped together by the canon
rule. This fixes all the conflicts you have, but still leaves a potential problem -- your canon
rule is right recursive, so will suck the entire input onto the stack before reducing any rules (since for right-recursion, it reduces right-to-left). You probably want to make the rule left recursive instead -- the general rule in bison is to always use left recursion and not right recursion, unless you need right recursion for some reason. This gives you a grammar:
start : T_get G | T_head G | T_post G ;
G : url canon MB ;
url : T_http T_dslash T_host T_abs_path ;
canon : canon GH | canon RH | canon EH | ;
GH : T_con T_akt T_seq | T_date T_akt D | T_trans T ;
D : T_day T_slash T_month T_slash T_year T_hour T_akt T_min ;
T : T_chunked | T_gzip | T_deflate ;
RH : T_acc T_akt T_seq | T_ref T_akt T_seq | T_user T_akt T_seq ;
EH : T_content T_akt T_seq | T_exp T_akt T_seq ;
MB : T_seq | ;