How to understand and fix conflicts in PLY

Posted 2019-05-11 12:28

I am working on a SystemVerilog parser and I am running into many PLY conflicts (both shift/reduce and reduce/reduce).

I currently have more than 170 conflicts, and my problem is that I don't really understand the parser.out file generated by PLY. Without understanding it there is little I can do, so my goal is to understand what PLY is reporting. The PLY documentation is brief and not very explanatory.

Here is one of my states, apparently the first one where a conflict is found:

state 24

(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN

  ! shift/reduce conflict for LPAREN resolved as shift
    PLUS            reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    MINUS           reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    EXCLAMATION     reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    NEG             reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    AMPERSAND       reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    NEGAMPERSAND    reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    PIPE            reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    NEGPIPE         reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    CARET           reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    NEGCARET        reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    UNBASED_UNSIZED_LITERAL reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    STRING_LITERAL  reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    REAL_FLOATINGP_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    REAL_FIXEDP_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    INT_HEX_NUMBER  reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    INT_BINARY_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    INT_OCTAL_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    INT_DECIMAL_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    UNSIGNED_NUMBER reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    DOUBLEPLUS      reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    DOUBLEMINUS     reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    AT              reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    TAGGED          reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    INOUT           reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    INPUT           reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    OUTPUT          reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    REF             reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    ID              reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    ESCAPED_ID      reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    MODULE          reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    MACROMODULE     reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .)
    LPAREN          shift and go to state 21

  ! LPAREN          [ reduce using rule 134 (attribute_instance_optional_list -> attribute_instance_list .) ]

    attribute_instance             shift and go to state 49

As far as I understand PLY, grammar rules are processed and states are built. Each of those states makes decisions based on the incoming tokens. So in the state I posted (state 24), for example, if a PLUS token were waiting to be shifted onto the stack, PLY would go ahead and "reduce using rule 134". One thing I don't understand is what PLY does next. Does it stay in the same state (24)? Is it only when an "attribute_instance" is waiting to be shifted that PLY actually changes state and goes to state 49?

Another question: what do the parsing "snapshots" listed at the beginning of the state mean?

(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN

Does PLY compute all the possible stack states under which state 24 could be reached? Is that even possible?

In case it is of any use, here you can see my grammar's rules:

Grammar

Rule 0     S' -> source_text
Rule 1     source_text -> timeunits_declaration description_list
Rule 2     timeunits_declaration -> timeunit_and_precision
Rule 3     timeunits_declaration -> timeunit
Rule 4     timeunits_declaration -> timeprecision
Rule 5     timeunits_declaration -> timeunit timeprecision
Rule 6     timeunits_declaration -> timeprecision timeunit
Rule 7     timeunits_declaration -> empty
Rule 8     timeunit_and_precision -> TIMEUNIT time_literal SLASH time_literal SEMICOLON
Rule 9     timeunit -> TIMEUNIT time_literal SEMICOLON
Rule 10    timeprecision -> TIMEPRECISION time_literal SEMICOLON
Rule 11    time_literal -> UNSIGNED_NUMBER time_unit
Rule 12    time_literal -> REAL_FIXEDP_NUMBER time_unit
Rule 13    time_unit -> S
Rule 14    time_unit -> MS
Rule 15    time_unit -> US
Rule 16    time_unit -> NS
Rule 17    time_unit -> PS
Rule 18    time_unit -> FS
Rule 19    description_list -> description_list description
Rule 20    description_list -> description
Rule 21    description -> module_declaration
Rule 22    module_declaration -> module_nonansi_header timeunits_declaration module_item_list module_footer
Rule 23    module_declaration -> module_ansi_header timeunits_declaration non_port_module_item_list module_footer
Rule 24    module_declaration -> module_implicit_header timeunits_declaration module_item module_footer
Rule 25    module_declaration -> EXTERN module_nonansi_header
Rule 26    module_declaration -> EXTERN module_ansi_header
Rule 27    module_nonansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_ports SEMICOLON
Rule 28    module_ansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_port_declarations_list SEMICOLON
Rule 29    module_implicit_header -> attribute_instance_optional_list module_keyword lifetime module_identifier LPAREN DOT ASTERISK RPAREN SEMICOLON
Rule 30    module_keyword -> MODULE
Rule 31    module_keyword -> MACROMODULE
Rule 32    module_footer -> ENDMODULE COLON module_identifier
Rule 33    module_footer -> ENDMODULE
Rule 34    module_item -> port_declaration SEMICOLON
Rule 35    module_item -> non_port_module_item
Rule 36    port_declaration -> attribute_instance_optional_list inout_declaration
Rule 37    port_declaration -> attribute_instance_optional_list input_declaration
Rule 38    port_declaration -> attribute_instance_optional_list output_declaration
Rule 39    port_declaration -> attribute_instance_optional_list ref_declaration
Rule 40    port_declaration -> attribute_instance_optional_list interface_port_declaration
Rule 41    inout_declaration -> INOUT net_port_type list_of_port_identifiers
Rule 42    input_declaration -> INPUT net_port_type list_of_port_identifiers
Rule 43    input_declaration -> INPUT variable_port_type list_of_variable_identifiers
Rule 44    output_declaration -> OUTPUT net_port_type list_of_port_identifiers
Rule 45    interface_port_declaration -> interface_identifier list_of_interface_identifiers
Rule 46    interface_port_declaration -> interface_identifier DOT modport_identifier list_of_interface_identifiers
Rule 47    ref_declaration -> REF variable_port_type list_of_variable_identifiers
Rule 48    casting_type -> simple_type
Rule 49    casting_type -> constant_primary
Rule 50    casting_type -> signing
Rule 51    casting_type -> STRING
Rule 52    casting_type -> CONST
Rule 53    data_type -> integer_vector_type optional_signing optional_packed_dimension
Rule 54    data_type -> integer_atom_type optional_signing
Rule 55    data_type -> non_integer_type
Rule 56    data_type -> struct_union LBRACE struct_union_member_list RBRACE optional_packed_dimension_list
Rule 57    data_type -> ENUM LBRACE optional_enum_name_declaration_list RBRACE optional_packed_dimension_list
Rule 58    data_type -> ENUM enum_base_type LBRACE optional_enum_name_declaration_list RBRACE optional_packed_dimension_list
Rule 59    data_type -> STRING
Rule 60    data_type -> CHANDLE
Rule 61    data_type -> VIRTUAL interface_identifier optional_parameter_value_assignment optional_modport_identifier
Rule 62    data_type -> VIRTUAL INTERFACE interface_identifier optional_parameter_value_assignment optional_modport_identifier
Rule 63    data_type -> type_identifier optional_packed_dimension_list
Rule 64    data_type -> class_scope type_identifier optional_packed_dimension_list
Rule 65    data_type -> package_scope type_identifier optional_packed_dimension_list
Rule 66    data_type -> class_type
Rule 67    data_type -> EVENT
Rule 68    data_type -> ps_covergroup_identifier
Rule 69    data_type -> type_reference
Rule 70    data_type_or_implicit -> data_type
Rule 71    data_type_or_implicit -> implicit_data_type
Rule 72    implicit_data_type -> optional_signing optional_packed_dimension_list
Rule 73    enum_base_type -> integer_atom_type optional_signing
Rule 74    enum_base_type -> integer_vector_type optional_signing optional_packed_dimension
Rule 75    enum_base_type -> type_identifier optional_packed_dimension
Rule 76    enum_name_declaration -> enum_identifier optional_enum_identifier_pointer
Rule 77    enum_name_declaration -> enum_identifier optional_enum_identifier_pointer EQUALS constant_expression
Rule 78    optional_enum_identifier_pointer -> LBRACKET integral_number RBRACKET
Rule 79    optional_enum_identifier_pointer -> LBRACKET integral_number COLON integral_number RBRACKET
Rule 80    optional_enum_identifier_pointer -> empty
Rule 81    class_scope -> class_type DOUBLECOLON
Rule 82    class_type -> ps_class_identifier optional_parameter_value_assignment
Rule 83    class_type -> ps_class_identifier optional_parameter_value_assignment parametrized_class_list
Rule 84    parametrized_class_list -> parametrized_class_list DOUBLECOLON class_identifier optional_parameter_value_assignment
Rule 85    parametrized_class_list -> DOUBLECOLON class_identifier optional_parameter_value_assignment
Rule 86    integer_type -> integer_vector_type
Rule 87    integer_type -> integer_atom_type
Rule 88    integer_atom_type -> BYTE
Rule 89    integer_atom_type -> SHORTINT
Rule 90    integer_atom_type -> INT
Rule 91    integer_atom_type -> LONGINT
Rule 92    integer_atom_type -> INTEGER
Rule 93    integer_atom_type -> TIME
Rule 94    integer_vector_type -> BIT
Rule 95    integer_vector_type -> LOGIC
Rule 96    integer_vector_type -> REG
Rule 97    non_integer_type -> SHORTREAL
Rule 98    non_integer_type -> REAL
Rule 99    non_integer_type -> REALTIME
Rule 100   net_type -> SUPPLY0
Rule 101   net_type -> SUPPLY1
Rule 102   net_type -> TRI
Rule 103   net_type -> TRIAND
Rule 104   net_type -> TRIOR
Rule 105   net_type -> TRIREG
Rule 106   net_type -> TRI0
Rule 107   net_type -> TRI1
Rule 108   net_type -> UWIRE
Rule 109   net_type -> WIRE
Rule 110   net_type -> WAND
Rule 111   net_type -> WOR
Rule 112   net_port_type -> data_type_or_implicit
Rule 113   net_port_type -> net_type data_type_or_implicit
Rule 114   net_port_type -> net_type_identifier
Rule 115   net_port_type -> INTERCONNECT implicit_data_type
Rule 116   variable_port_type -> var_data_type
Rule 117   var_data_type -> data_type
Rule 118   var_data_type -> VAR data_type_or_implicit
Rule 119   signing -> SIGNED
Rule 120   signing -> UNSIGNED
Rule 121   simple_type -> integer_type
Rule 122   simple_type -> non_integer_type
Rule 123   simple_type -> ps_type_identifier
Rule 124   simple_type -> ps_parameter_identifier
Rule 125   struct_union_member -> attribute_instance_optional_list data_type_or_void list_of_variable_decl_assignments
Rule 126   struct_union_member -> attribute_instance_optional_list random_qualifier data_type_or_void list_of_variable_decl_assignments
Rule 127   data_type_or_void -> data_type
Rule 128   data_type_or_void -> VOID
Rule 129   struct_union -> STRUCT
Rule 130   struct_union -> UNION
Rule 131   struct_union -> UNION TAGGED
Rule 132   type_reference -> TYPE LPAREN expression RPAREN
Rule 133   type_reference -> TYPE LPAREN data_type RPAREN
Rule 134   attribute_instance_optional_list -> attribute_instance_list
Rule 135   attribute_instance_optional_list -> empty
Rule 136   attribute_instance_list -> attribute_instance_list attribute_instance
Rule 137   attribute_instance_list -> attribute_instance
Rule 138   attribute_instance -> LPAREN ASTERISK attr_spec_list ASTERISK RPAREN
Rule 139   attr_spec_list -> attr_spec_list COMMA attr_spec
Rule 140   attr_spec_list -> attr_spec
Rule 141   attr_spec -> attr_name
Rule 142   attr_spec -> attr_name EQUALS constant_expression
Rule 143   attr_name -> identifier
Rule 144   inc_or_dec_expression -> inc_or_dec_operator attribute_instance_optional_list variable_lvalue
Rule 145   inc_or_dec_expression -> variable_lvalue attribute_instance_optional_list inc_or_dec_operator
Rule 146   conditional_expression -> cond_predicate INTERROGATION attribute_instance_optional_list expression COLON expression
Rule 147   constant_expression -> constant_primary
Rule 148   constant_expression -> unary_operator attribute_instance_optional_list constant_primary
Rule 149   constant_expression -> constant_expression binary_operator attribute_instance_optional_list constant_expression
Rule 150   constant_expression -> constant_expression INTERROGATION attribute_instance_optional_list constant_expression COLON constant_expression
Rule 151   constant_mintypmax_expression -> constant_expression
Rule 152   constant_mintypmax_expression -> constant_expression COLON constant_expression COLON constant_expression
Rule 153   constant_param_expression -> constant_mintypmax_expression
Rule 154   constant_param_expression -> data_type
Rule 155   constant_param_expression -> DOLLAR
Rule 156   param_expression -> mintypmax_expression
Rule 157   param_expression -> data_type
Rule 158   param_expression -> DOLLAR
Rule 159   constant_range_expression -> constant_expression
Rule 160   constant_range_expression -> constant_part_select_range
Rule 161   constant_part_select_range -> constant_range
Rule 162   constant_part_select_range -> constant_indexed_range
Rule 163   constant_range -> constant_expression COLON constant_expression
Rule 164   constant_indexed_range -> constant_expression PLUSCOLON constant_expression
Rule 165   constant_indexed_range -> constant_expression MINUSCOLON constant_expression
Rule 166   expression -> primary
Rule 167   expression -> unary_operator attribute_instance_optional_list primary
Rule 168   expression -> inc_or_dec_expression
Rule 169   expression -> LPAREN operator_assignment RPAREN
Rule 170   expression -> expression binary_operator attribute_instance_optional_list expression
Rule 171   expression -> conditional_expression
Rule 172   expression -> inside_expression
Rule 173   expression -> tagged_union_expression
Rule 174   tagged_union_expression -> TAGGED member_identifier
Rule 175   tagged_union_expression -> TAGGED member_identifier expression
Rule 176   inside_expression -> expression INSIDE LBRACE open_range_list RBRACE
Rule 177   value_range -> expression
Rule 178   value_range -> LBRACKET expression COLON expression RBRACKET
Rule 179   mintypmax_expression -> expression
Rule 180   mintypmax_expression -> expression COLON expression COLON expression
Rule 181   module_path_conditional_expression -> module_path_expression INTERROGATION attribute_instance_optional_list module_path_expression COLON module_path_expression
Rule 182   module_path_expression -> module_path_primary
Rule 183   module_path_expression -> unary_module_path_operator attribute_instance_optional_list module_path_primary
Rule 184   module_path_expression -> module_path_expression binary_module_path_operator attribute_instance_optional_list module_path_expression
Rule 185   module_path_expression -> module_path_conditional_expression
Rule 186   module_path_mintypmax_expression -> module_path_expression
Rule 187   module_path_mintypmax_expression -> module_path_expression COLON module_path_expression COLON module_path_expression
Rule 188   part_select_range -> constant_range
Rule 189   part_select_range -> indexed_range
Rule 190   indexed_range -> expression PLUSCOLON constant_expression
Rule 191   indexed_range -> expression MINUSCOLON constant_expression
Rule 192   genvar_expression -> constant_expression
Rule 193   constant_primary -> primary_literal
Rule 194   primary_literal -> number
Rule 195   primary_literal -> time_literal
Rule 196   primary_literal -> UNBASED_UNSIZED_LITERAL
Rule 197   primary_literal -> STRING_LITERAL
Rule 198   number -> REAL_FLOATINGP_NUMBER
Rule 199   number -> REAL_FIXEDP_NUMBER
Rule 200   number -> INT_HEX_NUMBER
Rule 201   number -> INT_BINARY_NUMBER
Rule 202   number -> INT_OCTAL_NUMBER
Rule 203   number -> INT_DECIMAL_NUMBER
Rule 204   number -> UNSIGNED_NUMBER
Rule 205   unary_operator -> PLUS
Rule 206   unary_operator -> MINUS
Rule 207   unary_operator -> EXCLAMATION
Rule 208   unary_operator -> NEG
Rule 209   unary_operator -> AMPERSAND
Rule 210   unary_operator -> NEGAMPERSAND
Rule 211   unary_operator -> PIPE
Rule 212   unary_operator -> NEGPIPE
Rule 213   unary_operator -> CARET
Rule 214   unary_operator -> NEGCARET
Rule 215   binary_operator -> PLUS
Rule 216   binary_operator -> MINUS
Rule 217   binary_operator -> ASTERISK
Rule 218   binary_operator -> SLASH
Rule 219   binary_operator -> PERCENT
Rule 220   binary_operator -> ISEQUAL
Rule 221   binary_operator -> NISEQUAL
Rule 222   binary_operator -> CISEQUAL
Rule 223   binary_operator -> NCISEQUAL
Rule 224   binary_operator -> WISEQUAL
Rule 225   binary_operator -> NWISEQUAL
Rule 226   binary_operator -> DOUBLEAMPERSAND
Rule 227   binary_operator -> DOUBLEPIPE
Rule 228   binary_operator -> DOUBLEASTERISK
Rule 229   binary_operator -> LT
Rule 230   binary_operator -> LE
Rule 231   binary_operator -> GT
Rule 232   binary_operator -> GE
Rule 233   binary_operator -> AMPERSAND
Rule 234   binary_operator -> PIPE
Rule 235   binary_operator -> CARET
Rule 236   binary_operator -> NEGCARET
Rule 237   binary_operator -> RSHIFT
Rule 238   binary_operator -> LSHIFT
Rule 239   binary_operator -> ARSHIFT
Rule 240   binary_operator -> ALSHIFT
Rule 241   binary_operator -> IMPLICATION
Rule 242   binary_operator -> EQUIVALENCE
Rule 243   inc_or_dec_operator -> DOUBLEPLUS
Rule 244   inc_or_dec_operator -> DOUBLEMINUS
Rule 245   unary_module_path_operator -> EXCLAMATION
Rule 246   unary_module_path_operator -> NEG
Rule 247   unary_module_path_operator -> AMPERSAND
Rule 248   unary_module_path_operator -> NEGAMPERSAND
Rule 249   unary_module_path_operator -> PIPE
Rule 250   unary_module_path_operator -> NEGPIPE
Rule 251   unary_module_path_operator -> CARET
Rule 252   unary_module_path_operator -> NEGCARET
Rule 253   binary_module_path_operator -> ISEQUAL
Rule 254   binary_module_path_operator -> NISEQUAL
Rule 255   binary_module_path_operator -> DOUBLEAMPERSAND
Rule 256   binary_module_path_operator -> DOUBLEPIPE
Rule 257   binary_module_path_operator -> AMPERSAND
Rule 258   binary_module_path_operator -> PIPE
Rule 259   binary_module_path_operator -> CARET
Rule 260   binary_module_path_operator -> NEGCARET
Rule 261   array_identifier -> identifier
Rule 262   block_identifier -> identifier
Rule 263   bin_identifier -> identifier
Rule 264   c_identifier -> C_ID
Rule 265   cell_identifier -> identifier
Rule 266   checker_identifier -> identifier
Rule 267   class_identifier -> identifier
Rule 268   class_variable_identifier -> variable_identifier
Rule 269   clocking_identifier -> identifier
Rule 270   config_identifier -> identifier
Rule 271   const_identifier -> identifier
Rule 272   constraint_identifier -> identifier
Rule 273   covergroup_identifier -> identifier
Rule 274   covergroup_variable_identifier -> variable_identifier
Rule 275   cover_point_identifier -> identifier
Rule 276   cross_identifier -> identifier
Rule 277   dynamic_array_variable_identifier -> variable_identifier
Rule 278   enum_identifier -> identifier
Rule 279   escaped_identifier -> ESCAPED_ID
Rule 280   formal_identifier -> identifier
Rule 281   formal_port_identifier -> identifier
Rule 282   function_identifier -> identifier
Rule 283   generate_block_identifier -> identifier
Rule 284   genvar_identifier -> identifier
Rule 285   hierarchical_array_identifier -> hierarchical_identifier
Rule 286   hierarchical_block_identifier -> hierarchical_identifier
Rule 287   hierarchical_event_identifier -> hierarchical_identifier
Rule 288   hierarchical_identifier -> optional_identifier_constant_bit_select_list identifier
Rule 289   hierarchical_identifier -> DOLLAR ROOT DOT optional_identifier_constant_bit_select_list identifier
Rule 290   hierarchical_net_identifier -> hierarchical_identifier
Rule 291   hierarchical_parameter_identifier -> hierarchical_identifier
Rule 292   hierarchical_property_identifier -> hierarchical_identifier
Rule 293   hierarchical_sequence_identifier -> hierarchical_identifier
Rule 294   hierarchical_task_identifier -> hierarchical_identifier
Rule 295   hierarchical_tf_identifier -> hierarchical_identifier
Rule 296   hierarchical_variable_identifier -> hierarchical_identifier
Rule 297   identifier -> simple_identifier
Rule 298   identifier -> escaped_identifier
Rule 299   index_variable_identifier -> identifier
Rule 300   interface_identifier -> identifier
Rule 301   interface_instance_identifier -> identifier
Rule 302   inout_port_identifier -> identifier
Rule 303   input_port_identifier -> identifier
Rule 304   instance_identifier -> identifier
Rule 305   library_identifier -> identifier
Rule 306   member_identifier -> identifier
Rule 307   method_identifier -> identifier
Rule 308   modport_identifier -> identifier
Rule 309   module_identifier -> identifier
Rule 310   net_identifier -> identifier
Rule 311   net_type_identifier -> identifier
Rule 312   output_port_identifier -> identifier
Rule 313   package_identifier -> identifier
Rule 314   package_scope -> package_identifier DOUBLECOLON
Rule 315   package_scope -> DOLLAR UNIT DOUBLECOLON
Rule 316   optional_package_scope -> package_scope
Rule 317   optional_package_scope -> empty
Rule 318   parameter_identifier -> identifier
Rule 319   port_identifier -> identifier
Rule 320   production_identifier -> identifier
Rule 321   program_identifier -> identifier
Rule 322   property_identifier -> identifier
Rule 323   ps_class_identifier -> optional_package_scope class_identifier
Rule 324   ps_covergroup_identifier -> optional_package_scope covergroup_identifier
Rule 325   ps_checker_identifier -> optional_package_scope checker_identifier
Rule 326   ps_identifier -> optional_package_scope identifier
Rule 327   ps_or_hierarchical_array_identifier -> optional_package_scope hierarchical_array_identifier
Rule 328   ps_or_hierarchical_array_identifier -> implicit_class_handle DOT hierarchical_array_identifier
Rule 329   ps_or_hierarchical_array_identifier -> class_scope hierarchical_array_identifier
Rule 330   ps_or_hierarchical_net_identifier -> optional_package_scope net_identifier
Rule 331   ps_or_hierarchical_net_identifier -> hierarchical_net_identifier
Rule 332   ps_or_hierarchical_property_identifier -> optionnal_package_scope property_identifier
Rule 333   ps_or_hierarchical_property_identifier -> hierarchical_property_identifier
Rule 334   ps_or_hierarchical_sequence_identifier -> optional_package_scope sequence_identifier
Rule 335   ps_or_hierarchical_sequence_identifier -> hierarchical_sequence_identifier
Rule 336   ps_or_hierarchical_tf_identifier -> optional_package_scope tf_identifier
Rule 337   ps_or_hierarchical_tf_identifier -> hierarchical_tf_identifier
Rule 338   ps_parameter_identifier -> optional_package_scope parameter_identifier
Rule 339   ps_parameter_identifier -> class_scope parameter_identifier
Rule 340   ps_parameter_identifier -> ps_parameter_identifier_generate_list parameter_identifier
Rule 341   ps_parameter_identifier_generate_list -> ps_parameter_identifier_generate_list DOT ps_parameter_identifier_generate
Rule 342   ps_parameter_identifier_generate_list -> ps_parameter_identifier_generate
Rule 343   ps_parameter_identifier_generate -> generate_block_identifier LBRACKET constant_expression RBRACKET
Rule 344   ps_parameter_identifier_generate -> generate_block_identifier
Rule 345   ps_type_identifier -> type_identifier
Rule 346   ps_type_identifier -> LOCAL DOUBLECOLON type_identifier
Rule 347   ps_type_identifier -> package_scope type_identifier
Rule 348   sequence_identifier -> identifier
Rule 349   signal_identifier -> identifier
Rule 350   simple_identifier -> ID
Rule 351   specparam_identifier -> identifier
Rule 352   system_tf_identifier -> DOLLAR ID
Rule 353   task_identifier -> identifier
Rule 354   tf_identifier -> identifier
Rule 355   terminal_identifier -> identifier
Rule 356   topmodule_identifier -> identifier
Rule 357   type_identifier -> identifier
Rule 358   udp_identifier -> identifier
Rule 359   variable_identifier -> identifier
Rule 360   cond_predicate -> AT
Rule 361   implicit_class_handle -> AT
Rule 362   integral_number -> AT
Rule 363   lifetime -> AT
Rule 364   list_of_interface_identifiers -> AT
Rule 365   list_of_port_declarations_list -> AT
Rule 366   list_of_port_identifiers -> AT
Rule 367   list_of_ports -> AT
Rule 368   list_of_variable_decl_assignments -> AT
Rule 369   list_of_variable_identifiers -> AT
Rule 370   module_item_list -> AT
Rule 371   module_path_primary -> AT
Rule 372   non_port_module_item -> AT
Rule 373   non_port_module_item_list -> AT
Rule 374   open_range_list -> AT
Rule 375   operator_assignment -> AT
Rule 376   optional_enum_name_declaration_list -> AT
Rule 377   optional_identifier_constant_bit_select_list -> AT
Rule 378   optional_modport_identifier -> AT
Rule 379   optional_packed_dimension -> AT
Rule 380   optional_packed_dimension_list -> AT
Rule 381   optional_parameter_value_assignment -> AT
Rule 382   optional_signing -> AT
Rule 383   optionnal_package_scope -> AT
Rule 384   package_import_declaration_list -> AT
Rule 385   parameter_port_list -> AT
Rule 386   primary -> AT
Rule 387   random_qualifier -> AT
Rule 388   struct_union_member_list -> AT
Rule 389   variable_lvalue -> AT
Rule 390   empty -> <empty>

1 answer
再贱就再见
Reply #2 · 2019-05-11 13:08

In LR parsing, we often talk about "items": an item is a production with a progress marker, usually written with a • but sometimes with a simple "." (as PLY does). A state is just a collection of items; in effect, the state tells you the set of productions the parse might be inside.
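As a rough illustration of this idea, an item can be modelled as a production plus a dot position, and a state as a set of such items. This is only a sketch: the `Item` class and its method names are made up for this example and are not part of PLY.

```python
from typing import NamedTuple, Optional, Tuple

class Item(NamedTuple):
    """A production with a progress marker (hypothetical helper, not PLY API)."""
    lhs: str
    rhs: Tuple[str, ...]
    dot: int  # index in rhs of the symbol that follows the marker

    def next_symbol(self) -> Optional[str]:
        """Symbol right after the dot, or None if the dot is at the end."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")
        return f"{self.lhs} -> {' '.join(symbols)}"

# The three items of state 24 from the question.
item_134 = Item("attribute_instance_optional_list", ("attribute_instance_list",), 1)
item_136 = Item("attribute_instance_list",
                ("attribute_instance_list", "attribute_instance"), 1)
item_138 = Item("attribute_instance",
                ("LPAREN", "ASTERISK", "attr_spec_list", "ASTERISK", "RPAREN"), 0)

state_24 = frozenset({item_134, item_136, item_138})

for item in (item_134, item_136, item_138):
    print(item)
```

Printing the items reproduces the three "snapshot" lines at the top of state 24 in parser.out, without the rule numbers.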

There is one particularly special type of item: the item with a dot at the end:

(134) attribute_instance_optional_list -> attribute_instance_list .

This represents a production which could be finished, since the progress marker is at the end. If that is the correct production, the parser must then substitute the right-hand side for the left-hand side: this is the action referred to as "reducing" (since it is the opposite of "producing", which is what a "production" does).

However, the mere fact that you are in a state with a possible reduction does not mean that the reduction is possible. It is also necessary that the next token be consistent with the result of the reduction. If the next token could not follow the reduced non-terminal (in the context of the parser's state), then the reduction cannot be performed, so the parser will attempt a shift if one is possible.

Shifts are really simple. A shift is possible if one or more items in the state have the dot before the current lookahead symbol. Here, there is no question about additional lookahead, because PLY (like many LALR parser generators) only creates LALR(1) parsers, which use a single token of lookahead in any state. The only thing we have to go on is the symbol we are currently looking at, and it is reasonably obvious that we can only process it if some available item has that symbol in the next position.

If a given state with a given lookahead symbol can both shift and reduce, then you have a shift-reduce conflict; the parser doesn't know what to do. (If it has neither a shift nor a reduce available, that indicates that the input has a syntax error. That's how LR parsers identify syntax errors.)
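The conflict reported for state 24 can be reproduced with a small check along these lines. It is a sketch, not PLY's own logic: `find_conflicts` is a hypothetical helper, and the lookahead set for rule 134 is abbreviated to a handful of the tokens listed in parser.out.

```python
def find_conflicts(shift_tokens, reduce_lookaheads):
    """Return {token: [actions]} for every token with more than one action.

    shift_tokens: terminals that appear right after the dot in some item.
    reduce_lookaheads: {rule_number: tokens on which that rule may reduce}.
    """
    actions = {}
    for token in shift_tokens:
        actions.setdefault(token, []).append("shift")
    for rule, lookaheads in reduce_lookaheads.items():
        for token in lookaheads:
            actions.setdefault(token, []).append(f"reduce {rule}")
    return {tok: acts for tok, acts in actions.items() if len(acts) > 1}

# State 24: item 138 has the dot before LPAREN, so LPAREN can be shifted.
shift_tokens = {"LPAREN"}
# Rule 134 is complete; abbreviated lookahead set taken from parser.out.
reduce_lookaheads = {134: {"PLUS", "MINUS", "ID", "MODULE", "LPAREN"}}

conflicts = find_conflicts(shift_tokens, reduce_lookaheads)
print(conflicts)  # {'LPAREN': ['shift', 'reduce 134']}
```

LPAREN is the only token that appears both after a dot and in the lookahead set of a complete item, which corresponds to the "shift/reduce conflict for LPAREN" line in the listing.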

The one important aspect of LR parsing is that a reduction must be performed immediately if it is going to be performed at all. That is, if we are in a state with a possible reduction, and the item's lookahead set indicates that the lookahead token is feasible, we must perform the reduction. We can't wait and see if it would be possible later, because there is no later for a reduction. In other words, anything to the left of the • in an item has already been reduced as much as it could be. (This is the R in LR parsing, which stands for "rightmost derivation". If the use of "rightmost" doesn't make sense, don't worry about it; I only mention this fact in case you were wondering.)
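To make the effect of a reduction concrete, here is a minimal sketch of what a table-driven LR parser does with its state stack when it reduces. It is not PLY's actual code, and the state numbers 0 and 7 and the goto table entry are invented for illustration; only state 24 and rule 134 come from the question.

```python
def reduce_step(state_stack, rhs_len, lhs, goto_table):
    """Apply one reduction to an LR state stack and return the new top state."""
    if rhs_len:                       # guard: del stack[-0:] would empty the stack
        del state_stack[-rhs_len:]    # pop one state per right-hand-side symbol
    exposed = state_stack[-1]         # the state that was waiting for `lhs`
    state_stack.append(goto_table[(exposed, lhs)])
    return state_stack[-1]

# Assumption: state 0 moved to state 24 after seeing attribute_instance_list,
# and state 0 moves to a (hypothetical) state 7 on attribute_instance_optional_list.
goto_table = {(0, "attribute_instance_optional_list"): 7}
stack = [0, 24]

# Rule 134: attribute_instance_optional_list -> attribute_instance_list
new_top = reduce_step(stack, 1, "attribute_instance_optional_list", goto_table)
print(stack)  # [0, 7]
```

So after "reduce using rule 134" the parser does not stay in state 24: that state is popped, and the next state is determined by the state underneath it together with the non-terminal that was just produced.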

Another thing which I might as well mention is that in LALR parsing ("Lookahead LR parsing"), a state is precisely defined by the set of items. Each item has an applicable lookahead set, but the lookahead sets don't form part of the state's identity. If the parser generator ends up producing two states with the same items but different lookahead sets, it must merge them into a single state, forming the union of each lookahead set. For full LR parsing, this limitation doesn't exist; you can (and do) have more than one state for a given set of items, and the result is that the parsing table is much larger and slightly more powerful.
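The merging step described above can be sketched as follows. This only illustrates the idea of unioning lookahead sets for states with identical items; real LALR generators, PLY included, typically compute the lookaheads without building the full LR(1) states first. The two input states and their lookahead tokens are hypothetical.

```python
from collections import defaultdict

def merge_by_core(lr1_states):
    """Merge LR(1) states that share the same items, ignoring lookaheads.

    lr1_states: iterable of dicts {item: set_of_lookahead_tokens}.
    Returns {frozenset_of_items: {item: merged_lookahead_set}}.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for state in lr1_states:
        core = frozenset(state)  # the dict keys are the items
        for item, lookaheads in state.items():
            merged[core][item] |= lookaheads
    return merged

# Two hypothetical LR(1) states: same item, different lookahead sets.
item = "attribute_instance_optional_list -> attribute_instance_list ."
s1 = {item: {"MODULE", "MACROMODULE"}}
s2 = {item: {"INPUT", "OUTPUT"}}

merged = merge_by_core([s1, s2])
print(len(merged))  # 1
```

Because the merged state carries the union of the lookahead sets, it may reduce on tokens that only one of the original states would have accepted, which is where LALR can report conflicts that full LR(1) would not.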

Now, if a shift action is possible, you can mechanically figure out which state will be active after the shift. For example, from

(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN

after shifting an LPAREN, the next state will have just one item:

(138) attribute_instance -> LPAREN . ASTERISK attr_spec_list ASTERISK RPAREN

(Note how the dot has moved.)
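Moving the dot can be written down directly. In this sketch, items are plain `(lhs, rhs, dot)` tuples and `advance` is a hypothetical helper name; it computes only the items carried over from the previous state, before any new productions are added.

```python
def advance(items, symbol):
    """Items of the successor state obtained by moving the dot over `symbol`."""
    return {
        (lhs, rhs, dot + 1)
        for (lhs, rhs, dot) in items
        if dot < len(rhs) and rhs[dot] == symbol
    }

# State 24 from the question, as (lhs, rhs, dot) tuples.
state_24 = {
    ("attribute_instance_optional_list", ("attribute_instance_list",), 1),
    ("attribute_instance_list", ("attribute_instance_list", "attribute_instance"), 1),
    ("attribute_instance",
     ("LPAREN", "ASTERISK", "attr_spec_list", "ASTERISK", "RPAREN"), 0),
}

after_lparen = advance(state_24, "LPAREN")
after_instance = advance(state_24, "attribute_instance")
print(after_lparen)
```

Shifting LPAREN leaves only item 138 with the dot after LPAREN, while moving over attribute_instance leaves only item 136 with the dot at the end. These correspond to the two transitions listed for state 24 in parser.out (to states 21 and 49).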

That was a simple case, since the next symbol is a terminal, ASTERISK. Most of the time, the next symbol after a shift will be a non-terminal, and in that case we need to add all of the productions for that non-terminal, with the dot at the beginning. (That's how states end up with more than one item.) So, for example, given the new state with one item and an input of ASTERISK (anything else will be an error, since this state has no reduction possibilities), then we will shift into a state which has the shifted item:

(138) attribute_instance -> LPAREN ASTERISK . attr_spec_list ASTERISK RPAREN

plus all the productions for attr_spec_list:

(139)   attr_spec_list -> . attr_spec_list COMMA attr_spec
(140)   attr_spec_list -> . attr_spec

plus all the productions for attr_spec (since we just added an item with the dot before attr_spec):

(141)   attr_spec -> . attr_name
(142)   attr_spec -> . attr_name EQUALS constant_expression

plus the production for attr_name:

(143)   attr_name -> . identifier

and so on until we stop seeing new non-terminals:

(297)   identifier -> . simple_identifier
(298)   identifier -> . escaped_identifier
(350)   simple_identifier -> . ID
(279)   escaped_identifier -> . ESCAPED_ID
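The expansion just described (usually called the closure of the state) can be sketched in a few lines of Python. This is only an illustration: the `GRAMMAR` dict is a hand-copied subset of the productions listed above, the item encoding and the name `closure` are my own, and constant_expression is left out, so the sketch treats it as if it were a terminal.

```python
# Subset of the grammar: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "attr_spec_list": [("attr_spec_list", "COMMA", "attr_spec"), ("attr_spec",)],
    "attr_spec": [("attr_name",), ("attr_name", "EQUALS", "constant_expression")],
    "attr_name": [("identifier",)],
    "identifier": [("simple_identifier",), ("escaped_identifier",)],
    "simple_identifier": [("ID",)],
    "escaped_identifier": [("ESCAPED_ID",)],
}

def closure(kernel):
    """Add, for every non-terminal right after a dot, all of its productions
    with the dot at the beginning, until no new items appear."""
    items = list(kernel)
    seen = set(items)
    index = 0
    while index < len(items):
        lhs, rhs, dot = items[index]
        index += 1
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                new_item = (rhs[dot], production, 0)
                if new_item not in seen:
                    seen.add(new_item)
                    items.append(new_item)
    return items

# The shifted item (138) with the dot before attr_spec_list.
kernel = [("attribute_instance",
           ("LPAREN", "ASTERISK", "attr_spec_list", "ASTERISK", "RPAREN"), 2)]
state = closure(kernel)
for lhs, rhs, dot in state:
    print(lhs, "->", " ".join(rhs[:dot] + (".",) + rhs[dot:]))
```

Run on the one shifted item, this produces the same ten items as the state built by hand above.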

OK, now the next token will have to be ID or ESCAPED_ID. Suppose it is ID. Now what? Well, we will shift into a state

(350)   simple_identifier -> ID .

with a possible reduction; assuming the lookahead symbol is in the lookahead set (I haven't explained and don't intend to explain how lookahead sets are computed for each state; there's an algorithm but its details aren't relevant here), the ID will be reduced to simple_identifier. Then where does the parser go? Logically, it goes back to the state which generated the simple_identifier production, and shifts the simple_identifier. As it happens, that state is the one we just created:

(138)   attribute_instance -> LPAREN ASTERISK . attr_spec_list ASTERISK RPAREN
(139)   attr_spec_list -> . attr_spec_list COMMA attr_spec
(140)   attr_spec_list -> . attr_spec
(141)   attr_spec -> . attr_name
(142)   attr_spec -> . attr_name EQUALS constant_expression
(143)   attr_name -> . identifier
(297)   identifier -> . simple_identifier
(298)   identifier -> . escaped_identifier
(350)   simple_identifier -> . ID
(279)   escaped_identifier -> . ESCAPED_ID

and after we shift the simple_identifier, we end up with

(297)   identifier -> simple_identifier .

which is a state which requires a reduction to identifier. So once again we go back to the same state, shift the identifier, and find ourselves in

(143)   attr_name -> identifier . 

and then

(141)   attr_spec -> attr_name .
(142)   attr_spec -> attr_name . EQUALS constant_expression

But how did the parser know which state to go back to on each of those reductions? The answer is that the parser pushes the current state onto the parsing stack with every symbol. When it does a reduction, it pops the symbols from the right-hand side, discarding each associated state number, until it gets to the beginning of the right-hand-side, at which point the stack indicates which state that right-hand side came from. It then takes a look at that state, shifts the reduced non-terminal, and pushes the new shifted state onto the parse stack.
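Here is a minimal Python sketch of that stack discipline, using a toy grammar made of just the chain ID, simple_identifier, identifier, attr_name. The state numbers and the `ACTION` and `GOTO` tables were built by hand for this example and have nothing to do with the state numbers in your parser.out; only states are kept on the stack, since each state implies the symbol that led to it.

```python
# production number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {
    1: ("attr_name", 1),           # attr_name -> identifier
    2: ("identifier", 1),          # identifier -> simple_identifier
    3: ("simple_identifier", 1),   # simple_identifier -> ID
}
ACTION = {
    (0, "ID"): ("shift", 1),
    (1, "$end"): ("reduce", 3),
    (2, "$end"): ("reduce", 2),
    (3, "$end"): ("reduce", 1),
    (4, "$end"): ("accept", None),
}
GOTO = {
    (0, "simple_identifier"): 2,
    (0, "identifier"): 3,
    (0, "attr_name"): 4,
}

def parse(tokens):
    """Run the table-driven loop and return a trace of the actions taken."""
    tokens = list(tokens) + ["$end"]
    stack = [0]          # stack of state numbers
    position = 0
    trace = []
    while True:
        lookahead = tokens[position]
        kind, arg = ACTION.get((stack[-1], lookahead), ("error", None))
        if kind == "shift":
            stack.append(arg)
            position += 1
            trace.append(f"shift {lookahead}, go to state {arg}")
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]   # pop one state per right-hand-side symbol
            exposed = stack[-1]               # the state the right-hand side came from
            stack.append(GOTO[(exposed, lhs)])
            trace.append(f"reduce to {lhs}, back in state {exposed}, go to state {stack[-1]}")
        elif kind == "accept":
            trace.append("accept")
            return trace
        else:
            raise SyntaxError(f"unexpected {lookahead} in state {stack[-1]}")

for step in parse(["ID"]):
    print(step)
```

Each reduction pops back to state 0, which is the state whose items introduced the production, and then takes the transition on the reduced non-terminal.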

So I think that answers the questions "What do the lines at the beginning of the state description mean?" and "What state does the parser go to after a reduction?" The other two questions are easy to answer: "No, it doesn't compute all the possible predecessor states", and "Yes, it could (although it might end up adding predecessors which are actually not possible with any input), but it isn't useful for the parse." Since they're not particularly relevant to solving the shift-reduce conflict, I won't explain the answers further.

Going back to the actual shift-reduce conflict, the situation is that we are in the state

(134) attribute_instance_optional_list -> attribute_instance_list .
(136) attribute_instance_list -> attribute_instance_list . attribute_instance
(138) attribute_instance -> . LPAREN ASTERISK attr_spec_list ASTERISK RPAREN

which has a possible reduction, and we are considering the case where we see an LPAREN, for which there is a possible shift, and it turns out that the lookahead set for the first item also includes LPAREN. Although the lookahead set is not shown in the PLY output, we can dig around in the grammar to see where it might have come from. The immediate source is attribute_instance_optional_list, of course, and we can find that in the grammar, although there are quite a few possibilities:

(27)    module_nonansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_ports SEMICOLON
(28)    module_ansi_header -> attribute_instance_optional_list module_keyword lifetime module_identifier package_import_declaration_list parameter_port_list list_of_port_declarations_list SEMICOLON
(29)    module_implicit_header -> attribute_instance_optional_list module_keyword lifetime module_identifier LPAREN DOT ASTERISK RPAREN SEMICOLON
(36)    port_declaration -> attribute_instance_optional_list inout_declaration
(37)    port_declaration -> attribute_instance_optional_list input_declaration
(38)    port_declaration -> attribute_instance_optional_list output_declaration
(39)    port_declaration -> attribute_instance_optional_list ref_declaration
(40)    port_declaration -> attribute_instance_optional_list interface_port_declaration
(125)   struct_union_member -> attribute_instance_optional_list data_type_or_void list_of_variable_decl_assignments
(126)   struct_union_member -> attribute_instance_optional_list random_qualifier data_type_or_void list_of_variable_decl_assignments
(144)   inc_or_dec_expression -> inc_or_dec_operator attribute_instance_optional_list variable_lvalue
(145)   inc_or_dec_expression -> variable_lvalue attribute_instance_optional_list inc_or_dec_operator
(146)   conditional_expression -> cond_predicate INTERROGATION attribute_instance_optional_list expression COLON expression
(148)   constant_expression -> unary_operator attribute_instance_optional_list constant_primary
(149)   constant_expression -> constant_expression binary_operator attribute_instance_optional_list constant_expression
(150)   constant_expression -> constant_expression INTERROGATION attribute_instance_optional_list constant_expression COLON constant_expression
(167)   expression -> unary_operator attribute_instance_optional_list primary
(170)   expression -> expression binary_operator attribute_instance_optional_list expression
(181)   module_path_conditional_expression -> module_path_expression INTERROGATION attribute_instance_optional_list module_path_expression COLON module_path_expression
(183)   module_path_expression -> unary_module_path_operator attribute_instance_optional_list module_path_primary
(184)   module_path_expression -> module_path_expression binary_module_path_operator attribute_instance_optional_list module_path_expression

As far as I can see, attribute_instance_optional_list does not appear at the end of any of those productions, which simplifies working out where the LPAREN conflict comes from. In all those cases, it is followed by a non-terminal, the possibilities being:

module_keyword
inout_declaration
input_declaration
output_declaration
ref_declaration
interface_port_declaration
data_type_or_void
random_qualifier
variable_lvalue
inc_or_dec_operator
constant_primary
constant_expression
primary
expression
module_path_primary
module_path_expression  

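Now, if any of those non-terminals could start with an LPAREN, we have a possible shift-reduce conflict. And a couple of culprits spring out of the list: expression and its relatives (constant_expression, primary and so on), which presumably can begin with a parenthesized subexpression.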
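Checking which non-terminals can start with a given token amounts to computing FIRST sets, and that is easy to do mechanically. The following Python sketch shows the idea on a toy grammar; the productions for expression, primary, unary_operator and module_keyword are invented stand-ins (your real grammar is not shown here), and `first_sets` is my own helper, not a PLY function.

```python
# Hypothetical productions, only to illustrate the check.
GRAMMAR = {
    "expression": [("primary",), ("unary_operator", "primary")],
    "primary": [("LPAREN", "expression", "RPAREN"), ("ID",)],
    "unary_operator": [("PLUS",), ("MINUS",)],
    "module_keyword": [("MODULE",), ("MACROMODULE",)],
}

def first_sets(grammar):
    """Compute, by fixed-point iteration, the terminals each non-terminal can start with."""
    first = {nt: set() for nt in grammar}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, productions in grammar.items():
            for rhs in productions:
                all_nullable = True
                for symbol in rhs:
                    # A terminal starts with itself; a non-terminal contributes its FIRST set.
                    new = first[symbol] if symbol in grammar else {symbol}
                    if not new <= first[lhs]:
                        first[lhs] |= new
                        changed = True
                    if symbol not in nullable:
                        all_nullable = False
                        break
                if all_nullable and lhs not in nullable:
                    nullable.add(lhs)
                    changed = True
    return first

first = first_sets(GRAMMAR)
followers = ["module_keyword", "primary", "expression"]
culprits = sorted(nt for nt in followers if "LPAREN" in first[nt])
print(culprits)
```

With the real grammar you would fill `GRAMMAR` from the productions in parser.out and put the sixteen non-terminals listed above in `followers`.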

So, there is the problem, in summary: an attribute_instance can start with a parenthesis, but an attribute_instance_list can also be followed by a parenthesis. So when you're in the middle of an attribute_instance_list and you see a (, you have no way of knowing whether to shift or reduce.
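Put as a check, a shift-reduce conflict is just a token that is both shiftable in the state and in the lookahead set of a completed item. Here is a rough Python sketch of that test for this state. Since PLY does not print lookahead sets, the `LOOKAHEADS` value below is an abbreviated guess based on the reduce actions listed for the state, and the item encoding and function name are my own.

```python
ITEM_134 = ("attribute_instance_optional_list", ("attribute_instance_list",), 1)
ITEM_136 = ("attribute_instance_list", ("attribute_instance_list", "attribute_instance"), 1)
ITEM_138 = ("attribute_instance",
            ("LPAREN", "ASTERISK", "attr_spec_list", "ASTERISK", "RPAREN"), 0)
ITEMS = [ITEM_134, ITEM_136, ITEM_138]

TERMINALS = {"LPAREN", "RPAREN", "ASTERISK", "PLUS", "MINUS"}

# Assumed (and shortened) lookahead set for the completed item.
LOOKAHEADS = {ITEM_134: {"PLUS", "MINUS", "LPAREN"}}

def find_shift_reduce_conflicts(items, lookaheads, terminals):
    """Return {completed item: tokens that could also be shifted}."""
    shiftable = {rhs[dot] for _, rhs, dot in items
                 if dot < len(rhs) and rhs[dot] in terminals}
    conflicts = {}
    for item in items:
        _, rhs, dot = item
        if dot == len(rhs):  # dot at the end: a possible reduction
            clash = shiftable & lookaheads.get(item, set())
            if clash:
                conflicts[item] = clash
    return conflicts

conflicts = find_shift_reduce_conflicts(ITEMS, LOOKAHEADS, TERMINALS)
print(conflicts)
```

The only token reported is LPAREN, against the reduction by rule 134, which matches the conflict line in the state description.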
