I know I can declare %parse-param {struct my_st *arg} in a .y file, so that yyparse() is changed to be yyparse(struct my_st *arg). But how do I reference the argument in the flex rules? For example:
[0-9]+ { do_work(arg); return NUMBER; }
I want to make a reentrant parser, so I need to do this. Please help me, thanks!
You need to pass the argument through to yylex. That requires a modification of both the bison parser description, so that the parser calls yylex with the desired arguments, and the flex scanner description, so that the scanner generates yylex with the desired parameters.
Bison and flex do not communicate with each other and do not see each other's source files. However, it is normal for the scanner to #include the header file generated by bison, and bison allows code to be inserted directly into this header file. That makes it possible to put the entire configuration into the bison file.
In bison, you specify additional parameters for yylex using the %lex-param directive. Be aware that if you %define api.pure, bison also adds arguments of its own to the call: a pointer to the semantic value and, if locations are in use, a pointer to the location.
If you use bison 3.0 or more recent, you can use
%param { struct my_st *arg }
as an abbreviation for
%lex-param { struct my_st *arg }
%parse-param { struct my_st *arg }
It makes sense to use a single directive (if your bison is sufficiently recent) because there is no way to insert a local variable declaration into the yyparse function. So the only variables which could be passed through to yylex are global variables and parameters to yyparse. [Note 1]
Remember that it is your responsibility to declare yylex and yyerror in the bison file. Even if you use %lex-param, bison will not automatically generate a declaration for yylex. [Note 2]
Flex does normally generate a declaration for yylex, so you cannot simply put your declaration into the bison-generated header file and then #include it into the scanner. However, if the YY_DECL macro is defined, then the flex-generated scanner will not forward-declare yylex, and it will use the YY_DECL macro in the definition of yylex. You can use this feature to put the declaration of yylex into the bison description in a way that it will be passed through to the flex scanner.
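The macro trick can be seen in isolation with a small C sketch. Nothing here is generated by bison or flex: YYSTYPE, Context, NUMBER and the hand-written scanner body are hypothetical stand-ins, chosen only to show one YY_DECL macro supplying both the prototype and the header of the definition.

```c
#include <ctype.h>

/* Hypothetical stand-ins for what the bison-generated header would provide. */
typedef union YYSTYPE { long integer; } YYSTYPE;
typedef struct Context { const char *cursor; int numbers_seen; } Context;
enum { NUMBER = 258 };

/* A single macro fixes the signature of yylex. */
#define YY_DECL int yylex(YYSTYPE *yylvalp, Context *context)

/* Used once as a prototype (the role of the bison header)... */
YY_DECL;

/* ...and once as the header of the definition (the role of the scanner).
   This hand-written body only recognises runs of digits separated by
   spaces; anything else is treated as end of input. */
YY_DECL
{
    const char *p = context->cursor;
    long value = 0;

    while (*p == ' ')
        p++;
    if (!isdigit((unsigned char)*p)) {
        context->cursor = p;
        return 0;
    }
    while (isdigit((unsigned char)*p))
        value = value * 10 + (*p++ - '0');

    context->cursor = p;
    context->numbers_seen++;      /* the extra parameter is an ordinary argument */
    yylvalp->integer = value;     /* semantic value goes through the pointer */
    return NUMBER;
}
```

Because the prototype and the definition are expanded from the same macro, they cannot drift apart, which is the property the bison/flex setup relies on.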
In bison, you can add declarations to the generated header using either %code requires sections or %code provides sections. The difference is that requires segments are earlier in the header file, before YYSTYPE and YYLTYPE have been declared. If you use a pure parser, the yylex prototype will usually refer to YYSTYPE (and YYLTYPE, if you use locations), so it needs to go in a %code provides section. In order to interface gracefully with flex, you can use the YY_DECL macro to generate the yylex declaration.
So you might end up with something like the following: [Note 3]
file: mylanguage.y
%code requires {
#include <stdio.h>
typedef struct Context { ... } Context;
/* structs used in the %union declaration would go here */
}
%define api.pure full
%locations
%parse-param { Context* context }
%lex-param { Context* context }
%code provides {
#define YY_DECL \
int yylex(YYSTYPE* yylvalp, YYLTYPE* yyllocp, Context* context)
YY_DECL;
int yyerror(YYLTYPE* yyllocp, Context* context, const char* message);
}
Then you would just insert the generated header into your flex file in the normal way:
file: mylanguage.l
%{
/* System library includes */
#include "mylanguage.tab.h"
%}
The %define api.pure full declaration in bison avoids the need for the global variables yylval and yylloc. However, there are a number of other internal globals used by a flex-generated scanner; in order to make the scanner truly re-entrant, you need to add %option reentrant to your flex file. With that option, yylex is expected to include the parameter yyscan_t yyscanner (as are all the other lexer-related functions defined by flex). You need to manage the yyscanner value, so it will need to be passed through yyparse to yylex as above. You also need to initialize and destroy it, as described in the flex manual. (Its name in yylex must be precisely yyscanner, if you are generating the yylex prototype using YY_DECL.)
If you also want to pass your own context object, you could add two parameters to yyparse and yylex, or you could include your context object inside the yyscan_t object as described in the Flex manual section on Extra Data.
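As a rough sketch of the second option, the bison file might look like the fragment below. It is not a complete grammar, it drops %locations to keep the prototypes short, and it assumes the flex file contains %option reentrant and %option extra-type="Context *". The parse_with_context function is a hypothetical helper name; the guarded typedef mirrors the one flex emits, so that the bison header does not need the flex-generated header.

```yacc
%code requires {
  /* Forward declaration only; the real struct is defined elsewhere. */
  typedef struct Context Context;

  /* Same guard that flex uses, so the typedef is not repeated. */
  #ifndef YY_TYPEDEF_YY_SCANNER_T
  #define YY_TYPEDEF_YY_SCANNER_T
  typedef void *yyscan_t;
  #endif
}

%define api.pure full
%param { yyscan_t yyscanner }

%code provides {
  #define YY_DECL int yylex(YYSTYPE *yylvalp, yyscan_t yyscanner)
  YY_DECL;
  int yyerror(yyscan_t yyscanner, const char *message);
}

%code {
  /* Defined by the flex-generated reentrant scanner. */
  int yylex_init_extra(Context *context, yyscan_t *scanner);
  int yylex_destroy(yyscan_t scanner);
}

%%
/* grammar rules go here */
%%

/* Hypothetical driver: one scanner object per parse. */
int parse_with_context(Context *context)
{
  yyscan_t scanner;
  int status;

  if (yylex_init_extra(context, &scanner) != 0)
    return 1;                      /* could not allocate the scanner */
  status = yyparse(scanner);
  yylex_destroy(scanner);
  return status;
}
```

Inside a flex action, the context would then be reachable as yyextra, or as yyget_extra(yyscanner) from code outside the scanner's actions.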
Finally, if you use the bison pure parser API, then you need to change the way you write flex actions. Instead of assigning to the global yylval (e.g. yylval.integer = atol(yytext);), you need to assign through the pointer passed as an argument: yylvalp->integer = atol(yytext);. (The name of the argument is up to you; the one I use here is the one I specified above.)
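Put together with the prototype from mylanguage.y above, the rule from the question could be written roughly as follows. This is only a sketch: it assumes the %union has a member named integer, that NUMBER is a declared token, and that do_work (the function from the question) accepts a Context*.

```lex
%%
[0-9]+    {
            /* Semantic value goes through the pointer parameter. */
            yylvalp->integer = atol(yytext);
            /* The extra parameter is in scope like any other argument. */
            do_work(context);
            return NUMBER;
          }
```

Both yylvalp and context are visible in the action simply because the actions are compiled into the body of yylex, whose parameter list comes from YY_DECL.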
Notes
1. Older implementations allowed you to specify the arguments to yylex by defining the macro YYLEX_PARAM. This is no longer supported as of bison 3.0, so you shouldn't use it.
2. If you use %parse-param, the additional parameter will also be added to yyerror. And if you %define api.pure full, yyerror will also receive a location object. Your declaration of yyerror needs to be consistent.
3. The %locations directive forces bison to generate code which stores location information for each token. I use it here because it makes the prototypes predictable. Without it, the prototypes would include YYLTYPE arguments only if you actually refer to a location somewhere in a semantic action. If you don't intend to use token locations, you might prefer to remove the %locations directive and all the YYLTYPE arguments. But usually the location information is useful.