I'm building a parser for an asset exchange format. I've included the %token-table directive in the bison file, but from the flex code I can't access the table or the constants associated with it. That is, when I try to compile this code:
Frame|FrameTransformMatrix|Mesh|MeshNormals|MeshMaterialList|Material {
    printf("A keyword: %s\n", yytext);
    yylval.charptr_type = yytext;
    int i;
    for (i = 0; i < YYNTOKENS; i++)
    {
        if (yytname[i] != 0
            && yytname[i][0] == '"'
            && !strncmp(yytname[i] + 1, yytext, strlen(yytext))
            && yytname[i][strlen(yytext) + 1] == '"'
            && yytname[i][strlen(yytext) + 2] == 0)
            return i;
    }
}
gcc says both YYNTOKENS and yytname are undeclared. So was the token table finally deprecated and wiped or what's the deal?
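For reference, the lookup loop above can be factored into a small helper, which makes it easier to check in isolation. This is only a sketch: demo_names and DEMO_NTOKENS are made-up stand-ins for yytname and YYNTOKENS, and find_quoted_token is a hypothetical name, not something Bison provides. The match only succeeds for tokens whose table entry includes the surrounding double quotes, i.e. tokens written as literal strings such as "Frame" in the grammar.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for Bison's yytname (assumption: the real table lives in the
   generated parser file). Quoted entries model literal string tokens. */
static const char *const demo_names[] = {
    "$end", "error", "$undefined",
    "\"Frame\"", "\"Mesh\"", "\"MeshNormals\"", "NUMBER"
};
enum { DEMO_NTOKENS = 7 };

/* Return the index of the entry spelled "<text>" (quotes included),
   or -1 if no such entry exists. */
int find_quoted_token(const char *const *names, int ntokens,
                      const char *text)
{
    size_t len = strlen(text);
    int i;

    for (i = 0; i < ntokens; i++) {
        if (names[i] != NULL
            && names[i][0] == '"'
            && strncmp(names[i] + 1, text, len) == 0
            && names[i][len + 1] == '"'
            && names[i][len + 2] == '\0')
            return i;
    }
    return -1;  /* not a literal string token */
}
```

Unlike the flex action above, the helper returns -1 when nothing matches instead of falling off the end of the loop, so the caller can tell a miss from a hit.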
The Bison 2.6.2 manual says (on p82 in the PDF):
%token-table [Directive]
Generate an array of token names in the parser implementation file. The name of the array is yytname; yytname[i] is the name of the token whose internal Bison token code number is i. The first three elements of yytname correspond to the predefined tokens "$end", "error", and "$undefined"; after these come the symbols defined in the grammar file.
The name in the table includes all the characters needed to represent the token in Bison. For single-character literals and literal strings, this includes the surrounding quoting characters and any escape sequences. For example, the Bison single-character literal '+' corresponds to a three-character name, represented in C as "'+'"; and the Bison two-character literal string "\\/" corresponds to a five-character name, represented in C as "\"\\\\/\"".
When you specify %token-table, Bison also generates macro definitions for macros YYNTOKENS, YYNNTS, and YYNRULES, and YYNSTATES:
YYNTOKENS
    The highest token number, plus one.
YYNNTS
    The number of nonterminal symbols.
YYNRULES
    The number of grammar rules,
YYNSTATES
    The number of parser states (see Section 5.5 [Parser States], page 104).
It looks like it is supposed to be there.
When I tried a trivial grammar, the table was present:
#if YYDEBUG || YYERROR_VERBOSE || YYTOKEN_TABLE
/* YYTNAME[SYMBOL-NUM] -- String name of the symbol SYMBOL-NUM.
First, the terminals, then, starting at YYNTOKENS, nonterminals. */
static const char *const yytname[] =
{
"$end", "error", "$undefined", "ABSINTHE", "NESTLING", "$accept",
"anything", 0
};
#endif
Note: the table is static, so it is only visible inside the generated parser file. If you are trying to access it from another file, such as the flex-generated scanner, that will not work.
There is an earlier stanza in the source:
/* Enabling the token table. */
#ifndef YYTOKEN_TABLE
# define YYTOKEN_TABLE 1
#endif
This ensures that the token table is defined.
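Since the table is static, one way to reach it from other files is a small accessor function with external linkage, placed in the same file as the table (for example, in the last section of the .y file). The sketch below is self-contained, so it defines its own stand-ins for YYNTOKENS and yytname based on the trivial grammar above; in a real parser those come from the generated file and must not be redefined. The name token_name is hypothetical.

```c
#include <stddef.h>

/* Stand-ins for what Bison generates for the trivial grammar above.
   In a real parser, drop these two definitions. */
#define YYNTOKENS 5
static const char *const yytname[] = {
    "$end", "error", "$undefined", "ABSINTHE", "NESTLING", "$accept",
    "anything", 0
};

/* Hypothetical accessor: non-static, so other files can call it after
   declaring `const char *token_name(int symbol);`.
   Returns NULL when symbol is not a valid token index. */
const char *token_name(int symbol)
{
    if (symbol < 0 || symbol >= YYNTOKENS)
        return NULL;
    return yytname[symbol];
}
```

The range check stops at YYNTOKENS, so nonterminals such as "$accept" and the terminating 0 entry are never handed out.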
There's a quick and easy way around the 'static' issue. I was trying to print a human-readable abstract syntax tree with string representations of each non-terminal for my C to 6502 compiler. Here's what I did...
In the last section of your .y file, create a non-static variable called token_table:
%%
#include <stdio.h>
extern char yytext[];
extern int column;
const char ** token_table;
...
Now, in the main function that calls yyparse (it has to live in the .y file too, so that yytname is visible), assign yytname to token_table:
int main(int argc, char ** argv) {
FILE * myfile;
yydebug = 1;
token_table = yytname;
...
Now, you can access token_table in any compilation unit simply by declaring it as an extern, as in:
extern const char ** token_table;
/* Using it later in that same compilation unit */
printf("%s", token_table[DOWHILE - 258 + 3]); /* prints "DOWHILE" */
For each node in your AST, if you assign it the yytokentype value found in y.tab.h, you simply subtract 258 and add 3 to index into token_table (yytname). You subtract 258 because that is where the yytokentype enumeration starts, and you add 3 because yytname begins with the three reserved symbols ("$end", "error", and "$undefined"). Note that this arithmetic assumes the grammar uses only named tokens; character literals such as '+' also take up slots in yytname and can shift the indices.
For instance, my generated bison file has:
static const char *const yytname[] =
{
"$end", "error", "$undefined", "DOWHILE", "UAND", "UMULT", "UPLUS",
"UMINUS", "UBANG", "UTILDE", "ARR", "NOOP", "MEMBER", "POSTINC",
...
And, the defines header (run bison with the --defines=y.tab.h option):
/* Tokens. */
#ifndef YYTOKENTYPE
# define YYTOKENTYPE
/* Put the tokens into the symbol table, so that GDB and other debuggers
know about them. */
enum yytokentype {
DOWHILE = 258,
UAND = 259,
UMULT = 260,
...
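The index arithmetic described above can be wrapped in a helper so the magic numbers appear in one place only. This is a sketch under the same assumptions as the excerpts: the enum and the table are cut-down copies used as stand-ins for y.tab.h and yytname, and token_code_to_index, FIRST_TOKEN_CODE and RESERVED_SYMBOLS are names made up for illustration. The pointer is declared as const char *const * here, which matches the element type of yytname; with plain const char ** gcc may warn about a discarded qualifier on the assignment.

```c
#include <stddef.h>

/* Cut-down stand-ins for the generated header and token table. */
enum yytokentype { DOWHILE = 258, UAND = 259, UMULT = 260 };
static const char *const demo_yytname[] = {
    "$end", "error", "$undefined", "DOWHILE", "UAND", "UMULT"
};

/* Global pointer to the table; const-correct variant of token_table. */
const char *const *token_table = demo_yytname;

enum {
    FIRST_TOKEN_CODE = 258,  /* where yytokentype starts enumerating */
    RESERVED_SYMBOLS = 3     /* "$end", "error", "$undefined" */
};

/* Map a yytokentype value to its index in the name table.
   Assumes only named tokens, numbered consecutively from 258. */
int token_code_to_index(int code)
{
    return code - FIRST_TOKEN_CODE + RESERVED_SYMBOLS;
}
```

With this in place, the earlier printf call becomes printf("%s", token_table[token_code_to_index(DOWHILE)]).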
The easiest way to avoid the static symbol problem is to #include the lexer directly in the third section of the bison input file, so that the scanner and the parser end up in the same translation unit:
/* token declarations and such */
%%
/* grammar rules */
%%
#include "lex.yy.c"
int main() {
/* the main routine that calls yyparse */
}
Then you just compile the .tab.c file, and that's all you need.