Can't access the token table yytname in Bison

Posted 2019-07-31 16:32

I'm building a parser for an asset exchange format. I include the %token-table directive in the bison file, but from the flex code I can't access the table or the constants associated with it. For example, this rule fails to compile:

Frame|FrameTransformMatrix|Mesh|MeshNormals|MeshMaterialList|Material {
    printf("A keyword: %s\n", yytext);
    yylval.charptr_type = yytext;

    int i;
    for (i = 0; i < YYNTOKENS; i++)
    {
        if (yytname[i] != 0
            && yytname[i][0] == '"'
            && !strncmp(yytname[i] + 1, yytext, strlen(yytext))
            && yytname[i][strlen(yytext) + 1] == '"'
            && yytname[i][strlen(yytext) + 2] == 0)
            return i;
    }
}

gcc says both YYNTOKENS and yytname are undeclared. So was the token table finally deprecated and wiped or what's the deal?

3 Answers
唯我独甜
#2 · 2019-07-31 17:11

The Bison 2.6.2 manual says (on p82 in the PDF):

%token-table [Directive]

Generate an array of token names in the parser implementation file. The name of the array is yytname; yytname[i] is the name of the token whose internal Bison token code number is i. The first three elements of yytname correspond to the predefined tokens "$end", "error", and "$undefined"; after these come the symbols defined in the grammar file.

The name in the table includes all the characters needed to represent the token in Bison. For single-character literals and literal strings, this includes the surrounding quoting characters and any escape sequences. For example, the Bison single-character literal '+' corresponds to a three-character name, represented in C as "'+'"; and the Bison two-character literal string "\\/" corresponds to a five-character name, represented in C as "\"\\\\/\"".

When you specify %token-table, Bison also generates macro definitions for macros YYNTOKENS, YYNNTS, and YYNRULES, and YYNSTATES:

YYNTOKENS The highest token number, plus one.

YYNNTS The number of nonterminal symbols.

YYNRULES The number of grammar rules.

YYNSTATES The number of parser states (see Section 5.5 [Parser States], page 104).

It looks like it is supposed to be there.
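
To make the quoting rule in the manual excerpt concrete, here is a minimal C sketch of the lookup from the question, run against a hand-written stand-in table. The names demo_names, DEMO_NTOKENS and demo_find_literal are hypothetical, and the table contents are invented for illustration; in a real parser the table would be yytname and the bound YYNTOKENS.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for yytname: string-literal tokens keep their
   surrounding double quotes, plain named tokens do not. */
static const char *const demo_names[] = {
    "$end", "error", "$undefined",
    "\"Frame\"", "\"Mesh\"", "\"MeshNormals\"", "MATERIAL", 0
};
enum { DEMO_NTOKENS = 7 };

/* Return the table index whose name is exactly "text" wrapped in double
   quotes, or -1 if there is no such entry. */
int demo_find_literal(const char *text)
{
    size_t len = strlen(text);
    int i;

    for (i = 0; i < DEMO_NTOKENS; i++) {
        const char *name = demo_names[i];
        if (name != 0
            && name[0] == '"'
            && strncmp(name + 1, text, len) == 0
            && name[len + 1] == '"'
            && name[len + 2] == 0)
            return i;
    }
    return -1;
}
```

The closing-quote check is what keeps "Mesh" from matching the longer entry for "MeshNormals", and an unquoted entry such as MATERIAL is never matched by this loop.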

When I tried a trivial grammar, the table was present:

#if YYDEBUG || YYERROR_VERBOSE || YYTOKEN_TABLE
/* YYTNAME[SYMBOL-NUM] -- String name of the symbol SYMBOL-NUM.
   First, the terminals, then, starting at YYNTOKENS, nonterminals.  */
static const char *const yytname[] =
{
  "$end", "error", "$undefined", "ABSINTHE", "NESTLING", "$accept",
  "anything", 0
};
#endif

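Note: the table is declared static, so it is visible only inside the generated parser file. Code in a separate file, such as the flex scanner, cannot reference it. YYNTOKENS is likewise a #define in the parser implementation file rather than in the header produced by --defines.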

There is an earlier stanza in the source:

/* Enabling the token table.  */
#ifndef YYTOKEN_TABLE
# define YYTOKEN_TABLE 1
#endif

This ensures that the token table is defined.
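
One possible way around the static linkage, offered here as an assumption rather than something this answer proposes, is a small non-static accessor function defined in the same file as the table (in a Bison grammar, that would be the epilogue after the second %%). The sketch below uses a stand-in table copied from the trivial grammar above; demo_yytname, DEMO_NTOKENS and demo_token_name are hypothetical names.

```c
#include <stddef.h>

/* Stand-in for the generated table; in a real parser this is yytname. */
static const char *const demo_yytname[] = {
    "$end", "error", "$undefined", "ABSINTHE", "NESTLING", "$accept",
    "anything", 0
};

/* Stand-in for YYNTOKENS: the five terminals come first in the table. */
#define DEMO_NTOKENS 5

/* Non-static accessor: other translation units can call this function
   even though the table itself has internal linkage. */
const char *demo_token_name(int symbol)
{
    if (symbol < 0 || symbol >= DEMO_NTOKENS)
        return NULL;
    return demo_yytname[symbol];
}
```

Another file would then only need the prototype, const char *demo_token_name(int symbol);, to look up names without touching the table directly.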

做个烂人
#3 · 2019-07-31 17:13

The easiest way to avoid the static symbol problem is to #include the lexer directly in the third section of the bison input file:

/* token declarations and such */
%%
/* grammar rules */
%%

#include "lex.yy.c"

int main() {
  /* the main routine that calls yyparse */
}

Then you just compile the .tab.c file, and that's all you need.

Rolldiameter
#4 · 2019-07-31 17:15

There's a quick and easy way around the 'static' issue. I was trying to print a human readable abstract syntax tree with string representations of each non-terminal for my C to 6502 compiler. Here's what I did...

In your .y file in the last section, create a non-static variable called token_table

%%
#include <stdio.h>

extern char yytext[];
extern int column;
const char *const *token_table; /* matches the element type of yytname */
...

Now, in the main function that calls yyparse, assign yytname to token_table:

int main(int argc, char ** argv) {
    FILE * myfile;
    yydebug = 1;
    token_table = yytname; /* yytname is visible here, inside the parser file */
    ...

Now you can access token_table in any compilation unit simply by declaring it extern, as in:

extern const char *const *token_table;

/* Using it later in that same compilation unit */
printf("%s", token_table[DOWHILE - 258 + 3]); /* prints "DOWHILE" */

For each node in your AST, if you assign it the yytokentype value found in y.tab.h, you simply subtract 258 and add 3 to index into token_table (yytname). You subtract 258 because that is where the yytokentype enumeration starts, and you add 3 because yytname puts the three reserved symbols ("$end", "error", and "$undefined") at the start of the table. This relies on the named tokens appearing in yytname in the same consecutive order as in the enum.

For instance, my generated bison file has:

static const char *const yytname[] =
{
    "$end", "error", "$undefined", "DOWHILE", "UAND", "UMULT", "UPLUS",
    "UMINUS", "UBANG", "UTILDE", "ARR", "NOOP", "MEMBER", "POSTINC",
    ...

And the defines header (generated by running bison with the --defines=y.tab.h option) has:

/* Tokens.  */
#ifndef YYTOKENTYPE
# define YYTOKENTYPE
   /* Put the tokens into the symbol table, so that GDB and other debuggers
      know about them.  */
   enum yytokentype {
     DOWHILE = 258,
     UAND = 259,
     UMULT = 260,
     ...
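
The offset arithmetic described in this answer can be wrapped in a small helper. The following is a minimal sketch using only the three tokens visible in the excerpts above; demo_table, the DEMO_ constants and demo_token_to_name are hypothetical names, and the helper assumes the named tokens sit in the table in the same consecutive order as in the enum.

```c
#include <stddef.h>

/* Token values copied from the header excerpt above. */
enum demo_tokentype { DOWHILE = 258, UAND = 259, UMULT = 260 };

/* Stand-in for the start of yytname, truncated to the visible tokens. */
static const char *const demo_table[] = {
    "$end", "error", "$undefined", "DOWHILE", "UAND", "UMULT"
};

enum {
    DEMO_FIRST_TOKEN = 258, /* where the token enumeration starts */
    DEMO_RESERVED = 3,      /* "$end", "error", "$undefined" */
    DEMO_TABLE_LEN = sizeof demo_table / sizeof demo_table[0]
};

/* Map a token enum value to its name, or NULL if it falls outside the
   named-token part of the stand-in table. */
const char *demo_token_to_name(int token)
{
    int index = token - DEMO_FIRST_TOKEN + DEMO_RESERVED;

    if (index < DEMO_RESERVED || index >= DEMO_TABLE_LEN)
        return NULL;
    return demo_table[index];
}
```

Keeping the two offsets in named constants and bounds-checking the index avoids reading past the table when an unexpected value is passed in.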