In C++, the symbols '<' and '>' are used for comparisons as well as for signifying a template argument. Thus, the code snippet
[...] Foo < Bar > [...]
might be interpreted in either of the following two ways:
- An object of type Foo with template argument Bar
- Compare Foo to Bar, then compare the result to whatever comes next
How does the parser for a C++ compiler efficiently decide between those two possibilities?
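To make the two readings concrete, here is a small sketch in which the same token sequence Foo < Bar > is a template-id in one namespace and a chain of comparisons in another. The namespace names, Baz, and the values are made up for illustration and are not part of the original snippet.

```cpp
// Reading 1: Foo names a template, so "Foo < Bar >" is a template-id.
namespace as_template {
template <int N>
struct Foo {
    static constexpr int value = N;
};
constexpr int Bar = 3;
constexpr int result = Foo < Bar > ::value;  // same as Foo<3>::value
}  // namespace as_template

// Reading 2: Foo names a variable, so "<" and ">" are comparison operators.
namespace as_comparison {
constexpr int Foo = 1;
constexpr int Bar = 3;
constexpr int Baz = 0;  // hypothetical "whatever comes next"
// Parsed as (Foo < Bar) > Baz: (1 < 3) is true, and true > 0 holds.
// Compilers may warn about the chained comparison; that is expected here.
constexpr bool result = Foo < Bar > Baz;
}  // namespace as_comparison

static_assert(as_template::result == 3, "template-id reading");
static_assert(as_comparison::result, "comparison reading");
```

Only the declaration of Foo differs between the two namespaces; the tokens that follow it are the same up to the closing >.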
If Foo is known to be a template name (e.g. a template <...> Foo ... declaration is in scope, or the compiler sees a template Foo sequence), then Foo < Bar cannot be a comparison. It must be the beginning of a template instantiation (or whatever Foo < Bar > is called this week).
If Foo is not a template name, then Foo < Bar is a comparison.
In most cases it is known what Foo is, because identifiers generally have to be declared before use, so there is no problem deciding one way or the other. There is one exception, though: parsing template code. If Foo<Bar> is inside a template, and the meaning of Foo depends on a template parameter, then it is not known whether Foo is a template or not. The language standard directs the compiler to treat it as a non-template unless it is preceded by the keyword template.
The parser might implement this by feeding context back to the lexer. The lexer recognizes Foo as different types of tokens, depending on the context provided by the parser.
C and C++ parsers are "context sensitive". In other words, a given token or lexeme is not guaranteed to be distinct and have only one meaning; its meaning depends on the context within which the token is used.
So, the parser part of the compiler will know (by understanding "where in the source it is") that it is parsing some kind of type or some kind of comparison. (This is NOT simple to know, which is why reading the source of a competent C or C++ compiler is not entirely straightforward: there are lots of conditions and function calls checking "is this one of these, if so do this, else do something else".)
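As a rough illustration of that kind of check, the sketch below classifies the < that follows an identifier by consulting a toy symbol table. All names here (NameKind, AngleMeaning, SymbolTable, lookup, classify_angle) are hypothetical; real compilers use scoped lookup and far more context than this.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical kinds of declared names a toy front end might track.
enum class NameKind { Template, Variable, Unknown };

// What the token '<' means right after an identifier.
enum class AngleMeaning { TemplateArgumentListOpen, LessThan };

// Toy symbol table: a real compiler's table is scoped and much richer.
using SymbolTable = std::unordered_map<std::string, NameKind>;

NameKind lookup(const SymbolTable& table, const std::string& name) {
    auto it = table.find(name);
    return it == table.end() ? NameKind::Unknown : it->second;
}

// Decide how to treat '<' following `name`. Names that are not known
// templates (including unknown names) make '<' a comparison operator.
AngleMeaning classify_angle(const SymbolTable& table, const std::string& name) {
    return lookup(table, name) == NameKind::Template
               ? AngleMeaning::TemplateArgumentListOpen
               : AngleMeaning::LessThan;
}
```

With a table containing Foo as a template and x as a variable, classify_angle reports an argument-list opener for Foo and a less-than for x. The point of the sketch is only that the decision is a lookup, not a guess.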
The keyword template helps the compiler understand what is going on, but in most cases the compiler simply knows, because < doesn't make sense in the other reading. If it doesn't make sense in EITHER form, then it's an error, and it's just a matter of trying to figure out what the programmer might have wanted. This is one of the reasons that sometimes a simple mistake, such as a missing } or template, can lead the entire parse astray and result in hundreds or thousands of errors (although sane compilers stop after a reasonable number, so as not to fill the entire universe with error messages).
Most of the answers here confuse determining the meaning of the symbol (what I call "name resolution") with parsing (defined narrowly as "can read the syntax of the program").
You can do these tasks separately.
What this means is that you can build a completely context-free parser for C++ (as my company, Semantic Designs, does), and leave the issue of deciding what the symbol means to an explicitly separate following task.
Now, that task is driven by the possible syntax interpretations of the source code. In our parsers, these are captured as ambiguities in the parse.
What name resolution does is collect information about the declarations of names, and use that information to determine which of the ambiguous parses don't make sense, and simply drop those. What remains is a single valid parse, with a single valid interpretation.
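The idea of pruning ambiguous parses can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular tool implements it: Reading, TemplateNames, and resolve are hypothetical names, and the "parse" is reduced to a list of candidate readings for a single name.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// The two candidate readings of "Foo < Bar > ..." kept by an ambiguous parse.
enum class Reading { TemplateId, Comparison };

// Hypothetical result of collecting declarations: the set of names known
// to be templates. Every other name is treated as a non-template.
using TemplateNames = std::set<std::string>;

// Drop the reading that contradicts what is known about `name`.
std::vector<Reading> resolve(std::vector<Reading> candidates,
                             const TemplateNames& templates,
                             const std::string& name) {
    const bool is_template = templates.count(name) != 0;
    const Reading impossible =
        is_template ? Reading::Comparison : Reading::TemplateId;
    candidates.erase(
        std::remove(candidates.begin(), candidates.end(), impossible),
        candidates.end());
    return candidates;
}
```

Starting from both readings, resolve leaves only the template-id reading when the name is in the set, and only the comparison reading otherwise.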
The machinery to accomplish name resolution in practice is a big mess. But that's the C++ committee's fault, not the parser's or the name resolver's. With our tool, the ambiguity removal is done automatically, which makes that part pretty nice. You would not appreciate that unless you looked inside our tools, but we do, because it means a small engineering team was able to build it.
See an example of resolution of template-vs-less-than on C++'s most vexing parse, done by our parser.
The important point to remember is that C++ grammar is not context-free. I.e., when the parser sees Foo < Bar, it (in most cases) knows that Foo refers to a template definition (by looking it up in the symbol table), and thus < cannot be a comparison.
There are difficult cases, when you literally have to guide the parser. For example, suppose that you are writing a class template with a template member function, which you want to specialize explicitly. You might have to use the template keyword as a disambiguator before the member name (in some cases; see Calling template function within template class for details).
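A minimal sketch of guiding the parser with the template keyword follows. Widget, get, and call_get are hypothetical names chosen for illustration; the relevant part is the call inside call_get, where the object's type depends on a template parameter.

```cpp
struct Widget {
    // Member function template: returns its non-type template argument.
    template <int N>
    int get() const { return N; }
};

template <typename T>
int call_get(const T& obj) {
    // obj has a dependent type, so the compiler cannot know that get is a
    // template. Without the keyword, a conforming compiler reads
    // "obj.get < 7 > ()" as comparisons and rejects it.
    return obj.template get<7>();
}
```

Calling call_get(Widget{}) returns 7. Outside a template, where the type of obj is known, the plain obj.get<7>() is enough.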
Also, comparisons using > inside non-type template arguments must be surrounded by parentheses; otherwise the first > is taken as the end of the template argument list.
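The parenthesization rule can be seen in a short sketch. Flag, lhs, rhs, is_greater, and is_less are hypothetical names used only for this example.

```cpp
template <bool B>
struct Flag {
    static constexpr bool value = B;
};

constexpr int lhs = 5;
constexpr int rhs = 3;

// Parentheses keep the inner '>' from closing the argument list.
constexpr bool is_greater = Flag<(lhs > rhs)>::value;

// '<' needs no parentheses here, because lhs is not a template name.
constexpr bool is_less = Flag<lhs < rhs>::value;

// Without parentheses, in Flag<lhs > rhs>::value the first '>' would end
// the argument list right after lhs, and the rest would not parse.
```

Here is_greater is true and is_less is false, since 5 > 3 and 5 < 3 are evaluated as the template arguments.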
Non-static data member initializers bring more fun: http://open-std.org/JTC1/SC22/WG21/docs/cwg_active.html#325