Scanner and parser interaction

I am new to flex/bison. From the books I have read, it seems that in nearly all compiler implementations the parser interacts with the scanner in a "coroutine" manner: whenever the parser needs a token, it calls the scanner to get one, and it leaves the scanner aside while it is busy with shift/reduce. A natural question is why not let the scanner produce the whole token stream (from the input byte stream) and then pass the entire token stream to the parser, so that there is no explicit interaction between the two? I can imagine that there are some drawbacks to this approach, and I can also see some benefits.

My question is: is there a "comprehensive" discussion of this aspect, or is there any compiler implementation that uses a scanner/parser interaction scheme other than the "coroutine" manner?

In the traditional arrangement, the parser calls the scanner whenever it needs a token.

That's the same logic the scanner (like many other programs) uses when it calls the I/O library every time it needs more input. That's not usually described as a coroutine, and I'm not convinced it's an accurate description of the parser/scanner interaction either.

In coroutine control flow, two functions call each other in tandem. That's not usually the way I/O is handled. The fread() interface does maintain state for the next call (the file position, at least, and maybe a buffer), but the calls are self-contained.

In a sense, there is no difference between calling yylex() to get the next token and calling scanf() to get the next data value.
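To make the analogy concrete, here is a minimal hand-written sketch of the pull arrangement. It is not flex or bison output: the names next_token(), parse_sum(), struct scanner and the token kinds are all hypothetical, chosen only for illustration. The point is that the scanner keeps a little state between calls, much as a FILE does, while each call is an ordinary self-contained function call.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch. */
enum token { TOK_EOF, TOK_NUM, TOK_PLUS, TOK_ERROR };

/* Scanner state kept between calls, much as a FILE keeps a position. */
struct scanner {
    const char *p;   /* next unread character */
    long value;      /* value of the most recent TOK_NUM */
};

/* Return the next token. Each call runs to completion and returns. */
enum token next_token(struct scanner *s)
{
    while (isspace((unsigned char)*s->p))
        s->p++;
    if (*s->p == '\0')
        return TOK_EOF;
    if (*s->p == '+') {
        s->p++;
        return TOK_PLUS;
    }
    if (isdigit((unsigned char)*s->p)) {
        s->value = 0;
        while (isdigit((unsigned char)*s->p))
            s->value = s->value * 10 + (*s->p++ - '0');
        return TOK_NUM;
    }
    return TOK_ERROR;
}

/* Pull parser for the toy grammar NUM ('+' NUM)*.
   It asks the scanner for a token whenever it needs one.
   Returns 0 and stores the sum on success, -1 on a syntax error. */
int parse_sum(const char *text, long *result)
{
    assert(text != NULL && result != NULL);

    struct scanner s = { text, 0 };
    long sum = 0;
    enum token t = next_token(&s);

    for (;;) {
        if (t != TOK_NUM)
            return -1;
        sum += s.value;
        t = next_token(&s);
        if (t == TOK_EOF)
            break;
        if (t != TOK_PLUS)
            return -1;
        t = next_token(&s);
    }
    *result = sum;
    return 0;
}
```

Calling parse_sum("1 + 22 + 3", &r) stores 26 in r and returns 0. Nothing here suspends or resumes: the parser is simply a caller of next_token(), in the same way that a program is a caller of scanf().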

This is not always the most convenient architecture for a scanner. Sometimes, it would be convenient for the scanner to be able to feed tokens into the parser. A typical use case is when the scanner is generating tokens, for example through macro expansion, but sometimes it is just that the match of a single scanner pattern contains more than one token.

Many parser generators, including Bison, can generate callable parsers, usually called "push parsers". In this model, the scanner calls the parser with each successive token. This is still not a coroutine model, really; it is just control-flow inversion. In the analogy with ordinary I/O, it's the equivalent of taking a data processor which called fgets() to read each input line and rewriting it as a process_line() function which is given a line of data to process (and thus does not interact with the I/O library). An early implementation of push parsing can be found in the Lemon parser generator.
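As a rough illustration of that inversion, here is a small hand-written stand-in for a push parser. It is not the interface Bison or Lemon generates: parser_init(), parser_push(), scan_and_push() and the status codes are hypothetical names for this sketch. Because the parser returns to its caller after every token, all of its state has to live in a struct rather than on the call stack.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds and status codes for this sketch. */
enum token { TOK_EOF, TOK_NUM, TOK_PLUS };
enum status { PARSE_MORE, PARSE_DONE, PARSE_ERROR };

/* The parser returns between tokens, so its state is stored here. */
struct push_parser {
    int want_number;   /* 1 if the next token must be a TOK_NUM */
    long sum;          /* running result for NUM ('+' NUM)* */
};

void parser_init(struct push_parser *ps)
{
    assert(ps != NULL);
    ps->want_number = 1;
    ps->sum = 0;
}

/* Consume exactly one token and report whether more input is wanted. */
enum status parser_push(struct push_parser *ps, enum token t, long value)
{
    if (ps->want_number) {
        if (t != TOK_NUM)
            return PARSE_ERROR;
        ps->sum += value;
        ps->want_number = 0;
        return PARSE_MORE;
    }
    if (t == TOK_PLUS) {
        ps->want_number = 1;
        return PARSE_MORE;
    }
    return t == TOK_EOF ? PARSE_DONE : PARSE_ERROR;
}

/* The scanner is now in control: it calls the parser once per token. */
enum status scan_and_push(const char *p, struct push_parser *ps)
{
    enum status st = PARSE_MORE;

    while (st == PARSE_MORE) {
        while (isspace((unsigned char)*p))
            p++;
        if (*p == '\0') {
            st = parser_push(ps, TOK_EOF, 0);
        } else if (*p == '+') {
            p++;
            st = parser_push(ps, TOK_PLUS, 0);
        } else if (isdigit((unsigned char)*p)) {
            long v = 0;
            while (isdigit((unsigned char)*p))
                v = v * 10 + (*p++ - '0');
            st = parser_push(ps, TOK_NUM, v);
        } else {
            st = PARSE_ERROR;   /* character this sketch does not know */
        }
    }
    return st;
}
```

After parser_init(&ps), the call scan_and_push("4 + 5 + 6", &ps) returns PARSE_DONE and leaves 15 in ps.sum. A scanner action that needs to emit several tokens for one match can simply call parser_push() several times, which is awkward to arrange in the pull model.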

Coroutine-like control flow could be useful for creating a parser whose eventual input stream must be handled asynchronously. But that doesn't really require coroutining between the parser and the scanner; rather, it requires coroutining between the scanner and the input stream. Again, coroutining is not really necessary and might be overkill: inverting control flow should suffice. Flex does not provide a "push scanner" interface, but other scanner generators do. I believe this feature is supported by re2c, for example.
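The same inversion applied to the scanner might look like the sketch below. This is a hand-written toy, not the interface of re2c or any other generator; scanner_init(), scanner_feed(), scanner_finish() and the number_sink callback are hypothetical names. The input arrives in arbitrary chunks, so the scanner has to remember a partially scanned token between calls.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sink, called once for each completed number token.
   In a full design this could be a push parser's entry point. */
typedef void (*number_sink)(long value, void *user);

struct push_scanner {
    int in_number;     /* 1 while a number is only partially scanned */
    long value;        /* digits accumulated so far */
    number_sink sink;  /* where completed tokens are sent */
    void *user;        /* opaque pointer handed back to the sink */
};

void scanner_init(struct push_scanner *sc, number_sink sink, void *user)
{
    assert(sc != NULL && sink != NULL);
    sc->in_number = 0;
    sc->value = 0;
    sc->sink = sink;
    sc->user = user;
}

/* Feed one chunk of input. A token may start in one chunk and end in
   another, so nothing is emitted until a non-digit is seen. */
void scanner_feed(struct push_scanner *sc, const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (isdigit(c)) {
            sc->value = sc->in_number ? sc->value * 10 + (c - '0')
                                      : (c - '0');
            sc->in_number = 1;
        } else if (sc->in_number) {
            sc->sink(sc->value, sc->user);
            sc->in_number = 0;
        }
        /* Any other character is treated as a separator in this sketch. */
    }
}

/* Signal end of input so that a trailing token is flushed. */
void scanner_finish(struct push_scanner *sc)
{
    if (sc->in_number) {
        sc->sink(sc->value, sc->user);
        sc->in_number = 0;
    }
}

/* Example sink: adds each number to the long that user points to. */
void add_to_total(long value, void *user)
{
    *(long *)user += value;
}
```

Feeding the chunks "12 3" and "4 5" and then calling scanner_finish() delivers the tokens 12, 34 and 5 to the sink: the number 34 straddles the chunk boundary. An event loop can call scanner_feed() whenever data arrives, with no coroutines involved.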