I'm wondering about the purpose, or perhaps more correctly, the tasks of the "reader" during interpretation/compilation of Lisp programs.
From the pre-question-research I've just done, it seems to me that a reader (particular to Clojure in this case) can be thought of as a "syntactic preprocessor". It's main duties are the expansion of reader macros and primitive forms. So, two examples:
'cheese --> (quote cheese)
{"a" 1 "b" 2} --> (array-map "a" 1 "b" 2)
So the reader takes in the text of a program (consisting of S-Expressions) and then builds and returns an in-memory data-structure that can be evaluated directly.
How far from the truth is this (and have I over-simplified the whole process)? What other tasks does the reader perform? Considering a virtue of Lisps is their homoiconicity (code as data), why is there a need for lexical analysis (if such is indeed comparable to the job of the reader)?
Thanks!
Generally the reader in Lisp reads s-expressions and returns data structures. READ is an I/O operation: Input is a stream of characters and output is Lisp data.
The printer does the opposite: it takes Lisp data and outputs those as a stream of characters. Thus it can also print Lisp data to external s-expressions.
Note that interpretation means something specific: executing code by an Interpreter. But many Lisp systems (including Clojure) are using a compiler. The tasks of computing a value for a Lisp form is usually called evaluation. Evaluation can be implemented by interpretation, by compilation or by a mix of both.
S-Expression: symbolic expressions. External, textual representation of data. External means that s-expressions are what you see in text files, strings, etc. So s-expressions are made of characters on some, typically external, medium.
Lisp data structures: symbols, lists, strings, numbers, characters, ...
Reader: reads s-expressions and returns Lisp data structures.
Note that s-expressions also are used to encode Lisp source code.
In some Lisp dialects the reader is programmable and table driven (via the so-called read table). This read table contains reader functions for characters. For example the quote ' character is bound to a function that reads an expression and returns the value of (list 'quote expression). The number characters 0..9 are bound to functions that read a number (in reality this might be more complex, since some Lisps allow numbers to be read in different bases).
S-expressions provide the external syntax for data structures.
Lisp programs are written in external form using s-expressions. But not all s-expressions are valid Lisp programs:
the syntax of Lisp usually is defined on top of Lisp data.
IF has for example the following syntax (in Common Lisp http://www.lispworks.com/documentation/HyperSpec/Body/s_if.htm ):
So it expects a test-form, a then-form and an optional else-form.
As s-expressions the following are valid IF expressions:
But since Lisp programs are forms, we can also construct these forms using Lisp programs:
(list 'if '(foo) 1 2) is a Lisp program that returns a valid IF form.
This list can for example be executed with EVAL. EVAL expects list forms - not s-expressions. Remember s-expressions are only an external representation. To create a Lisp form, we need to READ it.
This is why it is said code is data. Lisp forms are expressed as internal Lisp data structures: lists, symbols, numbers, strings, .... In most other programming languages code is raw text. In Lisp s-expressions are raw text. When read with the function READ, s-expressions are turned into data.
Thus the basic interaction top-level in Lisp is called REPL, Read Eval Print Loop. It is a LOOP that repeatedly reads an s-expression, evaluates the lisp form and prints it:
So the most primitive REPL is:
Thus from a conceptual point of view, to answer your question, during evaluation the reader does nothing. It is not involved in evaluation. Evaluation is done by the function EVAL. The reader is invoked by a call to READ. Since EVAL uses Lisp data structures as input (and not s-expressions) the reader is run before the Lisp form gets evaluated (for example by interpretation or by compiling and executing it).