I was studying Lisp and I am not experienced in Lisp programming. In a part of my studies I encountered the below examples:
> (cons ‘a ‘(a b)) ----> (A A B)
> (cons ‘(a b) ‘a) ----> ((A B).A)
I was wondering why when we have (cons ‘a ‘(a b)) the response is (A A B) and why when we change it a little and put the 'a after (a b), the response is a dotted list like ((A B).A)? What is the difference between the first code line and the second one? What is going on behind these codes?
It's pretty easy to understand if you think of them as cons-cells.
In short, a cons cell consists of exactly two values. The normal notation for this is to use the dot, e.g.:
(cons 'a 'b) ==> (A . B)
But since lists are used so often in LISP, a better notation is to drop the dot.
Lists are made by having the second element be a new cons cell, with the last ending a terminator (usually nil, or '()
in Common Lisp). So these two are equal:
(cons 'a (cons 'b '())) ==> (A B)
(list 'a 'b) ==> (A B)
So (cons 'a 'b)
creates a cell [a,b]
, and (list 'a 'b)
will create [a, [b, nil]]
. Notice the convention for encoding lists in cons cells: They terminate with an inner nil
.
Now, if you cons 'a
onto the last list, you create a new cons cell containing [[a, [b, nil]], a]
. As this is not a "proper" list, i.e. it's not terminated with a nil
, the way to write it out is to use the dot: (cons '(a b) 'a) ==> ((a b) . a)
.
If the dot wasn't printed, it would have to have been a list with the structure [[a, [b, nil]], [a, nil]]
.
Your example
When you do (cons 'a '(a b))
it will take the symbol 'a
and the list '(a b)
and put them in a new cons cell. So this will consist of [a, [a, [b, nil]]]
. Since this naturally ends with an inner nil
, it's written without dots.
As for (cons '(a b) 'a)
, now you'll get [[a, [b, nil]], a]
. This does not terminate with an inner nil
, and therefore the dot notation will be used.
Can we use cons to make the last example end with an inner nil? Yes, if we do
(cons '(a b) (cons 'a '())) ==> ((A B) A)
And, finally,
(list '(a b) 'a))
is equivalent to
(cons (cons (cons 'a (cons 'b '())) (cons 'a '())))
See this visualization:
CL-USER 7 > (sdraw:sdraw '(A A B))
[*|*]--->[*|*]--->[*|*]--->NIL
| | |
v v v
A A B
CL-USER 8 > (sdraw:sdraw '((A B) . A))
[*|*]--->A
|
v
[*|*]--->[*|*]--->NIL
| |
v v
A B
Also:
CL-USER 9 > (sdraw:sdraw '(A B))
[*|*]--->[*|*]--->NIL
| |
v v
A B
CL-USER 10 > (sdraw:sdraw (cons 'A '(A B)))
[*|*]--->[*|*]--->[*|*]--->NIL
| | |
v v v
A A B
CL-USER 11 > (sdraw:sdraw (cons '(A B) 'A))
[*|*]--->A
|
v
[*|*]--->[*|*]--->NIL
| |
v v
A B
A list (a b c) is represented (stored internally) as three cons-cells: (cons 'a (cons 'b (cons 'c '())
. Note that the last pair has '() in its cdr.
Series of cons-cells whose last cdr is '() is printed as a list by the printer. The example is thus printed as (a b c).
Let's look at: (cons 'a '(a b))
.
The list '(a b) is represented as (cons 'a (cons 'b '()). This means that
(cons 'a '(a b))
produces: (cons 'a (cons 'a (cons 'b '()))
.
Let's look at: (cons '(a b) 'a)
.
The list '(a b) is represented as (cons 'a (cons 'b '()). This means that
(cons (cons '(a b) 'a))
produces (cons (cons 'a (cons 'b '()) 'a)
.
Note that this series does not end in '()
. To show that the printer uses dot notation. ( ... . 'a)
means that a value consists of a series of cons-cells and that the last cdr contains 'a
. The value (cons (cons 'a (cons 'b '()) 'a)
is thus printed as '((a b) . a)
.
A cons
is a data structure that can contain two values. Eg (cons 1 2) ; ==> (1 . 2)
. The first part is car
, the second is cdr
. A cons
is a list
if it's cdr
is either nil
or a list
. Thus
(1 . (2 . (3 . ())))
is a list.
When printing cons
the dot is omitted when the cdr
is a cons
or nil
. The outer parentheses of the cdr
is also omitted. Thus (3 . ())
is printed (3)
and (1 . (2 . (3 . ())))
is printed (1 2 3)
. It's the same structure, but with different visualization. A cons
in the car
does not have this rule.
The read function reads cons
with dot and the strange exceptional print format when the cdr
is a list. It will at read time behave as if it were cons
.
With a special rule for both read
and print
the illusion of a list is complete even when it's chains of cons
.
(cons ‘a ‘(a b)) ----> (A . (A B))
(cons ‘(a b) ‘a) ----> ((A B) . A)
When printing, the first is one list of 3 elements since the cdr
is a list.