I'm new to Scheme and functional programming in general. Can someone explain this code, specifically what kons and knil are? The goal is to flatten a list of lists.
(define (fold1 kons knil lst)
  (if (null? lst)
      knil
      (fold1 kons (kons (car lst) knil) (cdr lst))))
I'm fairly certain kons is a function, since it's being applied to two arguments, but I'm still not sure what it does.
This is a (weird) fold
This is a generalized folding procedure. In Lisps, lists are represented by cons cells and the empty list: each (proper) list is either the empty list (), or a cons cell whose car is an element of the list and whose cdr is the rest of the list. E.g., the list (1 2 3 4 5) can be produced by
(cons 1 (cons 2 (cons 3 (cons 4 (cons 5 '())))))
The fold1 function that you've shown:
(define (fold1 kons knil lst)
  (if (null? lst)
      knil
      (fold1 kons (kons (car lst) knil) (cdr lst))))
is a way of taking a list like the one shown above and transforming it to:
(kons 5 (kons 4 (kons 3 (kons 2 (kons 1 knil)))))
This is a fold, a generalization of many list operations. For instance, if you use 0 as knil and + as kons, you compute the sum of the elements in the list.
Usually folds are right or left associative. A proper left-associative fold would transform to
(kons (kons (kons (kons (kons knil 1) 2) 3) 4) 5)
which might be clearer when viewed with + and infix notation:
(((((0 + 1) + 2) + 3) + 4) + 5)
The right associative fold would become
(1 + (2 + (3 + (4 + (5 + 0)))))
The left associative fold can be more efficient because the natural implementation is tail recursive, and elements are consumed in the order in which they can be extracted from the list. E.g., in the proper left associative example, (kons knil 1) can be evaluated first to produce some value v, and then, in the same stack space, (kons v 2) can be evaluated, and so on. The right associative method requires traversing to the end of the list first, so a naïve implementation requires stack space proportional to the length of the list.
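To make the two shapes concrete, here is a minimal sketch of both folds. The names fold-left* and fold-right* are made up for this illustration (many Schemes ship their own folds under other names), and list is used as the combining function so that the nesting stays visible.

```scheme
;; Left fold: tail recursive; the accumulator is the FIRST argument to kons.
;; (fold-left* is a hypothetical name chosen for this sketch.)
(define (fold-left* kons knil lst)
  (if (null? lst)
      knil
      (fold-left* kons (kons knil (car lst)) (cdr lst))))

;; Right fold: not tail recursive; the recursive call must finish
;; before kons can be applied, so the stack grows with the list.
;; (fold-right* is likewise a hypothetical name.)
(define (fold-right* kons knil lst)
  (if (null? lst)
      knil
      (kons (car lst) (fold-right* kons knil (cdr lst)))))

(fold-left* list 0 '(1 2 3))   ; => (((0 1) 2) 3)
(fold-right* list 0 '(1 2 3))  ; => (1 (2 (3 0)))
(fold-left* - 0 '(1 2 3))      ; => -6, i.e. (((0 - 1) - 2) - 3)
(fold-right* - 0 '(1 2 3))     ; => 2,  i.e. (1 - (2 - (3 - 0)))
```

With an associative operation such as + the two results coincide; subtraction is used here only because it makes the grouping observable.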
This fold1 mixes things up a bit, because it's processing the elements of the list in a left associative manner, but the order of the arguments to the combining function is reversed.
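One way to see the mix-up is to pass list as kons, so that the result records the order in which the arguments arrived. This is only an illustration; it reuses fold1 exactly as defined in the question.

```scheme
(define (fold1 kons knil lst)
  (if (null? lst)
      knil
      (fold1 kons (kons (car lst) knil) (cdr lst))))

;; Elements are consumed left to right, but each element is the FIRST
;; argument to kons and the accumulator is the SECOND.
(fold1 list 'z '(a b c))   ; => (c (b (a z)))

;; With cons as kons, the same ordering reverses the list.
(fold1 cons '() '(a b c))  ; => (c b a)
```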
This type of definition can be used any time that you have an algebraic data type. Since a list in Lisps is either the empty list, or an element and a list combined with cons, you can write a function that handles each of these cases, and produces a new value by “replacing” cons with a combination function and the empty list with some designated value.
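As a small sketch of this “replacement” idea, the right fold below (list-fold-right is a name chosen just for this example) rebuilds the list when given cons and '(), and computes something else as soon as either one is swapped out.

```scheme
;; list-fold-right is a hypothetical name; it is an ordinary right fold.
(define (list-fold-right kons knil lst)
  (if (null? lst)
      knil
      (kons (car lst) (list-fold-right kons knil (cdr lst)))))

;; cons stays cons, '() stays '(): the list is copied unchanged.
(list-fold-right cons '() '(1 2 3))                  ; => (1 2 3)

;; cons becomes +, '() becomes 0: the sum of the elements.
(list-fold-right + 0 '(1 2 3))                       ; => 6

;; cons becomes "ignore the element, add one": the length.
(list-fold-right (lambda (x n) (+ n 1)) 0 '(1 2 3))  ; => 3
```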
Flattening a list of lists
So, if you've got a list of lists, e.g., ((a b) (c d) (e f)), it's constructed by
(cons '(a b) (cons '(c d) (cons '(e f) '())))
With a right associative fold, you transform it to:
(append '(a b) (append '(c d) (append '(e f) '())))
by using append for kons, and '() for knil. However, in this slightly mixed up fold, your structure will be
(kons '(e f) (kons '(c d) (kons '(a b) knil)))
so knil can still be '(), but kons will need to be a function that calls append, but swaps the argument order:
(define (flatten lists)
  (fold1 (lambda (right left)
           (append left right))
         '()
         lists))
And so we have:
(flatten '((a b) (c d) (e f)))
;=> (a b c d e f)
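To see why the swap matters, compare it with passing append directly. This sketch repeats fold1 so that it can be run on its own; flatten-unswapped and flatten-1 are hypothetical names used only to keep the two variants apart.

```scheme
(define (fold1 kons knil lst)
  (if (null? lst)
      knil
      (fold1 kons (kons (car lst) knil) (cdr lst))))

;; Naive version: append receives (sublist accumulator), so each new
;; sublist lands IN FRONT of what has been collected so far.
(define (flatten-unswapped lists)
  (fold1 append '() lists))

;; Swapped version, as in the answer: the accumulator goes first.
(define (flatten-1 lists)
  (fold1 (lambda (right left)
           (append left right))
         '()
         lists))

(flatten-unswapped '((a b) (c d) (e f)))  ; => (e f c d a b)
(flatten-1 '((a b) (c d) (e f)))          ; => (a b c d e f)
```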
Flattening deeper lists of lists
Given that this is a folding exercise, I expected that the lists of lists would be nested only one layer deep. However, since we've seen how to implement a simple flatten
(define (flatten lists)
  (fold1 (lambda (right left)
           (append left right))
         '()
         lists))
we can modify this to make sure that deeper lists are flattened, too. The kons function now
(lambda (right left)
  (append left right))
simply appends the two lists together. left is the already appended and flattened list that we've been building up. right is the new component that we're tacking on now. If we also call flatten on right, arbitrarily nested lists should be flattened as well:
(define (flatten lists)
  (fold1 (lambda (right left)
           (append left (flatten right))) ; recursively flatten sublists
         '()
         lists))
This is almost right, except that now when we call (flatten '((a b) (c d))), we'll end up making a call to (flatten '(a b)), which will in turn make a call to (flatten 'a). But flatten is a wrapper for fold1, and fold1 expects its last argument to be a list. We need to decide what to do when flatten is called with a non-list. A simple approach is to have it return a list containing the non-list argument. That return value will mesh nicely with the append that's receiving the value.
;; lists is not necessarily a list of lists anymore; perhaps a better name should be chosen
(define (flatten lists)
  (cond ((null? lists) '())                 ; empty list: contributes no elements
        ((not (pair? lists)) (list lists))  ; non-list: wrap it so append can use it
        (else
         (fold1 (lambda (right left)
                  (append left (flatten right)))
                '()
                lists))))
Now we have
(flatten '(a (b (c)) (((d)))))
;=> (a b c d)
The procedure shown is an implementation of fold:
In functional programming, fold – also known variously as reduce, accumulate, aggregate, compress, or inject – refers to a family of higher-order functions that analyze a recursive data structure and recombine through use of a given combining operation the results of recursively processing its constituent parts, building up a return value
Take note:
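- The kons parameter is a two-argument function that's used for "combining" the current element of the list being processed with the accumulated value
- The knil parameter is the initial accumulated value; each recursive call passes the updated result along in its place, and that result is returned once the list is empty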
To see how this works, imagine for a moment that we have a function such as this:
(define (reverse knil lst)
  (if (null? lst)
      knil
      (reverse (cons (car lst) knil) (cdr lst))))
(reverse '() '(1 2 3 4))
=> '(4 3 2 1)
In the above, knil is used to accumulate the result, and it starts with the value '() because we're building a list as output. The role of kons is played by cons, which builds lists. Let's see another example:
(define (add knil lst)
  (if (null? lst)
      knil
      (add (+ (car lst) knil) (cdr lst))))
(add 0 '(1 2 3 4))
=> 10
In the above, knil is used to accumulate the result, and it starts with the value 0 because we're building a number as output. The role of kons is played by +, which adds numbers.
By now you must have realized that both examples share the same structure: both consume an input list, and the only things that change are how we "combine" the values pulled from the list and the starting accumulated value. If we're smart, we can factor out the parts that change into a higher-order procedure that receives them as parameters. Thus fold1 is born:
(define (fold1 kons knil lst)
  (if (null? lst)
      knil
      (fold1 kons (kons (car lst) knil) (cdr lst))))
Both of the above examples can easily be expressed in terms of fold1 by passing along the right parameters:
(define (reverse lst)
  (fold1 cons '() lst))
(define (add lst)
  (fold1 + 0 lst))
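Other list operations fit the same mold. In the following sketch only kons and knil vary; the names count-items and largest are invented here for illustration.

```scheme
(define (fold1 kons knil lst)
  (if (null? lst)
      knil
      (fold1 kons (kons (car lst) knil) (cdr lst))))

;; Count the elements: ignore each element and bump the accumulator.
;; (count-items is a hypothetical name.)
(define (count-items lst)
  (fold1 (lambda (x acc) (+ acc 1)) 0 lst))

;; Largest element of a NON-EMPTY list: seed the accumulator with the
;; first element and fold max over the rest. (largest is hypothetical.)
(define (largest lst)
  (fold1 max (car lst) (cdr lst)))

(count-items '(a b c d))  ; => 4
(largest '(3 9 2 7))      ; => 9
```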
Now for the second part of the question: if you want to flatten a list with fold1, you can try this:
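(define (helper x lst)
  (cond ((null? x) lst)                   ; empty sublist: contributes nothing
        ((pair? x) (fold1 helper lst x))  ; sublist: fold its elements in
        (else (cons x lst))))             ; atom: push onto the reversed result
(define (flatten lst)
  (reverse (helper lst '())))
(flatten '(1 2 (3) (4 (5)) 6))
=> '(1 2 3 4 5 6)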
The following code, which uses a named let and Racket's for loop, can be used to flatten a list whose elements may themselves be lists:
(define (myflatten ll)
  (define ol '())
  (let loop ((ll ll))
    (for ((i ll))
      (if (list? i)
          (loop i)
          (set! ol (cons i ol)))))
  (reverse ol))
(myflatten '(a () (b e (c)) (((d)))))
Output:
'(a b e c d)
However, it uses set!, which is generally discouraged in functional code.
The for loop can also be replaced by named-let recursion:
(define (myflatten ll)
  (define ol '())
  (let outer ((ll ll))
    (let inner ((il ll))
      (cond
        [(empty? il)]
        [(list? (first il))
         (outer (first il))
         (inner (rest il))]
        [else
         (set! ol (cons (first il) ol))
         (inner (rest il))])))
  (reverse ol))
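For comparison, here is one possible way to avoid set! entirely, sketched in plain Scheme: the result list is threaded through the recursion as an extra argument instead of being mutated. The names flatten-onto and myflatten-pure are hypothetical, and this is a sketch rather than a drop-in replacement for the code above.

```scheme
;; Prepend the flattened contents of x onto acc, without mutation.
;; (flatten-onto is a hypothetical helper name.)
(define (flatten-onto x acc)
  (cond ((null? x) acc)                       ; nothing left to add
        ((pair? x)                            ; flatten the tail first, then
         (flatten-onto (car x)                ; put the head in front of it
                       (flatten-onto (cdr x) acc)))
        (else (cons x acc))))                 ; atom: goes straight onto acc

(define (myflatten-pure ll)
  (flatten-onto ll '()))

(myflatten-pure '(a () (b e (c)) (((d)))))  ; => (a b e c d)
```

Because the tail is processed before the head is consed on, the elements come out in their original order and no final reverse is needed.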