Why is consing in Lisp slow?

Published 2019-03-17 20:02

Question:

I read in the book 'On Lisp' that one should avoid excessive use of cons in the body of expanded macros. Why is cons considered to be an inefficient operation? Does Lisp not do structure sharing with the cons cells?

Answer 1:

You need to understand the jargon first.

CONS is a primitive operation to create a fresh cons cell.

Consing means allocation of memory on the heap in general, not only of cons cells: consing of numbers, consing of arrays, consing of CLOS objects, and so on.
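
To make that concrete, here is a small sketch of forms that all "cons" in this broader sense (whether a particular implementation heap-allocates the bignum can vary):

(cons 1 2)                        ; a fresh cons cell
(list 1 2 3)                      ; three fresh cons cells
(make-array 1000)                 ; a fresh vector
(expt 2 200)                      ; a fresh bignum, typically heap-allocated
(make-instance 'standard-object)  ; a fresh CLOS instance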

The definition from the Memory Management Glossary says:

  1. cons(1) In Lisp, cons is a primitive operation creating a list element (from English "CONStruct"). By extension, a cons is the element created.
  2. cons(2) (for full details, see allocate) Allocation is the process of assigning resources. When requested to by the program, an application memory manager or allocator allocates a block of memory(2) for the program to store its data in. Allocation is also known as consing, from cons(1).

So, Graham uses the second meaning.

If Graham says 'Avoid consing especially. A utility which conses unnecessarily can ruin the performance of an otherwise efficient program.', that means: avoid unnecessary memory allocation on the heap. But this holds for any code - macro, function, whatever.

That was particularly true when computers had less memory and garbage collection was more costly (especially when it had to scan swapped-out memory in virtual-memory systems, where physical memory is smaller than virtual memory).

Today consing is less of an issue, but it can still be relevant.

Take, for example, the function READ-LINE: it reads a line from a stream and 'conses' a new string. Note that the string is not built out of conses, but is a vector; 'conses' here means 'allocates a string on the heap'. If you have a large file with lots of lines, it can be faster to have a routine that takes a buffer and fills it with the line's characters (Common Lisp has vectors with a fill pointer for this). That way the line buffer is just one object and can potentially be reused across calls to this line-reading function.

See this example in the Allegro CL documentation: Some techniques for reducing consing when processing very large text files.
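
For illustration, here is a minimal sketch of that buffer-reuse idea (READ-LINE-INTO and COUNT-LONG-LINES are made-up names for this example, not library functions):

;; Read each line into one adjustable string with a fill pointer,
;; reusing the same buffer instead of consing a fresh string per line.
(defun read-line-into (buffer stream)
  "Fill BUFFER with the next line from STREAM; return it, or NIL at end of file."
  (setf (fill-pointer buffer) 0)
  (loop for char = (read-char stream nil nil)
        until (or (null char) (char= char #\Newline))
        do (vector-push-extend char buffer)
        finally (return (when (or char (plusp (fill-pointer buffer)))
                          buffer))))

;; Usage: one buffer serves the whole file.
(defun count-long-lines (pathname)
  (let ((buffer (make-array 256 :element-type 'character
                                :adjustable t :fill-pointer 0)))
    (with-open-file (in pathname)
      (loop while (read-line-into buffer in)
            count (> (length buffer) 80)))))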



Answer 2:

I don't think cons is slow. It doesn't create a new copy of the entire list; it just prepends a new element to the front of a linked list.

If anything is slow, it's that cons has to allocate some memory for the new node. If you're creating lots and lots of nodes, it might be better to allocate them all in one go instead of one by one.
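
To make the structure-sharing point from the question concrete, here is a small REPL sketch (the variable names are just for illustration):

CL-USER> (defparameter *tail* (list 2 3 4))
*TAIL*
CL-USER> (defparameter *longer* (cons 1 *tail*))
*LONGER*
CL-USER> *longer*
(1 2 3 4)
CL-USER> (eq (cdr *longer*) *tail*)   ; only one new cell; the tail is shared
T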



Answer 3:

Actually, consing is quite fast in good implementations, but you can still gain performance by avoiding it. You can safely use destructive operations if the things you alter have been freshly created by you, as in this example:

CL-USER> (defun non-destructive ()
           (progn
             (reverse (loop for x from 1 to 100000 collecting x))
             nil))
NON-DESTRUCTIVE
CL-USER> (defun destructive ()
           (progn
             (nreverse (loop for x from 1 to 100000 collecting x))
             nil))
DESTRUCTIVE
CL-USER> (time (non-destructive))
(NON-DESTRUCTIVE) took 140 milliseconds (0.140 seconds) to run 
                    with 2 available CPU cores.
During that period, 156 milliseconds (0.156 seconds) were spent in user mode
                    0 milliseconds (0.000 seconds) were spent in system mode
94 milliseconds (0.094 seconds) was spent in GC.
 1,600,024 bytes of memory allocated.
NIL
CL-USER> (time (destructive))
(DESTRUCTIVE) took 93 milliseconds (0.093 seconds) to run 
                    with 2 available CPU cores.
During that period, 93 milliseconds (0.093 seconds) were spent in user mode
                    0 milliseconds (0.000 seconds) were spent in system mode
63 milliseconds (0.063 seconds) was spent in GC.
 800,024 bytes of memory allocated.
NIL

So: yes, avoiding consing can improve performance, but you should only use destructive modification if you know what you're doing. I would not say that consing is "slow" per se, but avoiding it can still be beneficial. If you compare the amount of memory allocated (allocation takes time, and so does collecting it afterwards), it should be clear why the second version is faster than the first.



Answer 4:

Consing, which is Lisp jargon for dynamic memory allocation, isn't slow as such, but it adds overhead to the program. You cannot just look at the cost of allocating the object; you have to consider the full cost of its lifecycle, from creating the object to reclaiming it when it has become garbage.

Unnecessary consing "exerts pressure" on the garbage collector; it makes garbage collection run more often.

(This is not specific to Lisp. For example, C programs that make many calls to malloc and free, or C++ programs that use new and delete heavily, will also not perform as well as similar programs that are organized to avoid many of those calls.)

You don't have to be paranoid about avoiding consing in Lisp: that would make programming inconvenient, and some of the tricks for avoiding it, such as destructive re-use of existing objects, are prone to bugs.

But it can pay to watch out for consing that happens in inner loops executed a large number of times, and in low-level code such as utility functions intended for wide reuse. For instance, suppose that mapcar internally built up the result list in reverse and then put it in the correct order by calling reverse instead of nreverse; then everything that calls mapcar would suffer from unnecessary consing.
But it can pay to watch out for in consing that happens in inner loops that are executed a large number of times, and in low-level code like utility functions that are intended for wide reuse. For instance suppose that mapcar internally built up the result list in reverse, and then put it in the correct order by calling reverse instead of nreverse. Then, everything that calls mapcar would suffer from unnecessary consing.