In purely functional languages, data is immutable. With reference counting, creating a reference cycle requires changing already created data. It seems like purely functional languages could use reference counting without worrying about the possibility of cycles. Am is right? If so, why don't they?
I understand that reference counting is slower than GC in many cases, but at least it reduces pause times. It would be nice to have the option to use reference counting in cases where pause times are bad.
Relative to other managed languages like Java and C#, purely functional languages allocate like crazy. They also allocate objects of different sizes. The fastest known allocation strategy is to allocate from contiguous free space (sometimes called a "nursery") and to reserve a hardware register to point to the next available free space. Allocation from the heap becomes as fast as allocation from a stack.
Reference counting is fundamentally incompatible with this allocation strategy. Ref counting puts objects on free lists and takes them off again. Ref counting also has substantial overheads required for updating ref counts as new objects are created (which, as noted above, pure functional languages do like crazy).
Reference counting tends to do really well in situations like these:
To understand how the best high-performance ref-counting systems work today, look up the work of David Bacon and Erez Petrank.
Reference counting is MUCH slower than GC because it's not good for CPU. And GC most of the time can wait for idle time and also GC can be concurrent (on another thread). So that's the problem - GC is least evil and lots of tries shown that.
Your question is based on a faulty assumption. It's perfectly possible to have circular references and immutable data. Consider the following C# example which uses immutable data to create a circular reference.
This type of trick can be done in many functional languages and hence any collection mechanism must deal with the possibility of circular references. I'm not saying a ref counting mechanism is impossible with a circular reference, just that it must be dealt with.
Edit by ephemient
In response to the comment... this is trivial in Haskell
and barely any more effort in SML.
No mutation required.
There are a few things, I think.
(Once upon a time I think I maybe really 'knew' this, but now I am trying to remember/speculate, so don't take this as any authority.)
Not quite. You can create cyclic data structures using purely functional programming simply by defining mutually-recursive values at the same time. For example, in OCaml:
However, it is possible to define languages that make it impossible to create cyclic structures by design. The result is known as a unidirectional heap and its primary advantage is that garbage collection can be as simple as reference counting.
Some languages do prohibit cycles and use reference counting. Erlang and Mathematica are examples.
For example, in Mathematica when you reference a value you make a deep copy of it so mutating the original does not mutate the copy:
Consider this allegory told about David Moon, an inventor of the Lisp Machine: