Sharing means that temporary data is stored if it is going to be used multiple times. That is, a function evaluates it's arguments only once. An example would be:
let x = sin x in x * x
What other features contribute to sharing and how would they interact with the need for practical programs to perform IO?
The clearest example of sharing in functional programming comes from Clean, which is based on graph rewriting. There, a computation refers to a DAG, so we can view the expression (sin x) * (sin x)
as
(*)
/ \
sin x sin x
Graph rewriting systems have an explicit notion of sharing, so we could express that computation as
(*)
/ \
\ /
sin x
pointing the multiplication node to the same node, thereby sharing the computation of sin x
. Term rewriting systems do not have so explicit a notion of sharing, but the optimization is still relevant. In GHC, one can sometimes express sharing with local variables, e.g. rewriting
f x = (sin x) * (sin x)
into
f x = sinx * sinx
where sinx = sin x
But since the two are semantically equivalent, the compiler is free to implement both the same way, with or without sharing. By my understanding, the GHC will generally preserve sharing introduced with local variables and sometimes introduce it (adding sharing to the first), but with no formal expression of sharing in term rewriting systems either behaviour is implementation dependent (see tel's comment and answer).
Sharing touches IO because side-effecting values cannot be shared. If we consider an impure language, there is a difference between
(string-append (read-line)
(read-line))
and
(let ((s (read-line)))
(string-append s s))
The first executes the IO action twice, requesting two lines from the user and appending them; the second "shares" the IO action, executing it once and appending it to itself. In general, sharing a pure computation reduces execution time without changing the result of the program, while sharing a side-effecting value (one that may change over time, or interacts with the user) changes the result. For the compiler to automatically share computations, it needs to know that they are pure, and thus that reducing the number of evaluations does not matter. Clean does this with uniqueness types; an IO action has the type attribute UNQ, which tells the compiler that it should not be shared. Haskell does the same somewhat differently with the IO monad.
Sharing is about a kind of equality: pointer equality. In Haskell's value land (semantic interpretation) there is no such thing as sharing. Values can only be equal if they have instances of Eq
and then "equality" is defined to be the binary relation (==)
.
Sharing thus escapes the semantic interpretation by referring to this underlying pointer equality which exists by virtue of implementation instead of semantics. This is a useful side effect sometimes, though. Unfortunately, since Haskell is defined by its semantic interpretation, use of sharing is undefined behavior. It's tied to a particular implementation. A few libraries use sharing and thus have behavior tied to GHC.
There is one built-in sharing mechanism, though. This is exposed by the StableName
interface. We generate StableName a
objects using makeStableName :: a -> IO (StableName a)
and have an instance Eq (StableName a)
—thus StableName
introduces some kind of equality for any type.
StableName
equality is almost pointer equality. To quote the Haddock documentation
If sn1 :: StableName
and sn2 :: StableName
and sn1 == sn2
then sn1
and sn2
were created by calls to makeStableName
on the same object.
Note that this is just an if statement, not an if and only if. The fact that two things can be "pointer equivalent" but still have different stable names sometimes is (a) forced by the flexibility Haskell leaves to the GC and (b) a loophole that allows StableName
s to exist in any Haskell implementation even if there's no such thing as "pointer equality" in the implementation whatsoever.
These StableName
s still have no semantic meaning, but since they're introduced in the IO
monad that's "OK". Instead, they might be said to implement an (ironically) unstable subset of the smallest (most specific) equality relation possible on any type.
Your example is not an example of sharing -- it just multiplies a value with itself (and then throws the original value away).
Sharing is the case where some part of a data structure occurs twice in a larger data structure, or in different data structures. For example:
p = (1, 2)
pp = (p, p) -- p is shared between the first and second component of pp
xs = [1, 2, 3]
ys = 0::1::xs
zs = 5::xs -- ys and zs share the same tail
In memory, such sharing will result in a DAG structure.