It's easy enough to represent a tree or list in haskell using algebraic data types. But how would you go about typographically representing a graph? It seems that you need to have pointers. I'm guessing you could have something like
type Nodetag = String
type Neighbours = [Nodetag]
data Node a = Node a Nodetag Neighbours
And that would be workable. However it feels a bit decoupled; The links between different nodes in the structure don't really "feel" as solid as the links between the current previous and next elements in a list, or the parents and children of a node in a tree. I have a hunch that doing algebraic manipulations on the graph as I defined it would be somewhat hindered by the level of indirection introduced through the tag system.
It is primarily this feeling of doubt and perception of inelegance that causes me to ask this question. Is there a better/more mathematically elegant way of defining graphs in Haskell? Or have I stumbled upon something inherently hard/fundamental? Recursive data structures are sweet, but this seems to be something else. A self referential data structure in a different sense to how trees and lists are self referential. It's like lists and trees are self referential at the type level, but graphs are self referential at the value level.
So what's really going on?
In shang's answer you can see how to represent a graph using laziness. The problem with these representations is that they are very difficult to change. The knot-tying trick is useful only if you're going to build a graph once, and afterward it never changes.
In practice, should I actually want to do something with my graph, I use the more pedestrian representations:
If you're going to be changing or editing the graph frequently, I recommend using a representation based on Huet's zipper. This is the representation used internally in GHC for control-flow graphs. You can read about it here:
An Applicative Control-Flow Graph based on Huet's Zipper
Hoopl: A Modular, Reusable Library for Dataflow Analysis and Transformation
A few others have briefly mentioned
fgl
and Martin Erwig's Inductive Graphs and Functional Graph Algorithms, but it's probably worth writing up an answer that actually gives a sense of the data types behind the inductive representation approach.In his paper, Erwig presents the following types:
(The representation in
fgl
is slightly different, and makes good use of typeclasses - but the idea is essentially the same.)Erwig is describing a multigraph in which nodes and edges have labels, and in which all edges are directed. A
Node
has a label of some typea
; an edge has a label of some typeb
. AContext
is simply (1) a list of labeled edges pointing to a particular node, (2) the node in question, (3) the node's label, and (4) the list of labeled edges pointing from the node. AGraph
can then be conceived of inductively as eitherEmpty
, or as aContext
merged (with&
) into an existingGraph
.As Erwig notes, we can't freely generate a
Graph
withEmpty
and&
, as we might generate a list with theCons
andNil
constructors, or aTree
withLeaf
andBranch
. Too, unlike lists (as others have mentioned), there's not going to be any canonical representation of aGraph
. These are crucial differences.Nonetheless, what makes this representation so powerful, and so similar to the typical Haskell representations of lists and trees, is that the
Graph
datatype here is inductively defined. The fact that a list is inductively defined is what allows us to so succinctly pattern match on it, process a single element, and recursively process the rest of the list; equally, Erwig's inductive representation allows us to recursively process a graph oneContext
at a time. This representation of a graph lends itself to a simple definition of a way to map over a graph (gmap
), as well as a way to perform unordered folds over graphs (ufold
).The other comments on this page are great. The main reason I wrote this answer, however, is that when I read phrases such as "graphs are not algebraic," I fear that some readers will inevitably come away with the (erroneous) impression that no one's found a nice way to represent graphs in Haskell in a way that permits pattern matching on them, mapping over them, folding them, or generally doing the sort of cool, functional stuff that we're used to doing with lists and trees.