I have a dependency graph that I have represented as a Map<Node, Collection<Node>>
(in Java-speak, or f(Node n) -> Collection[Node]
as a function; this is a mapping from a given node n
to a collection of nodes that depend on n
). The graph is potentially cyclic*.
Given a list badlist
of nodes, I would like to solve a reachability problem: i.e. generate a Map<Node, Set<Node>> badmap
that represents a mapping from each node N in the list badlist
to a set of nodes which includes N or other node that transitively depends on it.
Example:
(x -> y means node y depends on node x)
n1 -> n2
n2 -> n3
n3 -> n1
n3 -> n5
n4 -> n2
n4 -> n5
n6 -> n1
n7 -> n1
This can be represented as the adjacency map {n1: [n2], n2: [n3], n3: [n1, n5], n4: [n2, n5], n6: [n1], n7: [n1]}
.
If badlist = [n4, n5, n1]
then I expect to get badmap = {n4: [n4, n2, n3, n1, n5], n5: [n5], n1: [n1, n2, n3, n5]}
.
I'm floundering with finding graph algorithm references online, so if anyone could point me at an efficient algorithm description for reachability, I'd appreciate it. (An example of something that is not helpful to me is http://www.cs.fit.edu/~wds/classes/cse5081/reach/reach.html since that algorithm is to determine whether a specific node A is reachable from a specific node B.)
*cyclic: if you're curious, it's because it represents C/C++ types, and structures can have members which are pointers to the structure in question.
here's what I ended up using, based on @quaint's answer:
(requires a few Guava classes for convenience)
Just like with Christian Ammer, you take for
A
the adjacency matrix and use Boolean arithmetic, when doing the following, whereI
is the identity matrix.Furthermore, standard matrix multiplication (both arithmetical and logical) is
O(n^3)
, notO(n^2)
. But ifn <= 64
, you can sort of get rid of one factorn
, because you can do 64 bits in parallel on nowadays 64 bits machines. For larger graphs, 64 bits parallelism is useful, too, but shader techniques might even be better.EDIT: one can do 128 bits in parallel with SSE instructions, with AVX even more.
In Python:
Ordinary depth-first search or breadth-first search will do the trick: execute it once for each bad node.
You maybe should build a reachability matrix from your adjacency list for fast searches. I just found the paper Course Notes for CS336: Graph Theory - Jayadev Misra which describes how to build the reachability matrix from a adjacency matrix.
If
A
is your adjacency matrix, the reachability matrix would beR = A + A² + ... + A^n
wheren
is the number of nodes in the graph.A², A³, ...
can be calculated by:A² = A x A
A³ = A x A²
For the matrix multiplication the logical or is used in place of
+
and the logical and is used in place ofx
. The complexity is O(n^4).Here's a working Java solution: