Finding all cycles in a directed graph

2018-12-31 08:40发布

How can I find (iterate over) ALL the cycles in a directed graph from/to a given node?

For example, I want something like this:

A->B->A
A->B->C->A

but not: B->C->B

17条回答
回忆,回不去的记忆
2楼-- · 2018-12-31 09:13

There are two steps (algorithms) involved in finding all cycles in a DAG.

The first step is to use Tarjan's algorithm to find the set of strongly connected components.

  1. Start from any arbitrary vertex.
  2. DFS from that vertex. For each node x, keep two numbers, dfs_index[x] and dfs_lowval[x]. dfs_index[x] stores when that node is visited, while dfs_lowval[x] = min(dfs_low[k]) where k is all the children of x that is not the directly parent of x in the dfs-spanning tree.
  3. All nodes with the same dfs_lowval[x] are in the same strongly connected component.

The second step is to find cycles (paths) within the connected components. My suggestion is to use a modified version of Hierholzer's algorithm.

The idea is:

  1. Choose any starting vertex v, and follow a trail of edges from that vertex until you return to v. It is not possible to get stuck at any vertex other than v, because the even degree of all vertices ensures that, when the trail enters another vertex w there must be an unused edge leaving w. The tour formed in this way is a closed tour, but may not cover all the vertices and edges of the initial graph.
  2. As long as there exists a vertex v that belongs to the current tour but that has adjacent edges not part of the tour, start another trail from v, following unused edges until you return to v, and join the tour formed in this way to the previous tour.

Here is the link to a Java implementation with a test case:

http://stones333.blogspot.com/2013/12/find-cycles-in-directed-graph-dag.html

查看更多
余欢
3楼-- · 2018-12-31 09:15

In the case of undirected graph, a paper recently published (Optimal listing of cycles and st-paths in undirected graphs) offers an asymptotically optimal solution. You can read it here http://arxiv.org/abs/1205.2766 or here http://dl.acm.org/citation.cfm?id=2627951 I know it doesn't answer your question, but since the title of your question doesn't mention direction, it might still be useful for Google search

查看更多
春风洒进眼中
4楼-- · 2018-12-31 09:15

Javascript solution using disjoint set linked lists. Can be upgraded to disjoint set forests for faster run times.

var input = '5\nYYNNN\nYYYNN\nNYYNN\nNNNYN\nNNNNY'
console.log(input);
//above solution should be 3 because the components are
//{0,1,2}, because {0,1} and {1,2} therefore {0,1,2}
//{3}
//{4}

//MIT license, authored by Ling Qing Meng

//'4\nYYNN\nYYYN\nNYYN\nNNNY'

//Read Input, preformatting
var reformat = input.split(/\n/);
var N = reformat[0];
var adjMatrix = [];
for (var i = 1; i < reformat.length; i++) {
    adjMatrix.push(reformat[i]);
}

//for (each person x from 1 to N) CREATE-SET(x)
var sets = [];
for (var i = 0; i < N; i++) {
    var s = new LinkedList();
    s.add(i);
    sets.push(s);
}

//populate friend potentials using combinatorics, then filters
var people =  [];
var friends = [];
for (var i = 0; i < N; i++) {
    people.push(i);
}
var potentialFriends = k_combinations(people,2);
for (var i = 0; i < potentialFriends.length; i++){
    if (isFriend(adjMatrix,potentialFriends[i]) === 'Y'){
        friends.push(potentialFriends[i]);
    }
}


//for (each pair of friends (x y) ) if (FIND-SET(x) != FIND-SET(y)) MERGE-SETS(x, y)
for (var i = 0; i < friends.length; i++) {
    var x = friends[i][0];
    var y = friends[i][1];
    if (FindSet(x) != FindSet(y)) {
        sets.push(MergeSet(x,y));
    }
}


for (var i = 0; i < sets.length; i++) {
    //sets[i].traverse();
}
console.log('How many distinct connected components?',sets.length);



//Linked List data structures neccesary for above to work
function Node(){
    this.data = null;
    this.next = null;
}

function LinkedList(){
    this.head = null;
    this.tail = null;
    this.size = 0;

    // Add node to the end
    this.add = function(data){
        var node = new Node();
        node.data = data;
        if (this.head == null){
            this.head = node;
            this.tail = node;
        } else {
            this.tail.next = node;
            this.tail = node;
        }
        this.size++;
    };


    this.contains = function(data) {
        if (this.head.data === data) 
            return this;
        var next = this.head.next;
        while (next !== null) {
            if (next.data === data) {
                return this;
            }
            next = next.next;
        }
        return null;
    };

    this.traverse = function() {
        var current = this.head;
        var toPrint = '';
        while (current !== null) {
            //callback.call(this, current); put callback as an argument to top function
            toPrint += current.data.toString() + ' ';
            current = current.next; 
        }
        console.log('list data: ',toPrint);
    }

    this.merge = function(list) {
        var current = this.head;
        var next = current.next;
        while (next !== null) {
            current = next;
            next = next.next;
        }
        current.next = list.head;
        this.size += list.size;
        return this;
    };

    this.reverse = function() {
      if (this.head == null) 
        return;
      if (this.head.next == null) 
        return;

      var currentNode = this.head;
      var nextNode = this.head.next;
      var prevNode = this.head;
      this.head.next = null;
      while (nextNode != null) {
        currentNode = nextNode;
        nextNode = currentNode.next;
        currentNode.next = prevNode;
        prevNode = currentNode;
      }
      this.head = currentNode;
      return this;
    }


}


/**
 * GENERAL HELPER FUNCTIONS
 */

function FindSet(x) {
    for (var i = 0; i < sets.length; i++){
        if (sets[i].contains(x) != null) {
            return sets[i].contains(x);
        }
    }
    return null;
}

function MergeSet(x,y) {
    var listA,listB;
    for (var i = 0; i < sets.length; i++){
        if (sets[i].contains(x) != null) {
            listA = sets[i].contains(x);
            sets.splice(i,1);
        }
    }
    for (var i = 0; i < sets.length; i++) {
        if (sets[i].contains(y) != null) {
            listB = sets[i].contains(y);
            sets.splice(i,1);
        }
    }
    var res = MergeLists(listA,listB);
    return res;

}


function MergeLists(listA, listB) {
    var listC = new LinkedList();
    listA.merge(listB);
    listC = listA;
    return listC;
}

//access matrix by i,j -> returns 'Y' or 'N'
function isFriend(matrix, pair){
    return matrix[pair[0]].charAt(pair[1]);
}

function k_combinations(set, k) {
    var i, j, combs, head, tailcombs;
    if (k > set.length || k <= 0) {
        return [];
    }
    if (k == set.length) {
        return [set];
    }
    if (k == 1) {
        combs = [];
        for (i = 0; i < set.length; i++) {
            combs.push([set[i]]);
        }
        return combs;
    }
    // Assert {1 < k < set.length}
    combs = [];
    for (i = 0; i < set.length - k + 1; i++) {
        head = set.slice(i, i+1);
        tailcombs = k_combinations(set.slice(i + 1), k - 1);
        for (j = 0; j < tailcombs.length; j++) {
            combs.push(head.concat(tailcombs[j]));
        }
    }
    return combs;
}
查看更多
素衣白纱
5楼-- · 2018-12-31 09:16

If what you want is to find all elementary circuits in a graph you can use the EC algorithm, by JAMES C. TIERNAN, found on a paper since 1970.

The very original EC algorithm as I managed to implement it in php (hope there are no mistakes is shown below). It can find loops too if there are any. The circuits in this implementation (that tries to clone the original) are the non zero elements. Zero here stands for non-existence (null as we know it).

Apart from that below follows an other implementation that gives the algorithm more independece, this means the nodes can start from anywhere even from negative numbers, e.g -4,-3,-2,.. etc.

In both cases it is required that the nodes are sequential.

You might need to study the original paper, James C. Tiernan Elementary Circuit Algorithm

<?php
echo  "<pre><br><br>";

$G = array(
        1=>array(1,2,3),
        2=>array(1,2,3),
        3=>array(1,2,3)
);


define('N',key(array_slice($G, -1, 1, true)));
$P = array(1=>0,2=>0,3=>0,4=>0,5=>0);
$H = array(1=>$P, 2=>$P, 3=>$P, 4=>$P, 5=>$P );
$k = 1;
$P[$k] = key($G);
$Circ = array();


#[Path Extension]
EC2_Path_Extension:
foreach($G[$P[$k]] as $j => $child ){
    if( $child>$P[1] and in_array($child, $P)===false and in_array($child, $H[$P[$k]])===false ){
    $k++;
    $P[$k] = $child;
    goto EC2_Path_Extension;
}   }

#[EC3 Circuit Confirmation]
if( in_array($P[1], $G[$P[$k]])===true ){//if PATH[1] is not child of PATH[current] then don't have a cycle
    $Circ[] = $P;
}

#[EC4 Vertex Closure]
if($k===1){
    goto EC5_Advance_Initial_Vertex;
}
//afou den ksana theoreitai einai asfales na svisoume
for( $m=1; $m<=N; $m++){//H[P[k], m] <- O, m = 1, 2, . . . , N
    if( $H[$P[$k-1]][$m]===0 ){
        $H[$P[$k-1]][$m]=$P[$k];
        break(1);
    }
}
for( $m=1; $m<=N; $m++ ){//H[P[k], m] <- O, m = 1, 2, . . . , N
    $H[$P[$k]][$m]=0;
}
$P[$k]=0;
$k--;
goto EC2_Path_Extension;

#[EC5 Advance Initial Vertex]
EC5_Advance_Initial_Vertex:
if($P[1] === N){
    goto EC6_Terminate;
}
$P[1]++;
$k=1;
$H=array(
        1=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
        2=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
        3=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
        4=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
        5=>array(1=>0,2=>0,3=>0,4=>0,5=>0)
);
goto EC2_Path_Extension;

#[EC5 Advance Initial Vertex]
EC6_Terminate:
print_r($Circ);
?>

then this is the other implementation, more independent of the graph, without goto and without array values, instead it uses array keys, the path, the graph and circuits are stored as array keys (use array values if you like, just change the required lines). The example graph start from -4 to show its independence.

<?php

$G = array(
        -4=>array(-4=>true,-3=>true,-2=>true),
        -3=>array(-4=>true,-3=>true,-2=>true),
        -2=>array(-4=>true,-3=>true,-2=>true)
);


$C = array();


EC($G,$C);
echo "<pre>";
print_r($C);
function EC($G, &$C){

    $CNST_not_closed =  false;                          // this flag indicates no closure
    $CNST_closed        = true;                         // this flag indicates closure
    // define the state where there is no closures for some node
    $tmp_first_node  =  key($G);                        // first node = first key
    $tmp_last_node  =   $tmp_first_node-1+count($G);    // last node  = last  key
    $CNST_closure_reset = array();
    for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
        $CNST_closure_reset[$k] = $CNST_not_closed;
    }
    // define the state where there is no closure for all nodes
    for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
        $H[$k] = $CNST_closure_reset;   // Key in the closure arrays represent nodes
    }
    unset($tmp_first_node);
    unset($tmp_last_node);


    # Start algorithm
    foreach($G as $init_node => $children){#[Jump to initial node set]
        #[Initial Node Set]
        $P = array();                   // declare at starup, remove the old $init_node from path on loop
        $P[$init_node]=true;            // the first key in P is always the new initial node
        $k=$init_node;                  // update the current node
                                        // On loop H[old_init_node] is not cleared cause is never checked again
        do{#Path 1,3,7,4 jump here to extend father 7
            do{#Path from 1,3,8,5 became 2,4,8,5,6 jump here to extend child 6
                $new_expansion = false;
                foreach( $G[$k] as $child => $foo ){#Consider each child of 7 or 6
                    if( $child>$init_node and isset($P[$child])===false and $H[$k][$child]===$CNST_not_closed ){
                        $P[$child]=true;    // add this child to the path
                        $k = $child;        // update the current node
                        $new_expansion=true;// set the flag for expanding the child of k
                        break(1);           // we are done, one child at a time
            }   }   }while(($new_expansion===true));// Do while a new child has been added to the path

            # If the first node is child of the last we have a circuit
            if( isset($G[$k][$init_node])===true ){
                $C[] = $P;  // Leaving this out of closure will catch loops to
            }

            # Closure
            if($k>$init_node){                  //if k>init_node then alwaya count(P)>1, so proceed to closure
                $new_expansion=true;            // $new_expansion is never true, set true to expand father of k
                unset($P[$k]);                  // remove k from path
                end($P); $k_father = key($P);   // get father of k
                $H[$k_father][$k]=$CNST_closed; // mark k as closed
                $H[$k] = $CNST_closure_reset;   // reset k closure
                $k = $k_father;                 // update k
        }   } while($new_expansion===true);//if we don't wnter the if block m has the old k$k_father_old = $k;
        // Advance Initial Vertex Context
    }//foreach initial


}//function

?>

I have analized and documented the EC but unfortunately the documentation is in Greek.

查看更多
爱死公子算了
6楼-- · 2018-12-31 09:17

The DFS-based variants with back edges will find cycles indeed, but in many cases it will NOT be minimal cycles. In general DFS gives you the flag that there is a cycle but it is not good enough to actually find cycles. For example, imagine 5 different cycles sharing two edges. There is no simple way to identify cycles using just DFS (including backtracking variants).

Johnson's algorithm is indeed gives all unique simple cycles and has good time and space complexity.

But if you want to just find MINIMAL cycles (meaning that there may be more then one cycle going through any vertex and we are interested in finding minimal ones) AND your graph is not very large, you can try to use the simple method below. It is VERY simple but rather slow compared to Johnson's.

So, one of the absolutely easiest way to find MINIMAL cycles is to use Floyd's algorithm to find minimal paths between all the vertices using adjacency matrix. This algorithm is nowhere near as optimal as Johnson's, but it is so simple and its inner loop is so tight that for smaller graphs (<=50-100 nodes) it absolutely makes sense to use it. Time complexity is O(n^3), space complexity O(n^2) if you use parent tracking and O(1) if you don't. First of all let's find the answer to the question if there is a cycle. The algorithm is dead-simple. Below is snippet in Scala.

  val NO_EDGE = Integer.MAX_VALUE / 2

  def shortestPath(weights: Array[Array[Int]]) = {
    for (k <- weights.indices;
         i <- weights.indices;
         j <- weights.indices) {
      val throughK = weights(i)(k) + weights(k)(j)
      if (throughK < weights(i)(j)) {
        weights(i)(j) = throughK
      }
    }
  }

Originally this algorithm operates on weighted-edge graph to find all shortest paths between all pairs of nodes (hence the weights argument). For it to work correctly you need to provide 1 if there is a directed edge between the nodes or NO_EDGE otherwise. After algorithm executes, you can check the main diagonal, if there are values less then NO_EDGE than this node participates in a cycle of length equal to the value. Every other node of the same cycle will have the same value (on the main diagonal).

To reconstruct the cycle itself we need to use slightly modified version of algorithm with parent tracking.

  def shortestPath(weights: Array[Array[Int]], parents: Array[Array[Int]]) = {
    for (k <- weights.indices;
         i <- weights.indices;
         j <- weights.indices) {
      val throughK = weights(i)(k) + weights(k)(j)
      if (throughK < weights(i)(j)) {
        parents(i)(j) = k
        weights(i)(j) = throughK
      }
    }
  }

Parents matrix initially should contain source vertex index in an edge cell if there is an edge between the vertices and -1 otherwise. After function returns, for each edge you will have reference to the parent node in the shortest path tree. And then it's easy to recover actual cycles.

All in all we have the following program to find all minimal cycles

  val NO_EDGE = Integer.MAX_VALUE / 2;

  def shortestPathWithParentTracking(
         weights: Array[Array[Int]],
         parents: Array[Array[Int]]) = {
    for (k <- weights.indices;
         i <- weights.indices;
         j <- weights.indices) {
      val throughK = weights(i)(k) + weights(k)(j)
      if (throughK < weights(i)(j)) {
        parents(i)(j) = parents(i)(k)
        weights(i)(j) = throughK
      }
    }
  }

  def recoverCycles(
         cycleNodes: Seq[Int], 
         parents: Array[Array[Int]]): Set[Seq[Int]] = {
    val res = new mutable.HashSet[Seq[Int]]()
    for (node <- cycleNodes) {
      var cycle = new mutable.ArrayBuffer[Int]()
      cycle += node
      var other = parents(node)(node)
      do {
        cycle += other
        other = parents(other)(node)
      } while(other != node)
      res += cycle.sorted
    }
    res.toSet
  }

and a small main method just to test the result

  def main(args: Array[String]): Unit = {
    val n = 3
    val weights = Array(Array(NO_EDGE, 1, NO_EDGE), Array(NO_EDGE, NO_EDGE, 1), Array(1, NO_EDGE, NO_EDGE))
    val parents = Array(Array(-1, 1, -1), Array(-1, -1, 2), Array(0, -1, -1))
    shortestPathWithParentTracking(weights, parents)
    val cycleNodes = parents.indices.filter(i => parents(i)(i) < NO_EDGE)
    val cycles: Set[Seq[Int]] = recoverCycles(cycleNodes, parents)
    println("The following minimal cycle found:")
    cycles.foreach(c => println(c.mkString))
    println(s"Total: ${cycles.size} cycle found")
  }

and the output is

The following minimal cycle found:
012
Total: 1 cycle found
查看更多
余生无你
7楼-- · 2018-12-31 09:21

To clarify:

  1. Strongly Connected Components will find all subgraphs that have at least one cycle in them, not all possible cycles in the graph. e.g. if you take all strongly connected components and collapse/group/merge each one of them into one node (i.e. a node per component), you'll get a tree with no cycles (a DAG actually). Each component (which is basically a subgraph with at least one cycle in it) can contain many more possible cycles internally, so SCC will NOT find all possible cycles, it will find all possible groups that have at least one cycle, and if you group them, then the graph will not have cycles.

  2. to find all simple cycles in a graph, as others mentioned, Johnson's algorithm is a candidate.

查看更多
登录 后发表回答