I know there are a quite some answers existing on this question. However, I found none of them really bringing it to the point.
Some argue that a cycle is (almost) the same as a strongly connected components (s. Finding all cycles in a directed graph) , so one could use algorithms designed for that goal.
Some argue that finding a cycle can be done via DFS and checking for back-edges (s. boost graph documentation on file dependencies).
I now would like to have some suggestions on whether all cycles in a graph can be detected via DFS and checking for back-edges?
http://www.me.utexas.edu/~bard/IP/Handouts/cycles.pdf (found here on S.O.) states one methode based on cycle bases. Me personally, I don't find it very intuitive so I'm looking for a different solution.
EDIT: My initial opinion was apparently wrong. S. next answer by "Moron".
Initial opinion:
My opinion is that it indeed could work that way as DFS-VISIT (s. pseudocode of DFS) freshly enters each node that was not yet visited. In that sense, each vertex exhibits a potential start of a cycle. Additionally, as DFS visits each edge once, each edge leading to the starting point of a cycle is also covered. Thus, by using DFS and back-edge checking it should indeed be possible to detect all cycles in a graph. Note that, if cycles with different numbers of participant nodes exist (e.g. triangles, rectangles etc.), additional work has to be done to discriminate the acutal "shape" of each cycle.
问题:
回答1:
I have already answered this thoroughly, so check this:
Will a source-removal sort always return a maximal cycle?
The relevant part of the answer:
Perform a Depth-First Search on your graph.
You are interested in recognizing back edges, i.e., in the traversal, an edge which points back to an ancestor (in the DFS tree, which is induced by edges of visiting nodes for the first time) of the visited node. For example, if the DFS stack has nodes [A->B->C->D] and while you explore D you find an edge D->B, that's a back edge. Each back edge defines a cycle.
More importantly, the cycles induced by back-edges are a basic set of cycles of the graph. "A basic set of cycles": you can construct all cycles of the graph just by UNIONing and XORing cycles of the basic set. For example, consider the cycles [A1->A2->A3->A1] and [A2->B1->B2->B3->A2]. You can union them to the cycle: [A1->A2->B1->B2->B3->A2->A3->A1].
回答2:
Maybe this can help you somehow, I found this site where a colored dfs for directed graph is described. So you can consider correct the dfs translation to php I present here.
What I added is a part to create a forest and an other part to find all cycles. So please consider that it is not safe to take these two parts of my code as correct for sure.
One with knowledge on graph theory might be able to test for sure. There are no comments in the dfs part because it is described already in the reference site. I suggest that you take an example with more than one tree and draw the forest (need 4 colors) in a paper to better understand.
This is the code:
<?php
//define the graph
$g = Array(
1 => Array(1,2,3,4,5),
2 => Array(1,2,3,4,5),
3 => Array(1,2,3,4,5),
4 => Array(1,2,3,4,5),
5 => Array(1,2,3,4,5)
);
//add needed attributes on the graph
$G = Array();
foreach($g as $name => $children)
{
$child = Array();
foreach($children as $child_name)//attaching a v letter is not needed, just a preference
$child['v'.$child_name] = null;
$G['v'.$name] =
Array('child'=>$child,
'color'=>'WHITE',//is used in dfs to make visit
'discover_time'=>null,//is used in dfs to classify edges
'finish_time'=>null,//is used in dfs to classify edges
'father'=>null,//is used to walk backward to the start of a cycle
'back_edge'=>null//can be omited, this information can be found with in_array(info, child_array)
);
}
new test($G);
class test
{
private $G = Array();//graph
private $C = Array();//cycles
private $F = Array();//forest
private $L = Array();//loops
private $time_counter = 0;
public function __construct($G)
{
$this->G = $G;
foreach($this->G as $node_name => $foo)
{
if($this->G[$node_name]['color'] === 'WHITE')
$this->DFS_Visit($node_name);
}
$tree =array();
foreach($this->G as $node_name => $data)
{
if($data['father'] === null)//new tree found
{
$this->F[] = $tree;//at first an empty array is inserted
$tree = Array();
$tree[$node_name] = $data;
}
else
$tree[$node_name] = $data;
}
array_shift($this->F);//remove the empty array
$this->F[] = $tree;//add the last tree
$this->find_all_elementary_cycles();
file_put_contents('dump.log', count($this->L)." Loops found: \n", FILE_APPEND);
file_put_contents('dump.log', print_r($this->L,true)."\n", FILE_APPEND);
file_put_contents('dump.log', count($this->C)." Cycles found: \n", FILE_APPEND);
file_put_contents('dump.log', print_r($this->C,true)."\n", FILE_APPEND);
file_put_contents('dump.log', count($this->F)." trees found in the Forest: \n", FILE_APPEND);
file_put_contents('dump.log', print_r($this->F,true)."\n", FILE_APPEND);
}
public function find_all_elementary_cycles()
{
/*** For each tree of the forest ***/
foreach($this->F as $tree)
{
/*** Foreach node in the tree ***/
foreach($tree as $node_name => $node)
{
/*** If this tree node connects to some child with backedge
(we hope to avoid some loops with this if) ***/
if ( $node['back_edge'] === true )
/*** Then for every child ***/
foreach ( $node['child'] as $child_name => $edge_classification)
if($edge_classification === 'BACK_EDGE')
$this->back_edge_exploit($node_name, $child_name, $tree);
}
}
}
private function back_edge_exploit ($back_edge_sender, $back_edge_receiver, $tree)
{
/*** The input of this function is a back edge, a back edge is defined as follows
-a sender node: which stands lower in the tree and a reciever node which of course stands higher
***/
/*** We want to get rid of loops, so check for a loop ***/
if($back_edge_sender == $back_edge_receiver)
return $this->L[] = $back_edge_sender;//we need to return cause no there is no cycle in a loop
/*** For this backedge sender node the backedge reciever might send a direct edge to the sender ***/
if( isset($this->G[$back_edge_receiver]['child'][$back_edge_sender]) > 0 )
/*** This edge that the reciever sends could be a tree edge, this will happen
-in the case that: the backedge reciever is a father, but it can be a forward edge
-in this case: for the forward edge to exist the backedge reciever does not have to be
-a father onlly: it can also be an ancestore. Whatever of both cases, we have a cycle
***/
if( $this->G[$back_edge_receiver]['child'][$back_edge_sender] === 'TREE_EDGE' or
$this->G[$back_edge_receiver]['child'][$back_edge_sender] === 'FORWARD_EDGE')
$this->C[md5(serialize(Array($back_edge_receiver,$back_edge_sender)))]
= Array($back_edge_receiver,$back_edge_sender);
/*** Until now we have covered, loops, cycles of the kind father->child which occur from one tree edge
- and one: back edge combination, and also we have covered cycles of the kind ancestore->descendant
-which: occur from the combination of one forward edge and one backedge (of course might happen that
-a father: can send to the child both forward and tree edge, all these are covered already).
-what are left: are the cycles of the combination of more than one tree edges and one backedge ***/
$cycle = Array();
attach_node://loops must be handled before this, otherwise goto will loop continously
$cycle[] = $back_edge_sender; //enter the backedge sender
$back_edge_sender = $tree[$back_edge_sender]['father']; //backedge sender becomes his father
if($back_edge_sender !== $back_edge_receiver) //if backedge sender has not become backedge reciever yet
goto attach_node;//the loop again
$cycle[] = $back_edge_receiver;
$cycle = array_reverse($cycle);
$this->C[md5(serialize($cycle))] = $cycle;
}
private function DFS_Visit($node_name)
{
$this->G[$node_name]['color'] = 'GRAY';
$this->G[$node_name]['discover_time'] = $this->time_counter++;
foreach($this->G[$node_name]['child'] as $child_name => $foo)
{
if($this->G[$child_name]['color'] === 'BLACK') {#child black
if( $this->G[$node_name]['discover_time'] <
$this->G[$child_name]['discover_time'] ){#time of father smaller
$this->G[$node_name]['child'][$child_name] = 'FORWARD_EDGE';
}
else{#time of child smaller
$this->G[$node_name]['child'][$child_name] = 'CROSS_EDGE';
}
}
elseif($this->G[$child_name]['color'] === 'WHITE'){#child white
$this->G[$node_name]['child'][$child_name] = 'TREE_EDGE';
$this->G[$child_name]['father'] = $node_name;#father in the tree
$this->DFS_Visit($child_name);
}#child discovered but not explored (and father discovered)
elseif($this->G[$child_name]['color'] === 'GRAY'){#child gray
$this->G[$node_name]['child'][$child_name] = 'BACK_EDGE';
$this->G[$node_name]['back_edge'] = true;
}
}//for
$this->G[$node_name]['color'] = 'BLACK';//fully explored
$this->G[$node_name]['finish_time'] = $this->time_counter++;
}
}
?>
And this is the output:
5 Loops found: Array (
[0] => v1
[1] => v2
[2] => v3
[3] => v4
[4] => v5 )
16 Cycles found: Array (
[336adbca89b3389a6f9640047d07f24a] => Array
(
[0] => v1
[1] => v2
)
[d68df8cdbc98d846a591937e9dd9cd70] => Array
(
[0] => v1
[1] => v3
)
[cad6b68c862d3a00a35670db31b76b67] => Array
(
[0] => v1
[1] => v2
[2] => v3
)
[1478f02ce1faa31e122a61a88af498e4] => Array
(
[0] => v2
[1] => v3
)
[0fba8cccc8dceda9fe84c3c93c20d057] => Array
(
[0] => v1
[1] => v4
)
[c995c93b92f8fe8003ea77617760a0c9] => Array
(
[0] => v1
[1] => v2
[2] => v3
[3] => v4
)
[8eae017bc12f0990ab42196af0a1f6a8] => Array
(
[0] => v2
[1] => v4
)
[57c0cc445b506ba6d51dc3c2f06fd926] => Array
(
[0] => v2
[1] => v3
[2] => v4
)
[18cef1bbe850dca5d2d7b6bfea795a23] => Array
(
[0] => v3
[1] => v4
)
[e0bd0c51bfa4df20e4ad922f57f6fe0d] => Array
(
[0] => v1
[1] => v5
)
[6a8b7681b160e28dd86f3f8316bfa16e] => Array
(
[0] => v1
[1] => v2
[2] => v3
[3] => v4
[4] => v5
)
[85e95d3e4dc97e066ec89752946ccf0c] => Array
(
[0] => v2
[1] => v5
)
[633c7cf8df43df75a24c104d9de09ece] => Array
(
[0] => v2
[1] => v3
[2] => v4
[3] => v5
)
[769f8ebc0695f46b5cc3cd444be2938a] => Array
(
[0] => v3
[1] => v5
)
[87028339e63fd6c2687dc5488ba0818c] => Array
(
[0] => v3
[1] => v4
[2] => v5
)
[c2b28cdcef48362ceb0d8fb36a142254] => Array
(
[0] => v4
[1] => v5
)
)
1 trees found in the Forest: Array (
[0] => Array
(
[v1] => Array
(
[child] => Array
(
[v1] => BACK_EDGE
[v2] => TREE_EDGE
[v3] => FORWARD_EDGE
[v4] => FORWARD_EDGE
[v5] => FORWARD_EDGE
)
[color] => BLACK
[discover_time] => 0
[finish_time] => 9
[father] =>
[back_edge] => 1
)
[v2] => Array
(
[child] => Array
(
[v1] => BACK_EDGE
[v2] => BACK_EDGE
[v3] => TREE_EDGE
[v4] => FORWARD_EDGE
[v5] => FORWARD_EDGE
)
[color] => BLACK
[discover_time] => 1
[finish_time] => 8
[father] => v1
[back_edge] => 1
)
[v3] => Array
(
[child] => Array
(
[v1] => BACK_EDGE
[v2] => BACK_EDGE
[v3] => BACK_EDGE
[v4] => TREE_EDGE
[v5] => FORWARD_EDGE
)
[color] => BLACK
[discover_time] => 2
[finish_time] => 7
[father] => v2
[back_edge] => 1
)
[v4] => Array
(
[child] => Array
(
[v1] => BACK_EDGE
[v2] => BACK_EDGE
[v3] => BACK_EDGE
[v4] => BACK_EDGE
[v5] => TREE_EDGE
)
[color] => BLACK
[discover_time] => 3
[finish_time] => 6
[father] => v3
[back_edge] => 1
)
[v5] => Array
(
[child] => Array
(
[v1] => BACK_EDGE
[v2] => BACK_EDGE
[v3] => BACK_EDGE
[v4] => BACK_EDGE
[v5] => BACK_EDGE
)
[color] => BLACK
[discover_time] => 4
[finish_time] => 5
[father] => v4
[back_edge] => 1
)
)
)
EDIT 1 :
The back_edge_exploit
method could be erplaced by this version:
![private function back_edge_exploit ($back_edge_sender, $back_edge_receiver, $tree)
{
/*** The input of this function is a back edge, a back edge is defined as follows
-a sender node: which stands lower in the tree and a reciever node which of course stands higher
***/
/*** We want to get rid of loops, so check for a loop ***/
if($back_edge_sender == $back_edge_receiver)
return $this->L\[\] = $back_edge_sender;//we need to return cause no there is no cycle in a loop
$cycle = Array();//There is always a cycle which is a combination of tree edges and on backedge
$edges_count = 0; //If the cycle has more than 2 nodes we need to check for forward edge
$backward_runner = $back_edge_sender;//this walks backward up to the start of cycle
attach_node://loops must be handled before this, otherwise goto will loop continously
$edges_count++;
$cycle\[\] = $backward_runner; //enter the backedge sender
$backward_runner = $tree\[$backward_runner\]\['father'\]; //backedge sender becomes his father
if($backward_runner !== $back_edge_receiver) //if backedge sender has not become backedge reciever yet
goto attach_node;//the loop again
else
$cycle\[\] = $backward_runner;
if($edges_count>1 and $this->G\[$back_edge_receiver\]\['child'\]\[$back_edge_sender\] === 'FORWARD_EDGE' )
$this->C\[\] = Array($back_edge_receiver,$back_edge_sender);
$this->C\[\] = array_reverse($cycle); //store the tree edge->back_edge cycle
}][2]
EDIT 2:
I have to say that I have found that back_edge_exploit
is insufficient it will only find cycles that are made with tree edges and one back edge.
**** Edit 3: **** Although this solutions is found to be incomplete, it has some useful information, even the insufficiency it self is a piece of information, so I think it might be useful to keep it. But the main reason I have edited my answer is that I have found an other solution which I am going to present below.
But before that an other notice could be made about the dfs aproach, there are cycles that could occur by walking any valid combination all cross,forward,tree,back edge. So finding the actual cycles relying on the dfs information is not trivial (requires extra code), consider this example:
As long as the new solution is concerned it is described in an old paper of 1970, by James C. Tiernan (Check this link) as an efficient algorithm for finding all elementary cycles in a directed graph. Also there is a modern implementation of this without goto See it here
My implementation of Elementary Cycle Algorithm (that's the name) is in php and is as close to the original as possible . I have already checked it and it works. It is about directed graphs, if you declare you graph so that there is a directed cycle v1->v2->v3 and an other one v2->v3->v1 then both cycles will be found which is logical since it works on directed graphs.
As for undirected graphs the author suggests other algorithms which make a better solution than modifying the algorithms, however it can be done by modifying the graph definition and by adding an extra check for cycles of length 2 which are considered as undirected edges. In particular an undirected cycle of three nodes would be defined twice in the definition, once per orientation, like this: v1->v2->v3 and v3->v2->v1. Then the algorithm will find it twice, once per orientation, then a modification of a single line on EC3_Circuit_Confirmation
can cut the extra one.
The nodes are defined sequentially, one can change the constant first
and the adjacency list to make the first node count from zero or one.
This is the php code for the EC Algorithm of Tiernan :
<?php
define(first,1); //Define how to start counting, from 0 or 1
//nodes are considered to be sequential
$G[first] = Array(2); $G[] = Array(1,3); $G[] = Array(4); $G[] = Array(1);
$N=key(array_slice($G, -1, 1, TRUE));//last key of g
$H=Array(Array());
$P = Array();
$P[first] = first;
$k = first;
$C = Array();//cycles
$L = Array();//loops
########################## ALGORITHM START #############################
#[Path Extension]
EC2_Path_Extension:
//scan adjacency list
foreach($G[$P[$k]] as $j => $adj_node)
//if there is an adjacent node bigger than P[1] and this nodes does not belong in P
if( ($adj_node > $P[first]) and (in_array($adj_node, $P))===false and
(count($H[$P[$k]])>0 and in_array($adj_node, $H[$P[$k]]))===false)
{
$k++;
$P[$k] = $G[$P[$k-1]][$j];
goto EC2_Path_Extension;
}
#[EC3 Circuit Confirmation]
EC3_Circuit_Confirmation:
if(!in_array($P[first], $G[$P[$k]]))
goto EC4_Vertex_Closure;
//otherwise
if (count($P)===1)
$L[] = current($P);
else
$C[] = implode($P);
#[EC4 Vertex Closure]
EC4_Vertex_Closure:
if($k===first)
goto EC5_Advance_Initial_Vertex;
$H[$P[$k]] = Array();
$H[$P[$k-1]][] = $P[$k];
unset($P[$k]);
$k--;
goto EC2_Path_Extension;
#[EC5 Advance Initial Vertex]
EC5_Advance_Initial_Vertex:
if($P[first] === $N)
goto EC6_Terminate;
$P[first]++;
$k=first;
//Reset H
$H=Array(Array());
goto EC2_Path_Extension;
EC6_Terminate:
########################### ALGORITHM END ##################################
echo "\n\n".count($L)."$count loops found: ".implode(", ",$L)."\n\n";
echo count($C)." cycles found!\n".implode("\n",$C)."\n";
/*** Original graph found in the paper ***/
//$G[first] = Array(2); $G[] = Array(2,3,4);
//$G[] = Array(5); $G[] = Array(3); $G[] = Array(1);
?>
回答3:
My suggestion is to use Tarjan's algorithm to find set of strongly connected components, and Hierholzer's algorithm to find all cycles in the strongly connected component.
Here is the link for the Java implementation with a test case: http://stones333.blogspot.com/2013/12/find-cycles-in-directed-graph-dag.html
回答4:
If a traversal algorithm visits each edge only once, then it cannot find all cycles. An edge could be part of multiple cycles.
btw, what is a back-edge?
Also, perhaps you should rephrase/format your question. It is very hard to read.