Implementing Depth First Search into C# using List

2019-01-16 12:08发布

问题:

I want to create a depth first search which I have been somewhat successful in.

Here is my code so far (Except my constructor, note the Vertex and Edge classes only contain properties, nothing important to post here):

private Stack<Vertex> workerStack = new Stack<Vertex>();
private List<Vertex> vertices = new List<Vertex>();
private List<Edge> edges = new List<Edge>();

private int numberOfVertices;
private int numberOfClosedVertices;
private int visitNumber = 1;

private void StartSearch()
{
    // Make sure to visit all vertices
    while (numberOfClosedVertices < numberOfVertices && workerStack.Count > 0)
    {
        // Get top element in stack and mark it as visited
        Vertex workingVertex = workerStack.Pop();
        workingVertex.State = State.Visited;

        workingVertex.VisitNumber = visitNumber;
        visitNumber++;

        numberOfClosedVertices++;

        // Get all edges connected to the working vertex
        foreach (Vertex vertex in GetConnectedVertices(workingVertex))
        {
            vertex.Parent = workingVertex;
            workerStack.Push(vertex);
        }
    }
}

private List<Vertex> GetConnectedVertices(Vertex vertex)
{
    List<Vertex> vertices = new List<Vertex>();

    // Get all vertices connected to vertex and is unvisited, then add them to the vertices list
    edges.FindAll(edge => edge.VertexSource == vertex && edge.VertexTarget.State == State.Unvisited).ForEach(edge => vertices.Add(edge.VertexTarget));

    return vertices;
}

Its working in the way that all vertices get visited, but not in the right order.

Here is a comparison of how mine gets visited compared to wikipedia:

Its seems mine is turned around and is beginning from right to left.

Do you know what causes it? (Also any advice on my implementation would be greatly appreciated)

Thanks

EDIT: I got my answer, but still wanted to show the end result for the GetConnectedVertices method:

private List<Vertex> GetConnectedVertices(Vertex vertex)
{
    List<Vertex> connectingVertices = new List<Vertex>();

    (from edge in edges
     where edge.VertexSource == vertex && edge.VertexTarget.State == State.Unvisited
     select edge).
     Reverse().
     ToList().
     ForEach(edge => connectingVertices.Add(edge.VertexTarget));

    return connectingVertices;
}

回答1:

It seems mine is turned around and is beginning from right to left. Do you know what causes it?

As others have noted, you are pushing the nodes-to-visit-next on the stack in order from left to right. That means they get popped off right-to-left, since a stack reverses the order. Stacks are last-in-first-out.

You can fix the problem by making GetConnectedVertices build a stack, not a list. That way the connected vertices are reversed twice, once when they go on the returned stack and once when they go on the real stack.

Also any advice on my implementation would be greatly appreciated

The implementation works, I suppose, but it has a great many fundamental problems. If I were presented that code for review, here's what I'd say:

First off, suppose you wanted to do two depth-first searches of this data structure at the same time. Either because you were doing it on multiple threads, or because you have a nested loop in which the inner loop does a DFS for a different element than the outer loop. What happens? They interfere with each other because both try to mutate the "State" and "VisitNumber" fields. It is a really bad idea to have what should be a "clean" operation like searching actually make your data structure "dirty".

Doing so also makes it impossible for you to use persistent immutable data to represent redundant portions of your graph.

Also, I notice that you omit the code that cleans up. When is "State" ever set back to its original value? What if you did a second DFS? It would immediately fail since the root is already visited.

A better choice for all these reasons is to keep the "visited" state in its own object, not in each vertex.

Next, why are all the state objects private variables of a class? This is a simple algorithm; there's no need to build an entire class for it. A depth first search algorithm should take the graph to search as a formal parameter, not as object state, and it should maintain its own local state as necessary in local variables, not fields.

Next, the abstraction of the graph is... well, its not an abstraction. It's two lists, one of vertices and one of edges. How do we know that those two lists are even consistent? Suppose there are vertices that are not in the vertex list but are on the edge list. How do you prevent that? What you want is a graph abstraction. Let the graph abstraction implementation worry about how to represent edges and find neighbours.

Next, your use of ForEach is both legal and common, but it makes my head hurt. It is hard to read your code and reason about it with all the lambdas. We have a perfectly good "foreach" statement. Use it.

Next, you are mutating a "parent" property but it is not at all clear what this property is for or why it is being mutated during a traversal. Vertices in an arbitrary graph do not have "parents" unless the graph is a tree, and if the graph is a tree then there is no need to keep track of the "visited" state; there are no loops in a tree. What is going on here? This code is just bizarre, and it is not necessary to perform a DFS.

Next, your helper method named GetConnectedVertices is a lie. It does not get connected vertices, it gets connected not-already-visited vertices. Methods whose names lie are very confusing.

Finally, this claims to be a depth first search but it doesn't search for anything! Where is the thing being searched for? Where is the result returned? This isn't a search at all, it's a traversal.

Start over. What do you want? A depth-first traversal of a graph given a starting vertex. Then implement that. Start by defining what you are traversing. A graph. What service do you need from a graph? A way of getting the set of neighbouring vertices:

interface IGraph
{
    IEnumerable<Vertex> GetNeighbours(Vertex v);
}

What is your method returning? A sequence of Vertices in depth-first order. What does it take? A starting vertex. OK:

static class Extensions
{
    public static IEnumerable<Vertex> DepthFirstTraversal(
        this IGraph graph, 
        Vertex start) 
    { ... }
}

We now have a trivial implementation of depth first search; you can now use the Where clause:

IGraph myGraph = whatever;
Vertex start = whatever;
Vertex result = myGraph.DepthFirstTraversal(start)
                       .Where(v=>something)
                       .FirstOrDefault();

OK, so how are we going to implement that method so it does a traversal without wrecking the graph's state? Maintain your own external state:

public static IEnumerable<Vertex> DepthFirstTraversal(
    this IGraph graph, 
    Vertex start) 
{
    var visited = new HashSet<Vertex>();
    var stack = new Stack<Vertex>();

    stack.Push(start);

    while(stack.Count != 0)
    {
        var current = stack.Pop();

        if(!visited.Add(current))
            continue;

        yield return current;

        var neighbours = graph.GetNeighbours(current)
                              .Where(n=>!visited.Contains(n));

        // If you don't care about the left-to-right order, remove the Reverse
        foreach(var neighbour in neighbours.Reverse()) 
            stack.Push(neighbour);
    }
}

See how much cleaner and shorter that is? No mutation of state. No mucking around with edge lists. No badly-named helper functions. And the code actually does what it says it does: traverses a graph.

We also get the benefits of iterator blocks; namely, if someone is using this for a DF search, then the iteration is abandoned when the search criteria are met. We don't have to do a full traversal if we find the result early.



回答2:

I generalized @Eric's code for DFS traversal for any T to make things work for any type that has children - I thought I'd share:

public static IEnumerable<T> DepthFirstTraversal<T>(
    T start,
    Func<T, IEnumerable<T>> getNeighbours)
{
    var visited = new HashSet<T>();
    var stack = new Stack<T>();
    stack.Push(start);

    while (stack.Count != 0)
    {
        var current = stack.Pop();
        visited.Add(current);
        yield return current;

        var neighbours = getNeighbours(current).Where(node => !visited.Contains(node));

        // If you don't care about the left-to-right order, remove the Reverse
        foreach(var neighbour in neighbours.Reverse())
        {
            stack.Push(neighbour);
        }
    }
}

Example usage:

var nodes = DepthFirstTraversal(myNode, n => n.Neighbours);


回答3:

The problem is in the order you search the elements. Your for each in StartSearch does not guarantee element order. Neither does you FindAll in GetConnectedVertices method. Let's look at this line:

edges.FindAll(edge => edge.VertexSource == vertex && edge.VertexTarget.State == State.Unvisited).ForEach(edge => vertices.Add(edge.VertexTarget));

You should add an OrderBy() to ensure the desired order.



回答4:

Items will be popped of the stack in the reverse order as how they are pushed on it:

stach.push() results in: 1 2 3 4 5

stack.pop() results in: 5 4 3 2 1 (so: from right to left)



回答5:

You might enjoy this:

        public static bool DepthFirstSearch<T>(this IEnumerable<T> vertices, T rootVertex, T targetVertex, Func<T, IEnumerable<T>> getConnectedVertices, Func<T, T, bool> matchFunction = null)
    {
        if (getConnectedVertices == null)
        {
            throw new ArgumentNullException("getConnectedVertices");
        }
        if (matchFunction == null)
        {
            matchFunction = (t, u) => object.Equals(t, u);
        }
        var directlyConnectedVertices = getConnectedVertices(rootVertex);
        foreach (var vertex in directlyConnectedVertices)
        {
            if (matchFunction(vertex, targetVertex))
            {
                return true;
            }
            else if (vertices.DepthFirstSearch(vertex, targetVertex, getConnectedVertices, matchFunction))
            {
                return true;
            }
        }
        return false;
    }


回答6:

This is my implemenation, one stack is good enough. A reverse is done before the foreach loop.

    /// <summary>
    /// Depth first search implementation in c#
    /// </summary>
    /// <typeparam name="T">Type of tree structure item</typeparam>
    /// <typeparam name="TChilds">Type of childs collection</typeparam>
    /// <param name="node">Starting node to search</param>
    /// <param name="ChildsProperty">Property to return child node</param>
    /// <param name="Match">Predicate for matching</param>
    /// <returns>The instance of matched result, null if not found</returns>
    public static T DepthFirstSearch<T, TChilds>(this T node, Func<T, TChilds> ChildsProperty, Predicate<T> Match) 
        where T:class
    {
        if (!(ChildsProperty(node) is IEnumerable<T>))
            throw new ArgumentException("ChildsProperty must be IEnumerable<T>");

        Stack<T> stack = new Stack<T>();
        stack.Push(node);
        while (stack.Count > 0) {
            T thisNode = stack.Pop();
            #if DEBUG
            System.Diagnostics.Debug.WriteLine(thisNode.ToString());
            #endif
            if (Match(thisNode))
                return thisNode;
            if (ChildsProperty(thisNode) != null) {
                foreach (T child in (ChildsProperty(thisNode) as IEnumerable<T>).Reverse()) 
                    stack.Push(child);
            }
        }
        return null;
    }