Stable topological sort

2019-02-04 05:50发布

问题:

Let say I have a graph where the nodes is stored in a sorted list. I now want to topological sort this graph while keeping the original order where the topological order is undefined. Are there any good algorithms for this?

回答1:

One possibility is to compute the lexicographically least topological order. The algorithm is to maintain a priority queue containing the nodes whose effective in-degree (over nodes not yet processed) is zero. Repeatedly dequeue the node with the least label, append it to the order, decrement the effective in-degrees of its successors, enqueue the ones that now have in-degree zero. This produces 1234567890 on btilly's example but does not in general minimize inversions.

The properties I like about this algorithm are that the output has a clean definition obviously satisfied by only one order and that, whenever there's an inversion (node x appears after node y even though x < y), x's largest dependency is larger than y's largest dependency, which is an "excuse" of sorts for inverting x and y. A corollary is that, in the absence of constraints, the lex least order is sorted order.



回答2:

The problem is two-fold:

  • Topological sort
  • Stable sort

After many errors and trials I came up with a simple algorithm that resembles bubble sort but with topological order criteria.

I thoroughly tested the algorithm on full graphs with complete edge combinations so it can be considered as proven.

Cyclic dependencies are tolerated and resolved according to original order of elements in sequence. The resulting order is perfect and represents the closest possible match.

Here is the source code in C#:

static class TopologicalSort
{
    /// <summary>
    /// Delegate definition for dependency function.
    /// </summary>
    /// <typeparam name="T">The type.</typeparam>
    /// <param name="a">The A.</param>
    /// <param name="b">The B.</param>
    /// <returns>
    /// Returns <c>true</c> when A depends on B. Otherwise, <c>false</c>.
    /// </returns>
    public delegate bool TopologicalDependencyFunction<in T>(T a, T b);

    /// <summary>
    /// Sorts the elements of a sequence in dependency order according to comparison function with Gapotchenko algorithm.
    /// The sort is stable. Cyclic dependencies are tolerated and resolved according to original order of elements in sequence.
    /// </summary>
    /// <typeparam name="T">The type of the elements of source.</typeparam>
    /// <param name="source">A sequence of values to order.</param>
    /// <param name="dependencyFunction">The dependency function.</param>
    /// <param name="equalityComparer">The equality comparer.</param>
    /// <returns>The ordered sequence.</returns>
    public static IEnumerable<T> StableOrder<T>(
        IEnumerable<T> source,
        TopologicalDependencyFunction<T> dependencyFunction,
        IEqualityComparer<T> equalityComparer)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (dependencyFunction == null)
            throw new ArgumentNullException("dependencyFunction");
        if (equalityComparer == null)
            throw new ArgumentNullException("equalityComparer");

        var graph = DependencyGraph<T>.TryCreate(source, dependencyFunction, equalityComparer);
        if (graph == null)
            return source;

        var list = source.ToList();
        int n = list.Count;

    Restart:
        for (int i = 0; i < n; ++i)
        {
            for (int j = 0; j < i; ++j)
            {
                if (graph.DoesXHaveDirectDependencyOnY(list[j], list[i]))
                {
                    bool jOnI = graph.DoesXHaveTransientDependencyOnY(list[j], list[i]);
                    bool iOnJ = graph.DoesXHaveTransientDependencyOnY(list[i], list[j]);

                    bool circularDependency = jOnI && iOnJ;

                    if (!circularDependency)
                    {
                        var t = list[i];
                        list.RemoveAt(i);

                        list.Insert(j, t);
                        goto Restart;
                    }
                }
            }
        }

        return list;
    }

    /// <summary>
    /// Sorts the elements of a sequence in dependency order according to comparison function with Gapotchenko algorithm.
    /// The sort is stable. Cyclic dependencies are tolerated and resolved according to original order of elements in sequence.
    /// </summary>
    /// <typeparam name="T">The type of the elements of source.</typeparam>
    /// <param name="source">A sequence of values to order.</param>
    /// <param name="dependencyFunction">The dependency function.</param>
    /// <returns>The ordered sequence.</returns>
    public static IEnumerable<T> StableOrder<T>(
        IEnumerable<T> source,
        TopologicalDependencyFunction<T> dependencyFunction)
    {
        return StableOrder(source, dependencyFunction, EqualityComparer<T>.Default);
    }

    sealed class DependencyGraph<T>
    {
        private DependencyGraph()
        {
        }

        public IEqualityComparer<T> EqualityComparer
        {
            get;
            private set;
        }

        public sealed class Node
        {
            public int Position
            {
                get;
                set;
            }

            List<T> _Children = new List<T>();

            public IList<T> Children
            {
                get
                {
                    return _Children;
                }
            }
        }

        public IDictionary<T, Node> Nodes
        {
            get;
            private set;
        }

        public static DependencyGraph<T> TryCreate(
            IEnumerable<T> source,
            TopologicalDependencyFunction<T> dependencyFunction,
            IEqualityComparer<T> equalityComparer)
        {
            var list = source as IList<T>;
            if (list == null)
                list = source.ToArray();

            int n = list.Count;
            if (n < 2)
                return null;

            var graph = new DependencyGraph<T>();
            graph.EqualityComparer = equalityComparer;
            graph.Nodes = new Dictionary<T, Node>(n, equalityComparer);

            bool hasDependencies = false;

            for (int position = 0; position < n; ++position)
            {
                var element = list[position];

                Node node;
                if (!graph.Nodes.TryGetValue(element, out node))
                {
                    node = new Node();
                    node.Position = position;
                    graph.Nodes.Add(element, node);
                }

                foreach (var anotherElement in list)
                {
                    if (equalityComparer.Equals(element, anotherElement))
                        continue;

                    if (dependencyFunction(element, anotherElement))
                    {
                        node.Children.Add(anotherElement);
                        hasDependencies = true;
                    }
                }
            }

            if (!hasDependencies)
                return null;

            return graph;
        }

        public bool DoesXHaveDirectDependencyOnY(T x, T y)
        {
            Node node;
            if (Nodes.TryGetValue(x, out node))
            {
                if (node.Children.Contains(y, EqualityComparer))
                    return true;
            }
            return false;
        }

        sealed class DependencyTraverser
        {
            public DependencyTraverser(DependencyGraph<T> graph)
            {
                _Graph = graph;
                _VisitedNodes = new HashSet<T>(graph.EqualityComparer);
            }

            DependencyGraph<T> _Graph;
            HashSet<T> _VisitedNodes;

            public bool DoesXHaveTransientDependencyOnY(T x, T y)
            {
                if (!_VisitedNodes.Add(x))
                    return false;

                Node node;
                if (_Graph.Nodes.TryGetValue(x, out node))
                {
                    if (node.Children.Contains(y, _Graph.EqualityComparer))
                        return true;

                    foreach (var i in node.Children)
                    {
                        if (DoesXHaveTransientDependencyOnY(i, y))
                            return true;
                    }
                }

                return false;
            }
        }

        public bool DoesXHaveTransientDependencyOnY(T x, T y)
        {
            var traverser = new DependencyTraverser(this);
            return traverser.DoesXHaveTransientDependencyOnY(x, y);
        }
    }
}

And a small sample application:

class Program
{
    static bool DependencyFunction(char a, char b)
    {
        switch (a + " depends on " + b)
        {
            case "A depends on B":
                return true;

            case "B depends on D":
                return true;

            default:
                return false;
        }

    }

    static void Main(string[] args)
    {
        var source = "ABCDEF";
        var result = TopologicalSort.StableOrder(source.ToCharArray(), DependencyFunction);
        Console.WriteLine(string.Concat(result));
    }
}

Given the input elements {A, B, C, D, E, F} where A depends on B and B depends on D the output is {D, B, A, C, E, F}.

UPDATE: I wrote a small article about stable topological sort objective, algorithm and its proofing. Hope this gives more explanations and is useful to developers and researchers.



回答3:

You have insufficient criteria to specify what you're looking for. For instance consider a graph with two directed components.

1 -> 2 -> 3 -> 4 -> 5
6 -> 7 -> 8 -> 9 -> 0

Which of the following sorts would you prefer?

6, 7, 8, 9, 0, 1, 2, 3, 4, 5
1, 2, 3, 4, 5, 6, 7, 8, 9, 0

The first results from breaking all ties by putting the lowest node as close to the head of the list as possible. Thus 0 wins. The second results from trying to minimize the number of times that A < B and B appears before A in the topological sort. Both are reasonable answers. The second is probably more pleasing.

I can easily produce an algorithm for the first. To start, take the lowest node, and do a breadth-first search to locate the distance to the shortest root node. Should there be a tie, identify the set of nodes that could appear on such a shortest path. Take the lowest node in that set, and place the best possible path from it to a root, and then place the best possible path from the lowest node we started with to it. Search for the next lowest node that is not already in the topological sort, and continue.

Producing an algorithm for the more pleasing version seems much harder. See http://en.wikipedia.org/wiki/Feedback_arc_set for a related problem that strongly suggests that it is, in fact, NP-complete.



回答4:

Here's an easy iterative approach to topological sorting: continually remove a node with in-degree 0, along with its edges.

To achieve a stable version, just modify to: continually remove the smallest-index node with in-degree 0, along with its edges.

In pseudo-python:

# N is the number of nodes, labeled 0..N-1
# edges[i] is a list of nodes j, corresponding to edges (i, j)

inDegree = [0] * N
for i in range(N):
   for j in edges[i]:
      inDegree[j] += 1

# Now we maintain a "frontier" of in-degree 0 nodes.
# We take the smallest one until the frontier is exhausted.
# Note: You could use a priority queue / heap instead of a list,
#       giving O(NlogN) runtime. This naive implementation is
#       O(N^2) worst-case (when the order is very ambiguous).

frontier = []
for i in range(N):
    if inDegree[i] == 0:
        frontier.append(i)

order = []
while frontier:
    i = min(frontier)
    frontier.remove(i)
    for j in edges[i]:
       inDegree[j] -= 1
       if inDegree[j] == 0:
           frontier.append(j)

 # Done - order is now a list of the nodes in topological order,
 # with ties broken by original order in the list.


回答5:

Interpreting "stable topological sort" as a linearization of a DAG such that ranges in the linearization where the topological order doesn't matter, are sorted lexicographically. This can be solved with the DFS method of linearization, with the modification that nodes are visited in lexicographical order.

I have a Python Digraph class with a linearization method which looks like this:

def linearize_as_needed(self):
    if self.islinearized:
        return

    # Algorithm: DFS Topological sort
    # https://en.wikipedia.org/wiki/Topological_sorting#Depth-first_search

    temporary = set()
    permanent = set()

    L = [ ]

    def visit(vertices):
        for vertex in sorted(vertices, reverse=True):
            if vertex in permanent:
                pass
            elif vertex in temporary:
                raise NotADAG
            else:
                temporary.add(vertex)

                if vertex in self.arrows:
                    visit(self.arrows[vertex])

                L.append(vertex)

                temporary.remove(vertex)
                permanent.add(vertex)

        # print('visit: {} => {}'.format(vertices, L))

    visit(self.vertices)
    self._linear = list(reversed(L))
    self._iter = iter(self._linear)
    self.islinearized = True

Here

self.vertices

is the set of all vertices, and

self.arrows

holds the adjacency relation as a dict of left nodes to sets of right nodes.



回答6:

The depth-first search algorithm on Wikipedia worked for me:

const assert = chai.assert;

const stableTopologicalSort = ({
  edges,
  nodes
}) => {
  // https://en.wikipedia.org/wiki/Topological_sorting#Depth-first_search
  const result = [];
  const marks = new Map();

  const visit = node => {
    if (marks.get(node) !== `permanent`) {
      assert.notEqual(marks.get(node), `temporary`, `not a DAG`);
      marks.set(node, `temporary`);
      edges.filter(([, to]) => to === node).forEach(([from]) => visit(from));
      marks.set(node, `permanent`);
      result.push(node);
    }
  };

  nodes.forEach(visit);
  return result;
};

const graph = {
  edges: [
    [5, 11],
    [7, 11],
    [3, 8],
    [11, 2],
    [11, 9],
    [11, 10],
    [8, 9],
    [3, 10]
  ],
  nodes: [2, 3, 5, 7, 8, 9, 10, 11]
};

assert.deepEqual(stableTopologicalSort(graph), [5, 7, 11, 2, 3, 8, 9, 10]);
<script src="https://cdnjs.cloudflare.com/ajax/libs/chai/4.2.0/chai.min.js"></script>