Efficient algorithm for merging two DAGs

2019-03-15 10:42发布

问题:

I have two weighted DAGs (directed acyclic graphs) and need to merge them into one, so I can get a topological ordering (it could be more than two in some cases). The problem is that the graphs are acyclic each, but can form a cycle together. Also, the graphs are large (100k+ nodes, 500k+ edges). Is there a clever way to merge the graphs? Equally good would be an algorithm to traverse all graphs "at once".

Edit:

By "merge" I mean combining all edges and vertices of both graphs together (preserving weights of course), if they do not create cycles. If an edge already exists I want to use the greater weight for it.

The idea is that starting with two acyclic graphs should give an advantage over simply "fixing" the result afterwards (this would imply to find the feedback arc set which is NP hard so I wanted to avoid that).

Thank you for your suggestions.

回答1:

One issue is that there is may be more than one solution.

Consider for instance the DAGs {(a,b),(a,c)} and {(b,a),(b,c)}. You can "merge" these in two different ways:

  1. {(a,b),(a,c),(b,c)}
  2. {(a,c),(b,a),(b,c)}

The number of solutions grows combinatorially with the number of cycles in the union of the two graphs, so for your big graphs there is probably a huge number of ways you can "merge" them.

Do you have a criterion which could help you decide which DAG is "better" than another?



回答2:

Assuming the vertices are the same for both the graphs if not, just consider

V = V1 U V1

Lets assume you've got an adjacency list. Now for every vertex v in V1 and V2, you can sort the adjacency list by the vertex the edge leads to (if it's (vertex, weight) pair, sort by vertex). This shouldn't be so expensive since the graph is small, and it would be summation degree(v)*log(degree(v)) which should not be so bad.

After this you can iterate for all vertices v in V1 and V2, and do a merge sort of the adjacency lists of v in V1 and V2. This is similar to finding union of 2 sorted arrays using merge sort, only that where you'll find an element occurring in both you can choose the larger edge.



回答3:

I had a similar problem that I solved like so:

I transformed my DAG into a map with a map of nodes (keyed by node id, value a collection of nodes, probably one to start) and a map of edges (keyed by source, target pair, value is a collection of edges, probably one to start). I called this normalize. The original graph was a collection of "roots" each node had a collection of children

I could then merge two of these together by merging nodes by key and edges by key. ie: if a node did exist then cons the new node to the existing node value, if a node does not exist, then add it. same with the edges.

This worked pretty clean but it didn't avoid cycles.

Here is some code (clojure) that I used:

(def empty-graph
   {:nodes {}
    :edges {}})

(defn merge-graphs
  [a b]
  {:nodes (merge-with concat (get a :nodes) (get b :nodes))
   :edges (merge-with concat (get a :edges) (get b :edges))})

(defn normalize-graph
  [graph]
  {:nodes (->>
            graph
            (mapcat
              (fn [root]
                (->>
                  root
                  (tree-seq (fn [_] true) (fn [node] (get node :children)))
                  (mapcat
                    (fn [edge]
                      (concat
                        (if-let [source (get edge "source")]
                          [[source [source]]]
                          [])
                        (if-let [target (get edge "target")]
                          [[target [target]]]
                          [])))))))
            (into {}))
   :edges (->>
            graph
            (mapcat
              (fn [root]
                (->>
                  root
                  (tree-seq (fn [_] true) (fn [node] (get node :children)))
                  (filter (fn [edge] (and (not (nil? (get edge "source"))) (not (nil? (get edge "target")))))) ;this sucks but is necessary
                  (map
                    (fn [edge]
                      (let [compact (dissoc edge :children)]
                        [[(get edge "source") (get edge "target")] [compact]]))))))
            (into {}))})