Algorithm to clone a tree is quite easy, we can do pre-order traversal for that. Is there an efficient algorithm to clone a graph?
I tried a similar approach, and concluded we need to maintain a hash-map of nodes already added in the new graph, else there will be duplication of nodes, since one node can have many parents.
As you copy nodes from the source graph, place each node in a
<Node, Integer>
map (thus numbering the nodes from 1 to N).As you paste nodes in the destination graph, place each node in a
<Integer, Node>
map (again numbering from 1 to N, but with mapping in reverse direction for reasons that will be clear soon).When you find a backlink in the source graph, you can map that backlinked "source copy" node to an integer
i
using the first map. Next, you use the second map to lookup the integeri
and find the "destination copy" node which needs to have the same backlink created.To make the cloning efficient, use two things:
list
,queue
orstack
Map
can be quickly searched i.e.O(1)
amortized.Algorithm:
To be processed --> Queue
Queue -> To be processed
.Your hash map approach seems viable if you had some quick way of uniquely identifying every node.
Otherwise, you would be best off if you:
used data-structure to store graph that allows you to store "is duplicated" unique flag directly in every node (for example,
0
not duplicated,1
tonumberOfNodes
for duplicated nodes),traversed nodes (via BFS, DFS) taking care of the already duplicated nodes and reconstructing all connections from the starting graph.
Both your starting graph and finishing graph should have corresponding unique "is duplicated" flag in every node, so you are able to reconstruct connections from starting graph properly. Of course, you could use "is duplicated" flag and "unique identifier" as separate variables.
It suffices to do a depth first search and copy each node as it's visited. As you say, you need some way of mapping nodes in the original graph to corresponding copies so that copies of cycle and cross edges can be connected correctly.
This map also suffices to remember nodes already visited in the DFS.
So in Java you'd have something like:
Note that you don't want to override
equals
orhashCode
forNode
. The map containing copies relies on reference equality as defined byObject
.A different approach is to put an additional field in each node that points to the copy if there is one and is null otherwise. This merely implements the map by another method. But it requires two passes over the graph: one to make the copy and another to clear these map fields for future re-use.