I want to compute the Zhang-Shasha tree-edit distance between two trees (using the zss
library). However, my trees are stored as networkx
graphs (they actually represent HTML DOM trees). The example in the zss documentation shows how to create a tree by hand:
from zss import Node, simple_distance
A = (
    Node("f")
    .addkid(Node("a")
        .addkid(Node("h"))
        .addkid(Node("c")
            .addkid(Node("l"))))
    .addkid(Node("e"))
)
simple_distance(A, A)  # zero: a tree is at distance 0 from itself
Which would be the same tree as:
import networkx as nx
G = nx.DiGraph()
G.add_edges_from([('f', 'a'), ('a', 'h'), ('a', 'c'), ('c', 'l'), ('f', 'e')])
So I would like to convert a networkx tree into a zss
Node object, then compute the edit distance between two trees.
Thanks
(And do not hesitate to tell me if you think this is an XY problem.)
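Using dfs_tree can definitely help: it returns a directed tree of G rooted at the source node you give it, and each of its (parent, child) edges can be turned into an addkid call on the matching zss Node.
In case we don't know which node is G's root node, but know we have a valid tree, we can get the source node by picking the node whose in-degree is 0.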
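The answer's original snippet is not shown here, so the following is only a minimal sketch of one way to do that lookup. It assumes networkx 2.x or later (where `G.in_degree()` yields `(node, degree)` pairs) and that edges point from parent to child, as in the question's example graph.

```python
import networkx as nx

# Same tree as in the question.
G = nx.DiGraph()
G.add_edges_from([('f', 'a'), ('a', 'h'), ('a', 'c'), ('c', 'l'), ('f', 'e')])

# Collect every node that has no incoming edge.
roots = [node for node, degree in G.in_degree() if degree == 0]

# A valid tree has exactly one such node; fail loudly otherwise.
if len(roots) != 1:
    raise ValueError("expected exactly one root, found %d" % len(roots))

root = roots[0]
print(root)  # f
```
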
Since the root is the only node with no incoming edges, that should work.