Partial subtree-matching algorithm

2019-04-02 07:43发布

问题:

I searched the forums but I couldn't figure it out if the other similar questions are necessarily related to this one.

What I'm trying to do is match a subtree to a tree of objects.

I know there are pattern matching algorithms based on suffix trees or automata, but I'm not sure whether they apply here.

I am trying to match the subtree given by the red nodes in the picture against the bigger tree, regardless of the overall structure of the tree or whether the red nodes have children or not.

The reason straightforward pattern matching does not work is that no ordering of the nodes (post/preorder, breadth) will be usable.

So I'm thinking of writing a recursive algorithm that starts from the root of the subtree and tries to match nodes and then their children.

I was wondering if there exists any such (efficient algorithm). Apologies if this has already been asked.

回答1:

Here are a few references that may help you:

Aho's Efficient Tree Pattern Matching: an Aid to Code Generation

Stanford's Tree pattern matching programs

From IMECS 2011: Tree Pattern Matching Algorithm Using a Succinct Data Structure



回答2:

It appears that what I was looking for was an algorithm to solve the "tree inclusion problem". I have found some useful documents:

Fast Algorithms for Finding Nearest Common Ancestors

The Tree Inclusion Problem: In Optimal Space and Faster

Tree Isomorphism And Related Problems

I translated one of the algorithms from the last paper into C# (returns the number of pairs in a largest matching between first-level subtrees of a and b).

public static int Match(Node a, Node b, NodeSimilarityComparer comp) {
  if (!comp.Equals(a, b))
    return 0;
  int m = a.SubtreeCount;
  int n = b.SubtreeCount;
  var matrix = new int[m + 1, n + 1];

  for (int i = 0; i <= m; ++i)
    matrix[i, 0] = 0;

  for (int j = 0; j <= n; ++j)
    matrix[0, j] = 0;

  for (int i = 1; i <= m; ++i)
    for (int j = 1; j <= n; ++j) {
      var ai = a.GetSubtree(i - 1);
      var bj = b.GetSubtree(j - 1);
      var match = Match(ai, bj, comp);
      matrix[i, j] = Math.Max(Math.Max(matrix[i, j - 1], matrix[i - 1, j]), matrix[i - 1, j - 1] + match);
    }
  return matrix[m, n] + 1;
}

Hope this helps others too.