How does this BST node-deletion algorithm work?

2019-07-13 11:37发布

问题:

I'm trying to follow the BST algorithm in "Data Structures and Algorithms" by Granville Barnett, but I don't understand the node-deletion algorithm it describes below.

Section 3.3 (p. 22)

Removing a node from a BST is fairly straightforward, with four cases to consider:

  1. the value to remove is a leaf node; or
  2. the value to remove has a right subtree, but no left subtree; or
  3. the value to remove has a left subtree, but no right subtree; or
  4. the value to remove has both a left and right subtree in which case we promote the largest value in the left subtree.

Figure 3.2 (p. 22)

    23
   /  \
  14   31
 /
7
 \
  9
  • Case #1 points to node 9.
  • Case #2 points to node 7.
  • Case #3 points to node 14.
  • Case #4 points to node 23.

I interpret the text above for #4 to mean that when we remove 23, we promote 14 to root and make 31 its right child:

  14
 /  \
7   31
 \
  9

...but the book's algorithm (from p. 23) for case #4 bamboozles me (I've rewritten it here in Java):

1 boolean remove(T value) {
2   // ...
3
4   // case #4
5   Node largestValueNode = nodeToRemove.left;
6   while (largestValueNode.right != null) {
7       // find the largest value in the left subtree of nodeToRemove
8       largestValueNode = largestValueNode.right;
9   }
10  
11  // delete the right child of largestValueNode's parent
12  findParent(largestValueNode.value).right = null;
13  nodeToRemove.value = largestValueNode.value;
14  
15  count--;
16  return true; // successful
17}

If I follow the algorithm, the largestValueNode is node 14, so its parent is node 23. Why does the algorithm nullify the right child of the parent?

Why does line 13 copy the largestValueNode's value into the node to be deleted?

I would've expected lines 11-13 to be:

11  if (largestValueNode != null)
12      largestValueNode.right = nodeToRemove.right;
13  nodeToRemove.right = null;

EDIT:

The book's algorithm indeed has a bug. The fix is below:

1 boolean remove(T value) {
2   // ...
3
4   // case #4
5   Node largestValueNode = nodeToRemove.left;
6   while (largestValueNode.right != null) {
7       // find the largest value in the left subtree of nodeToRemove
8       largestValueNode = largestValueNode.right;
9   }
10  
11  Node p = findParent(largestValueNode.value);
12  if (p != null) {
13      if (nodeToRemove == p)
14          nodeToRemove.left = largestValueNode.left;
15      else
16          p.right = largestValueNode.left;
17  }
18  nodeToRemove.value = largestValueNode.value;
19  
20  count--;
21  return true; // successful
22}

回答1:

if you do this

11  if (largestValueNode != null)
12      largestValueNode.right = nodeToRemove.right;
13  nodeToRemove.right = null;

you're not considering the case where 14 might have a right child. For example:

     23
    / \
   14  31
  / \
 7   15
  \
   9

You're solution when removing 23 should be

     15
    / \
   14  31
  / 
 7  
  \
   9

So you're setting the right child of 15's original parent, 14 to null. This is what the first code is doing.

Edit: Addressing your comment

With your solution, you'll get

     23
    / 
   14  
  / \
 7   15
  \   \
   9   31

Also, the original code is also wrong; try something like this:

if(nodeToRemove == findParent(largestValueNode.value))
   nodeToRemove.left = largestValueNode.left
else
   findParent(largestValueNode.value).right = largestValueNode.left
nodeToRemove.value = largestValueNode.value

Also to answer, "Why does line 13 copy the largestValueNode's value into the node to be deleted?"

We're deleting largestValueNode, before which we're storing it's value in the nodeToRemove



回答2:

Seems like the book's algorithm is wrong for this particular example (assuming you have translated to Java perfectly :)). It is doing what you have mentioned but it is right for the case:

where nodeToRemove = 23 and in your BST 14 had a right child 15. The book's algorithm would replace 23 with 15 here and set 14's right child to null. Your algorithm would fail in this case.



回答3:

Take a careful look at the line:

largestValueNode.right = nodeToRemove.right;

Note how this line causes 14 to look like this (ignoring grandchildren):

  14
 /  \
7   31

But this is exactly what is desired! Since 14 now has 31 as its right child, it's no longer correct for 31 to be the right child of 15, so for clean-up's sake, the right child of 15 is set to NULL.



回答4:

Good to know the original code is wrong - I've just spent a few hours on it and was thinking the whole time that I was missing something. There's an NPE problem if the root element is passed, and anyway root element removal is not factored in.

Here's my Java implementation that could probably use some optimization - suggestions welcome. O (n log n) worst case. Tests below.

public boolean remove(final T value0) {
    BinarySearchTreeNode<T> target = findNode(value0);

        // Node DNE
        if (target == null) {
            return false;
        }

        // Both children populated, no need for parent
        if (target.right != null && target.left != null) {
            BinarySearchTreeNode<T> max = maxChild(target.left);
            findParent(max.value).right = null;
            target.value = max.value;
        }
        // Root element targeted, parent DNE
        else if (target == root) {
            if (target.right == null && target.left == null) {
                root = null;
            }
            else if (target.right == null) {
                root = target.left;
            }
        else {
            root = target.right;
        }
    }
    // Non-root, single-child node - find if L or R child, update parent reference.
    else {
        BinarySearchTreeNode<T> parent = findParent(value0);

        if (target.right == null && target.left != null) {
            if (target.value.compareTo(parent.value) < 0) {
                parent.left = target.left;
            }
            else {
                parent.right = target.left;
            }
        }
        else if (target.right != null && target.left == null) {
            if (target.value.compareTo(parent.value) < 0) {
                parent.left = target.right;
            }
            else {
                parent.right = target.right;
            }
        }
    }       

    return true;
}

Unit tests (all pass, obviously):

package BinarySearchTreeTests;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;

import org.junit.Before;
import org.junit.Test;

public class Remove {
    BinarySearchTree<Integer> tree;

@Before
public void setUp() {
    tree = new BinarySearchTree<Integer>();
}

@Test
public void fromEmptyTree() {
    assertFalse(tree.remove(8));
}

@Test
public void fromTreeWithOnlyRootNode() {
    tree.add(10);
    assertTrue(tree.remove(10));
    assertNull(tree.root);
}

@Test
public void nonexistentElement() {
    tree.add(10);
    assertFalse(tree.remove(8));
}

/**
 *     N
 * 10--|
 *     |  6
 *     5--|
 *        3
 */
@Test
public void nodeWithNoRightChildren() {
    tree.add(10);
    tree.add(5);
    tree.add(6);
    tree.add(3);
    tree.remove(10);
    assertEquals(tree.root.value, Integer.valueOf(5));
    assertEquals(tree.root.left.value, Integer.valueOf(3));
    assertEquals(tree.root.right.value, Integer.valueOf(6));
}

/**
 *         17
 *     15--|
 *     |   13
 * 10--|
 *     N
 */
@Test
public void nodeWithNoLeftChildren() {
    tree.add(10);
    tree.add(15);
    tree.add(17);
    tree.add(13);
    tree.remove(10);
    assertEquals(tree.root.value, Integer.valueOf(15));
    assertEquals(tree.root.left.value, Integer.valueOf(13));
    assertEquals(tree.root.right.value, Integer.valueOf(17));
}

/**
 *           19
 *        17-|
 *        |  16
 *     15-|
 *     |  |  14
 *     |  13-|
 *     |     12
 * 10--|
 *     N
 */       
@Test
public void nodeWithLeftAndRightChildren() {
    tree.add(10);
    tree.add(15);
    tree.add(17);
    tree.add(13);
    tree.add(19);
    tree.add(16);
    tree.add(14);
    tree.add(12);

    tree.remove(15);
    assertEquals(tree.root.right.value, Integer.valueOf(14));
    assertNull(tree.root.right.left.right);
}

/**
 *           18
 *        15-|
 *        |  [ALWAYS EMPTY]
 *     15-|
 *     |  |  13
 *     |  12-|
 *     |     11
 * 10--|
 *     N
 * 
@Test
public void removeDuplicate() {
    Above diagram shows duplicate cases are already tested implicitly.
    fail();
} */
}