Exercise 12.3.5

Suppose that instead of each node $x$ keeping the attribute $x.p$ pointing to $x$'s parent, it keeps $x.succ$, pointing to $x$'s successor. Give pseudocode for SEARCH, INSERT, and DELETE on a binary search tree $T$ using this representation. These procedures should operate in time $\O(h)$, where $h$ is the height of the tree $T$. (Hint: You may wish to implement a subroutine that returns the parent of a node.)

I've got to admit, I was skeptical about this one. It was quite a good exercise, however, as it forces you to internalize a lot about binary search trees and their predecessor/successor invariants.

The pseudocode and the C code differ a bit, as the C code is a bit more complicated to avoid a few extra steps. It even can be optimized a bit further, but it's not worth the complexity. The asymptotic bounds are going to be the same as the pseudocode

Some thoughts

I pondered a bit about why it would be useful to keep track of the successor instead of the parent. I may be missing something, but I can find a single advantage – the tree can be walked inorder without extra space and a bit more optimally. Another benefit is that it simplifies finding the successor of the deleted node in DELETE. Apart from that, not keeping track of the parent introduces some overhead, especially when deleting.

Notable differences

The pseudocode needs two change in two significant ways:

Pseudocode

SEARCH remains unchanged:

TREE-SEARCH(x, k)
    if x = NIL or k == x.key
        return x
    if k < x.key
        return TREE-SEARCH(x.left, k)
    else
        return TREE-SEARCH(x.right, k)

INSERT needs to allocate the correct successor of the newly inserted node and update it's predecessor. Since we're only inserting leaf nodes, both predecessor and successor are findable through their invariants:

The final one, DELETE is a bit more complicated. It needs TREE-PARENT, that finds the parent of a node, starting from the root, and TREE-PREDECESSOR that finds finds the predecessor of a node, starting from the root. It also needs a modified version of TRANSPLANT. Since those are pretty self-explanatory, let's start with DELETE and explore them afterwards.

It's worth noting that we don't need to call TREE-MINIMUM on z.right in the two-children case, as we already know its successor.

TREE-DELETE(T, z)
    pred = TREE-PREDECESSOR(z)

    if z.left == NIL
        TRANSPLANT(T, z, z.right)
    else if z.right == NIL
        TRANSPLANT(T, z, z.left)
    else
        y = z.succ

        if (z.right != y)
            TRANSPLANT(T, y, y.right)
            y.right = z.right

        TRANSPLANT(T, z, y)
        y.left = z.left

    if pred != NIL
        pred.succ = z.succ

Node that y.p != z needs to be changed to z.right != y, which means the same thing.

TRANSPLANT now needs to explicitly find the parent:

TRANSPLANT(T, u, v)
    p = TREE-PARENT(T, u)

    if p == NIL
        T.root = v
    else if u == p.left
        p.left = v
    else
        p.right = v

The notable change here is that we don't need to update the v's parent, as we don't keep track of it.

Finally, TREE-PREDECESSOR is fairly needs to start from the root, instead of the node:

TREE-PREDECESSOR(T, x)
    y = NIL
    z = T.root

    while z != x
        if z.key < x.key
            y = z
            z = z.right
        else
            z = z.left

    z = z.left

    while z != NIL
        y = z
        z = z.right

    return y

The first loop traverses the tree from the root to x, keeping track of the last right branch to mark it as a potential predecessor. When it finishes, z equals x. If there is no left child, we're done, and the ancestor is the predecessor. If there is a left child, we follow it and keep going left, until there are no no mode left children to follow. The last is essentially what TREE-MINIMUM does. We could change the order of the loops (find the minimum of the left child and go through the root only if it doesn't exist). While it will improve the constant, it's asymptotically the same.

The C code merges TREE-PREDECESSOR and TREE-PARENT into one.

It's pretty easy to illustrate that the result is $\O(h)$, so I won't bother.


C code

#include <stdlib.h>

struct node_t {
    struct node_t *left;
    struct node_t *right;
    struct node_t *succ;
    int key;
};
typedef struct node_t node_t;

typedef struct {
    node_t *root;
} tree_t;

tree_t *make_tree() {
    tree_t *tree = malloc(sizeof(tree_t));
    tree->root = NULL;
    return tree;
}

node_t *make_node(int key) {
    node_t *node = malloc(sizeof(node_t));

    node->succ = NULL;
    node->left = NULL;
    node->right = NULL;
    node->key = key;

    return node;
}

node_t *insert(tree_t *tree, int key) {
    node_t *parent = NULL;
    node_t *current = tree->root;
    node_t *new = make_node(key);
    node_t *pred = NULL;

    while (current) {
        parent = current;
        if (new->key < current->key) {
            new->succ = current;
            current = current->left;
        } else {
            pred = current;
            current = current->right;
        }
    }

    if (!parent) {
        tree->root = new;
    } else if (new->key < parent->key) {
        parent->left = new;
    } else {
        parent->right = new;
        pred = parent;
    }

    if (pred) pred->succ = new;

    return new;
}

node_t *find_parent(tree_t *tree, node_t *node) {
    node_t *previous = NULL;
    node_t *current = tree->root;

    while (current && current->key != node->key) {
        if (current->key < node->key) {
            previous = current;
            current = current->right;
        } else {
            previous = current;
            current = current->left;
        }
    }

    return previous;
}

void find_parent_and_predecessor(tree_t *tree, node_t *node, node_t **parent, node_t **predecessor) {
    *parent = NULL;
    *predecessor = NULL;
    node_t *current = tree->root;

    while (current && current->key != node->key) {
        if (current->key < node->key) {
            *parent = current;
            *predecessor = current;
            current = current->right;
        } else {
            *parent = current;
            current = current->left;
        }
    }

    if (!current) return;

    current = current->left;

    while (current) {
        *predecessor = current;
        current = current->right;
    }
}

void transplant(tree_t *tree, node_t *parent, node_t *target, node_t *source) {
    if (!parent) {
        tree->root = source;
    } else if (target == parent->left) {
        parent->left = source;
    } else {
        parent->right = source;
    }
}

void delete_tree(tree_t *tree, node_t *node) {
    node_t *parent;
    node_t *predecessor;

    find_parent_and_predecessor(tree, node, &parent, &predecessor);

    if (!node->left) {
        transplant(tree, parent, node, node->right);
    } else if (!node->right) {
        transplant(tree, parent, node, node->left);
    } else {
        node_t *successor = node->succ;

        if (node->right != successor) {
            node_t *sparent = find_parent(tree, successor);
            transplant(tree, sparent, successor, successor->right);
            successor->right = node->right;
        }

        transplant(tree, parent, node, successor);
        successor->left = node->left;
    }

    if (predecessor) predecessor->succ = node->succ;
}

node_t *search(tree_t *tree, int key) {
    node_t *node = tree->root;

    while (node) {
        if (node->key == key) {
            return node;
        } else if (node->key < key) {
            node = node->right;
        } else {
            node = node->left;
        }
    }

    return NULL;
}

typedef void (*callback_t)(node_t *);

void inorder_walk(node_t *node, callback_t callback) {
    if (!node) return;
    inorder_walk(node->left, callback);
    callback(node);
    inorder_walk(node->right, callback);
}

void successor_walk(node_t *node, callback_t callback) {
    while (node->left) node = node->left;

    while (node) {
        callback(node);
        node = node->succ;
    }
}