Problem 13.4

Treaps

If we insert a set of nn items into a binary search tree, the resulting tree may be horribly unbalanced, leading to long search times. As we saw in Section 12.4, however, randomly built binary search trees tend to be balanced. Therefore, one strategy that, on average, builds a balanced tree for a fixed set of items would be to randomly permute the items and then insert them in that order into the tree.

What if we do not have all the items at once? If we receive the items one at a time, can we still randomly build a binary search tree out of them?

We will examine a data structure that answers this question in the affirmative. A treap is a binary search tree with a modified way of ordering the nodes. Figure 13.9 shows an example. As usual, each node xx in the tree has a key value x.keyx.key. In addition, we assign x.priorityx.priority, which is a random number chosen independently for each node. The nodes of the treap are ordered so that the keys obey the binary-search-tree property and the priorities obey the min-heap property:

  • If vv is a left child of uu, then v.key<u.keyv.key < u.key.
  • If vv is a right child of uu, then v.key>u.keyv.key > u.key.
  • If vv is a child of uu, then v.priority>u.priorityv.priority > u.priority.

(This combination of properties is why the tree is called a "treap": it has features of both a binary search tree and a heap.)

It helps to think of treaps in the following way. Suppose that we insert nodes x1,x2,,xnx_1, x_2, \ldots, x_n, with associated keys, into a treap. Then the resulting treap is the tree that would have been formed if the nodes had been inserted into a normal binary search tree in the order given by their (randomly chosen) priorities, i.e., xi.priority<xj.priorityx_i.priority < x_j.priority means that we had inserted xix_i before xjx_j.

  1. Show that given a set of nodes x1,x2,,xnx_1, x_2, \ldots, x_n, with associated keys and priorities, all distinct, the treap associated with these nodes is unique.
  2. Show that the expected height of a treap is Θ(lgn)\Theta(\lg n), and hence the expected time to search for a value in the treap is Θ(lgn)\Theta(\lg n).

Let use see how to insert a new node into an existing treap. The first thing we do is assign the new node a random priority. Then we call the insertion algorithm, which we call TREAP-INSERT, whose operation is illustrated in Figure 13.10.

  1. Explain how TREAP-INSERT works. Explan the idea in English and give pseudocode. (Hint: Execute the usual binary-search-tree insertion procedure and them perform rotations to restore the min-heap order property.)
  2. Show that the expected running time of TREAP-INSERT is Θ(lgn)\Theta(\lg n).

TREAP-INSERT performs a search and then a sequence of rotations. Although these two operations have the same expected running time, they have different costs in practice. A search reads information from the treap without modifying it. In contrast, a rotation changes parent and child pointers within the treap. On most computers, read operations are much faster than write operations. Thus we would like TREAP-INSERT to perform few rotations. We will show that the expected number of rotations performed is bounded by a constant.

In order to do so, we will need some definitions, which Figure 13.11 depicts. The left spine of a binary search tree TT is the simple path from the root to the node with the smallest key. In other words, the left spine is the simple path from the root that consists of only left edges. Symmetrically, the right spine of TT is the simple path from the root consisting of only right edges. The length of a spine is the number of nodes it contains.

  1. Consider the treap TT immediately after TREAP-INSERT has inserted node xx. Let CC be the length of the right spine of the left subtree of xx. Let DD be the length of the left spine of the right subtree of xx. Prove that the total number of rotations that were performed during the insertion of xx is equal to C+DC + D.

We will now calculate the expected values for CC and DD. Without loss of generality, we assume that the keys are 1,2,,n1, 2, \ldots, n, since we are comparing them only to one another.

For nodes xx and yy in treap TT, where yxy \ne x, let k=x.keyk = x.key and i=y.keyi = y.key. We define indicator random variables:

Xik=I{y is in the right spine of the left subtree of x} X_{ik} = I\{y \text{ is in the right spine of the left subtree of } x\}

  1. Show that Xik=1X_{ik} = 1 if and only if y.priority>x.priorityy.priority > x.priority, y.key<x.keyy.key < x.key, and, for every zz such that y.key<z.key<x.keyy.key < z.key < x.key, we have y.priority<z.priorityy.priority < z.priority.
  2. Show that Pr{Xik=1}=(ki1)!(ki+1)!=1(ki+1)(ki) \begin{aligned} \Pr\{X_{ik} = 1\} &= \frac{(k - i - 1)!}{(k - i + 1)!} \\\\ &= \frac{1}{(k - i + 1)(k - i)} \end{aligned}
  3. Show that E[C]=j=1k11j(j+1)=11k \E[C] = \sum_{j=1}^{k-1} \frac{1}{j(j + 1)} = 1 - \frac{1}{k}
  4. Use a symmetry argument to show that E[D]=11nk+1 \E[D] = 1 - \frac{1}{n - k + 1}
  5. Conclude that the expected number of rotations performed when inserting a node into a treap is less than 2.

a. unique treaps

If we take any tree elements, and attempt to arrange them in a treap

Or to put it another way:

  1. the element with the lowest priority will be the root
  2. the elements with smaller keys will be in the left subtree
  3. the elements with larger keys will be in the right subtree
  4. there is only one unique way to pick the root, and the elements of each subtree
  5. this argument can be applied recursively for each subtree

b. expected height

Taking the easy way out, we already know from the text that the expected height of a randomly built search tree is Θ(lgn)\Theta(\lg n). The argument from Chapter 12.4 applies here full force.

c. explain TREAP-INSERT

It first inserts the node in the tree by finding a leaf position to put it in. After that, it compares the priority with that of the parent. If it's higher, we're done. If it's lower, we need to perform a rotation to put the newly inserted element in the place of the parent and then continue with the rest of the ancestors.

Instead of pseudocode, there is a Python implementation below.

d. expected running time

The algorithm is linear to the height of the tree (it performs at most a single rotation for each ancestor), which is Θ(lgn)\Theta(\lg n).

e. number of rotations after insertion

Let ll be the right spine of the left subtree and rr be the left spine of the right subtree.

Observe that:

  1. Both ll and rr start with zero elements.
  2. Each rotation add a single element to rr and ll. Specifically, left rotations increase ll with one element (the parent) and right rotations increase rr with one element (the parent)

f. when is a node in the right spine of the left subtree

Necessity:

If yy is in the right spine of the left subtree of xx, then:

Sufficiency:

If the following tree conditions hold:

  1. y.priority>x.priorityy.priority > x.priority
  2. y.key<x.keyy.key < x.key
  3. for every zz such that y.key<z.key<x.keyy.key < z.key < x.key: y.priority<z.priorityy.priority < z.priority

We know that yy is in the left subtree of xx (because of (1) and (2)), and all it's ancestors up until xx are smaller than it (because of (3)), and therefore it is the right child of its parent, which is the right child of its parent, and so on.

g. probability of Xik=1X_{ik} = 1

We leverage the knowledge of the previous point, and then we apply some counting.

We already know that i<ki < k, but there may be some elements between ii and kk. From (f) we know that (1) their priority needs to be larger than that of either xx or yy, and that xx have to have the smaller priority than yy.

There are in total ki+1k - i + 1 elements to consider. We can model the probability of their priority by arranging them in order from the smallest to the largest priority. There are thus (ki+1)!(k - i + 1)! possible ways to arrange the priorities (this is the denominator). For the ones that satisfy the conditions in (f), we are looking for xx being the first, yy being the second, and (ki1)!(k - i - 1)! possible ways to arrange the rest (the numerator).

h. expectation of CC

So, the expectation E[C]\E[C] is X12+X13++Xk1,kX_{12} + X_{13} + \ldots + X_{k-1,k}, that is:

E[C]=i=1k11(ki+1)(ki)=j=1k11j(j+1)(let j=ki)=j=1k1(1j(j+1)jj(j+1)+jj(j+1))=j=1k1(j+1j(j+1)jj(j+1))=j=1k1(1j1j+1)=j=1k11jj=1k11j+1=1+j=2k11jj=1k21j+11k1+1(by taking one element out each sum)=1+j=2k11jj=2k11j1k(by letting j=j+1 in the second sum)=11k(by letting the sums cancel each other out) \begin{aligned} \E[C] &= \sum_{i=1}^{k-1} \frac{1}{(k - i + 1)(k - i)} \\ &= \sum_{j=1}^{k-1} \frac{1}{j(j+1)} && \text{(let } j = k - i \text{)} \\ &= \sum_{j=1}^{k-1} \left( \frac{1}{j(j+1)} - \frac{j}{j(j+1)} + \frac{j}{j(j+1)} \right) \\ &= \sum_{j=1}^{k-1} \left( \frac{j + 1}{j(j+1)} - \frac{j}{j(j+1)} \right) \\ &= \sum_{j=1}^{k-1} \left( \frac{1}{j} - \frac{1}{j+1} \right) \\ &= \sum_{j=1}^{k-1} \frac{1}{j} - \sum_{j=1}^{k-1} \frac{1}{j+1} \\ &= 1 + \sum_{j=2}^{k-1} \frac{1}{j} - \sum_{j=1}^{k-2} \frac{1}{j+1} - \frac{1}{k - 1 + 1} && \text{(by taking one element out each sum)} \\ &= 1 + \sum_{j=2}^{k-1} \frac{1}{j} - \sum_{j=2}^{k-1} \frac{1}{j} - \frac{1}{k} && \text{(by letting } j = j + 1 \text{ in the second sum)} \\ &= 1 - \frac{1}{k} && \text{(by letting the sums cancel each other out)} \end{aligned}

i. symmetry for E[D]\E[D]

The same approach holds here, symmetrically. The only thing we need to consider, is that instead of looking at 1,2,,k1, 2, \ldots, k, we have to look at k,k+1,,nk, k + 1, \ldots, n. There are nk+1n - k + 1 elements in that sequence, and if we apply the same math, we get the expectation:

E[D]=11nk+1 \E[D] = 1 - \frac{1}{n - k + 1}

j. conclusion

The expected number of rotations, is then:

E[C+D]=E[C]+E[D]=11k+11nk1<2=O(1) \E[C + D] = \E[C] + \E[D] = 1 - \frac{1}{k} + 1 - \frac{1}{n - k - 1} < 2 = \O(1)


Python code

from collections import deque


class Node:
    def __init__(self, key, priority, left=None, right=None):
        self.key = key
        self.priority = priority
        self.left = left
        self.right = right

    def __str__(self):
        def dump(node):
            if not node:
                return "NIL"
            else:
                return f"{node.key}:{node.priority}({dump(node.left)}, {dump(node.right)})"

        return dump(self)

    def left_rotate(self):
        child = self.right
        assert(child)
        self.right = child.left
        child.left = self

        return child

    def right_rotate(self):
        child = self.left
        assert(child)
        self.left = child.right
        child.right = self

        return child

    __repr__ = __str__


class Treap:
    def __init__(self):
        self.root = None

    def __str__(self):
        if not self.root:
            return "NIL"
        else:
            return str(self.root)

    __repr__ = __str__

    def nodes(self):
        if not self.root:
            return

        remaining = deque()
        remaining.append(self.root)

        while remaining:
            node = remaining.popleft()
            if node.left:
                remaining.append(node.left)
            if node.right:
                remaining.append(node.right)
            yield node

    def insert(self, key, priority=None):
        def insert_node(subtree, node):
            if not subtree:
                return node
            elif node.key < subtree.key:
                subtree.left = insert_node(subtree.left, node)
                if subtree.priority > subtree.left.priority:
                    subtree = subtree.right_rotate()
                return subtree
            else:
                subtree.right = insert_node(subtree.right, node)
                if subtree.priority > subtree.right.priority:
                    subtree = subtree.left_rotate()
                return subtree

        new = Node(key, priority=priority)
        self.root = insert_node(self.root, new)

    def search(self, key):
        node = self.root

        while node:
            if node.key == key:
                return node
            elif key < node.key:
                node = node.left
            else:
                node = node.right

        return None