Treap: Analysis
Operation costs:
find
is $O(h)$ where $h$ is the height of the tree (look-up in BST over keys, ignoring priorities).insert
is $O(h)$- at most $O(h)$ to find where the key must be inserted,
- at most $O(h)$ rotation to bring newly inserted node to the root (to fix heap order property)
remove
is $O(h)$- at most $O(h)$ to find the target,
- at most $O(h)$ rotation to bring the target to level $0$ (make it a leaf so it can easily be removed)
So what is the height of the treap?
- In the best case, it is $O(\lg n)$.
- In the worst case, it is $O(n)$.
- In the average case, it can be shown the height is $O(\lg n)$.
The original paper by Aragon and Siedel in 1989 showed that in a treap where the priorities are independently and uniformly distributed continuous random variables, the expected depth of any node is $O(\lg n)$. Consequently, it follows that the expected running time for any of the core operations is $O(\lg n)$.
The take-away: when inserting a new entry, generate a random real number between $0$ and $1$ and use that number as the priority of the new node. Using "real numbers" implies the probability of two nodes having the same priority is (almost) zero. In practice, we can use random integers from a large range, like $0$ to $2^{31} − 1$ (which is generated by Java's Random.nextInt
method), or random floating-point numbers.
In contrast to Treap, AVL tree guarantees $O(\lg n)$ height. However, there is no need to track nodes' height or calculate balance factors in a treap. Moreover, there are fewer rotation types, and the direction of rotations is easier to determine. In benchmark experiments, treap has shown to have good performance and often outperforms the AVL tree.