Author: Hayden Smith 2021
Why?
When you insert into a BST, there are two extreme cases:
Best case: items arrive in an order that keeps the tree balanced, so searching costs O(log(n)).
Worst case: items arrive in sorted (or reverse-sorted) order, so the tree degenerates into a long chain and searching costs O(n).
In reality, the average case is a mix between these two. This is still a search cost of O(log(n)), though the real cost is more than the best case.
What?
The formal definition for a weight-balanced BST is a tree where, for all nodes: the number of nodes in the left and right subtrees differs by at most one.
The formal definition for a height-balanced BST is a tree where, for all nodes: the heights of the left and right subtrees differ by at most one.
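As a concrete reading of these two definitions, checks for both properties might look like the sketch below. The Node and Tree types and the function names are assumptions here, a simplified form of the struct that appears at the end of these notes.

#include <stdbool.h>
#include <stddef.h>

// Assumed node layout: a simplified form of the struct shown at the end of these notes.
typedef struct node { int data; struct node *left, *right; } Node;
typedef Node *Tree;

// Height of a subtree; an empty tree has height -1.
static int height(Tree t) {
    if (t == NULL) return -1;
    int lh = height(t->left), rh = height(t->right);
    return 1 + (lh > rh ? lh : rh);
}

// Number of nodes in a subtree.
static int count(Tree t) {
    return t == NULL ? 0 : 1 + count(t->left) + count(t->right);
}

// Height-balanced: at every node, subtree heights differ by at most one.
bool isHeightBalanced(Tree t) {
    if (t == NULL) return true;
    int diff = height(t->left) - height(t->right);
    return diff >= -1 && diff <= 1
        && isHeightBalanced(t->left) && isHeightBalanced(t->right);
}

// Weight-balanced: at every node, subtree sizes differ by at most one.
bool isWeightBalanced(Tree t) {
    if (t == NULL) return true;
    int diff = count(t->left) - count(t->right);
    return diff >= -1 && diff <= 1
        && isWeightBalanced(t->left) && isWeightBalanced(t->right);
}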
There are 4 key operations we will look at that help us balance a BST:
Left rotation: move the right child up to the node's position, then rearrange links to retain the search order.
Right rotation: move the left child up to the node's position, then rearrange links to retain the search order.
Insertion at root: each new item is added at the root node (it is added at a leaf, then rotated up).
Partition: rearrange the tree around a specified node (push it up to the root).
[Diagram: an example BST containing 2, 3, 4, 5, 6, showing the effect of rotating node 5 right and then rotating node 3 left.]
Formally, the process for rotating is (shown here for a right rotation):
rotateRight(root):
    newRoot = (root's left child)
    if (root) is empty or (newRoot) is empty:
        return root
    (root's left child) should now point to (newRoot's right child)
    (newRoot's right child) should now point to root
    return newRoot

rotateLeft(root):
    newRoot = (root's right child)
    if (root) is empty or (newRoot) is empty:
        return root
    (root's right child) should now point to (newRoot's left child)
    (newRoot's left child) should now point to root
    return newRoot
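As a concrete sketch of these two routines, one possible C version is shown below. The Node and Tree types are assumptions (a simplified form of the struct shown at the end of these notes), and the code is an illustration rather than the only correct way to write it.

#include <stddef.h>

// Assumed types: a simplified form of the struct at the end of these notes.
typedef struct node { int data; struct node *left, *right; } Node;
typedef Node *Tree;

// Right rotation: the root's left child becomes the new root.
Tree rotateRight(Tree root) {
    if (root == NULL || root->left == NULL) return root;
    Tree newRoot = root->left;
    root->left = newRoot->right;   // root adopts newRoot's old right subtree
    newRoot->right = root;         // old root becomes newRoot's right child
    return newRoot;
}

// Left rotation: the root's right child becomes the new root.
Tree rotateLeft(Tree root) {
    if (root == NULL || root->right == NULL) return root;
    Tree newRoot = root->right;
    root->right = newRoot->left;   // root adopts newRoot's old left subtree
    newRoot->left = root;          // old root becomes newRoot's left child
    return newRoot;
}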
Approach: Insert new items at the root
[Diagram: inserting 24 into a BST containing 10, 5, 14, 30, 32, 29 using insert-at-root. 24 is added as a leaf, then rotated up to the root: rotate 29 right, rotate 30 right, rotate 14 left, rotate 10 left.]
Pseudocode
insertAtRoot(tree, item):
    root = get root of "tree"
    if tree is empty:
        tree = new node containing item
    else if item < root's value:
        (tree's left child) = insertAtRoot((tree's left child), item)
        tree = rotateRight(tree)
    else if item > root's value:
        (tree's right child) = insertAtRoot((tree's right child), item)
        tree = rotateLeft(tree)
    return tree
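A C sketch of insert-at-root, under the same assumed Node/Tree types as before, might look like the following. The newNode helper is hypothetical, and the rotations are repeated in compact form so the snippet stands alone.

#include <stdlib.h>

typedef struct node { int data; struct node *left, *right; } Node;
typedef Node *Tree;

// Compact copies of the rotations sketched earlier.
static Tree rotateRight(Tree t) {
    if (t == NULL || t->left == NULL) return t;
    Tree r = t->left; t->left = r->right; r->right = t; return r;
}
static Tree rotateLeft(Tree t) {
    if (t == NULL || t->right == NULL) return t;
    Tree r = t->right; t->right = r->left; r->left = t; return r;
}

// Hypothetical helper: allocate a single leaf node (error handling omitted).
static Tree newNode(int item) {
    Tree t = malloc(sizeof(Node));
    t->data = item;
    t->left = t->right = NULL;
    return t;
}

// Insert item so it finishes at the root: insert recursively as usual,
// then rotate the new item up one level on the way back out of the recursion.
Tree insertAtRoot(Tree tree, int item) {
    if (tree == NULL) {
        tree = newNode(item);
    } else if (item < tree->data) {
        tree->left = insertAtRoot(tree->left, item);
        tree = rotateRight(tree);
    } else if (item > tree->data) {
        tree->right = insertAtRoot(tree->right, item);
        tree = rotateLeft(tree);
    }
    return tree;
}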
partition(tree, index)
Rearranges the tree so that the element with index i becomes the root. Indexes are based on an in-order traversal.
If we partition at the middle index, quite often we'll end up with a more balanced tree.
Pseudocode
partition(tree, index):
    m = #nodes in the tree's left subtree
    if index < m:
        (tree's left subtree) = partition(tree's left subtree, index)
        tree = rotateRight(tree)
    else if index > m:
        (tree's right subtree) = partition(tree's right subtree, index - m - 1)
        tree = rotateLeft(tree)
    return tree

The time complexity is similar to insert-at-root, and it gets cheaper the closer the chosen index is to the root.
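Under the same assumptions as the earlier sketches, a C version of partition could look like this. The countNodes helper is an O(n) placeholder; storing a size in each node (discussed below) would make that step O(1).

#include <stddef.h>

typedef struct node { int data; struct node *left, *right; } Node;
typedef Node *Tree;

// O(n) size helper; a per-node size field (see below) would avoid recomputing this.
static int countNodes(Tree t) {
    return t == NULL ? 0 : 1 + countNodes(t->left) + countNodes(t->right);
}

// Compact copies of the rotations sketched earlier.
static Tree rotateRight(Tree t) {
    if (t == NULL || t->left == NULL) return t;
    Tree r = t->left; t->left = r->right; r->right = t; return r;
}
static Tree rotateLeft(Tree t) {
    if (t == NULL || t->right == NULL) return t;
    Tree r = t->right; t->right = r->left; r->left = t; return r;
}

// Rearrange the tree so the node with in-order index 'index' becomes the root.
Tree partition(Tree tree, int index) {
    if (tree == NULL) return tree;
    int m = countNodes(tree->left);   // in-order index of the current root
    if (index < m) {
        tree->left = partition(tree->left, index);
        tree = rotateRight(tree);
    } else if (index > m) {
        tree->right = partition(tree->right, index - m - 1);
        tree = rotateLeft(tree);
    }
    return tree;
}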
A simple approach to using partition selectively.
Let's rebalance the tree every k inserts
periodicRebalance(tree, item):
    tree = insertAtLeaf(tree, item)
    if #nodes in tree % k == 0:
        tree = partition(tree, middle index)
    return tree

The problem? To find the #nodes in a tree we need to do an O(n) traversal, which can be expensive to do often.
Is it going to be "better" to store the number of nodes in any given subtree within the node itself? Yes? No?
typedef struct Node {
    int data;
    int nnodes;        // #nodes in my tree
    Tree left, right;  // subtrees
} Node;
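As one possible sketch of how that field could be maintained, an ordinary leaf insertion might update nnodes on the way back up, so the size check in periodicRebalance becomes O(1). The insertAtLeaf name and the size helper are assumptions here.

#include <stdlib.h>

typedef struct Node *Tree;
typedef struct Node {
    int data;
    int nnodes;        // #nodes in the subtree rooted here
    Tree left, right;
} Node;

// Subtree size in O(1), treating an empty tree as size 0.
static int size(Tree t) { return t == NULL ? 0 : t->nnodes; }

// Ordinary leaf insertion that also keeps the nnodes counter up to date.
Tree insertAtLeaf(Tree tree, int item) {
    if (tree == NULL) {
        tree = malloc(sizeof(Node));   // error handling omitted
        tree->data = item;
        tree->nnodes = 1;
        tree->left = tree->right = NULL;
    } else if (item < tree->data) {
        tree->left = insertAtLeaf(tree->left, item);
        tree->nnodes = 1 + size(tree->left) + size(tree->right);
    } else if (item > tree->data) {
        tree->right = insertAtLeaf(tree->right, item);
        tree->nnodes = 1 + size(tree->left) + size(tree->right);
    }
    return tree;
}

Note that any routine that restructures the tree (the rotations, and therefore partition and insert-at-root) would also have to recompute nnodes for the nodes whose subtrees change, otherwise the counts go stale.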