Author: Hayden Smith 2021
Why?
What?
Source: https://searchstorage.techtarget.com/definition/file-system
Binary search trees (BSTs) can be searched using binary search, and are easy to maintain and modify.
Let's establish a few key facts:
BSTs provide us the best of both worlds:
BSTs are ordered trees, which means:
Binary search trees are either:
Two key concepts with tree structures:
A tree is weight-balanced when, for every node in the tree, the left and right subtrees contain an equal number of nodes (in practice, the two counts may differ by at most one).
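The weight-balance definition can be checked directly by counting nodes. The following is a minimal sketch, assuming the relaxed definition where the two subtree sizes may differ by at most one; the names `TreeSize` and `TreeIsWeightBalanced` are hypothetical and not part of the original slides.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Node *Tree;
struct Node {
    int data;
    Tree left, right;
};

// Number of nodes in the tree (0 for an empty tree).
int TreeSize(Tree t) {
    if (t == NULL) return 0;
    return 1 + TreeSize(t->left) + TreeSize(t->right);
}

// True if, at every node, the sizes of the left and right subtrees
// differ by at most one (assumed tolerance, see the note above).
bool TreeIsWeightBalanced(Tree t) {
    if (t == NULL) return true;
    int diff = TreeSize(t->left) - TreeSize(t->right);
    if (diff < -1 || diff > 1) return false;
    return TreeIsWeightBalanced(t->left) && TreeIsWeightBalanced(t->right);
}
```

This version recounts subtree sizes at every node, so it is simple rather than fast.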
However, there are many more operations for BSTs
This BST is initially empty, then we insert [3, 2, 4, 5, 1] in that order.
Insertion does not guarantee that the tree stays balanced.
So what kind of algorithm is this actually using?
TreeInsert(Tree, item):
    if Tree is empty:
        return new root node containing item
    else if item < Tree's node value:
        Tree's left child = TreeInsert(Tree's left child, item)
    else if item > Tree's node value:
        Tree's right child = TreeInsert(Tree's right child, item)
    return Tree
BST Insert
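The insert pseudocode translates almost line for line into C. This is a minimal sketch, assuming integer items and a hypothetical `newNode` helper that aborts if allocation fails; duplicates are ignored, as in the pseudocode.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Item;
typedef struct Node *Tree;
struct Node {
    Item data;
    Tree left, right;
};

// Allocate a leaf node holding `it` (hypothetical helper).
static Tree newNode(Item it) {
    Tree n = malloc(sizeof(*n));
    assert(n != NULL); // sketch only: abort on allocation failure
    n->data = it;
    n->left = n->right = NULL;
    return n;
}

// Insert `it` into the BST rooted at `t` and return the (possibly new) root.
Tree TreeInsert(Tree t, Item it) {
    if (t == NULL) return newNode(it);
    if (it < t->data) {
        t->left = TreeInsert(t->left, it);
    } else if (it > t->data) {
        t->right = TreeInsert(t->right, it);
    }
    // equal items fall through: the tree is left unchanged
    return t;
}
```

Because the function returns the root, callers typically write `t = TreeInsert(t, item);` so that inserting into an empty tree also works.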
This BST is initially empty, then we insert [4, 2, 6, 5, 1, 7, 3] in that order.
This BST is initially empty, then we insert [5, 6, 2, 3, 4, 7, 1] in that order.
This BST is initially empty, then we insert [1, 2, 3, 4] in that order.
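BST Insertion is typically O(h), where h is the height of the BST. In general, the time complexity is simply the time it takes to traverse down to the place where the node needs to be inserted.
For a balanced tree, O(h) = O(log2(n)). For a degenerate tree, such as the one built by inserting [1, 2, 3, 4] in order, h grows linearly with n, so insertion is O(n).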
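To see how insertion order affects h, the heights of the three example trees can be computed. This is a rough sketch: `TreeFromArray` and `TreeHeight` are hypothetical helpers, and height is assumed to be counted in edges (an empty tree has height -1).

```c
#include <assert.h>
#include <stdlib.h>

typedef int Item;
typedef struct Node *Tree;
struct Node {
    Item data;
    Tree left, right;
};

// Standard recursive BST insert; duplicates are ignored.
Tree TreeInsert(Tree t, Item it) {
    if (t == NULL) {
        t = malloc(sizeof(*t));
        assert(t != NULL); // sketch only: abort on allocation failure
        t->data = it;
        t->left = t->right = NULL;
    } else if (it < t->data) {
        t->left = TreeInsert(t->left, it);
    } else if (it > t->data) {
        t->right = TreeInsert(t->right, it);
    }
    return t;
}

// Build a BST by inserting items[0..n-1] in order (hypothetical helper).
Tree TreeFromArray(const Item items[], int n) {
    Tree t = NULL;
    for (int i = 0; i < n; i++) t = TreeInsert(t, items[i]);
    return t;
}

// Height counted in edges: -1 for an empty tree, 0 for a single node.
int TreeHeight(Tree t) {
    if (t == NULL) return -1;
    int lh = TreeHeight(t->left);
    int rh = TreeHeight(t->right);
    return 1 + (lh > rh ? lh : rh);
}
```

Under these assumptions, inserting [4, 2, 6, 5, 1, 7, 3] gives height 2, while [5, 6, 2, 3, 4, 7, 1] and [1, 2, 3, 4] both give height 3, even though the last tree holds only four items.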
Binary tree representations are very similar to linked list structures, with one exception: instead of a single next pointer, each node has two pointers, one for each child (left and right).
Abstract vs concrete data
typedef struct Node *Tree;
typedef int Item;
BSTree.h
#include "BSTree.h"
typedef struct Node {
    int data;
    Tree left;
    Tree right;
} Node;
BSTree.c
typedef struct Node *Tree;
typedef int Item;
Tree TreeCreate(Item it);
void TreeDestroy(Tree t);
Tree TreeInsert(Tree t, Item it);
void TreePrint(Tree t);
BSTree.h
#include "BSTree.h"
typedef struct Node {
    int data;
    Tree left;
    Tree right;
} Node;
Tree TreeCreate(Item it) {
// TODO
}
void TreeDestroy(Tree t) {
// TODO
}
Tree TreeInsert(Tree t, Item it) {
// TODO
}
void TreePrint(Tree t) {
// TODO
}
BSTree.c
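One possible way to fill in the create, destroy and print TODOs is sketched below. It assumes `TreePrint` should output the items in ascending order on one line, which the original does not specify; `printInorder` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef int Item;
typedef struct Node *Tree;
typedef struct Node {
    Item data;
    Tree left;
    Tree right;
} Node;

// Create a single-node tree containing `it`.
Tree TreeCreate(Item it) {
    Tree t = malloc(sizeof(Node));
    assert(t != NULL); // sketch only: abort on allocation failure
    t->data = it;
    t->left = NULL;
    t->right = NULL;
    return t;
}

// Free every node; children are freed before their parent.
void TreeDestroy(Tree t) {
    if (t == NULL) return;
    TreeDestroy(t->left);
    TreeDestroy(t->right);
    free(t);
}

// Print items in ascending order, separated by spaces (hypothetical helper).
static void printInorder(Tree t) {
    if (t == NULL) return;
    printInorder(t->left);
    printf("%d ", t->data);
    printInorder(t->right);
}

// Print the whole tree on one line (assumed output format).
void TreePrint(Tree t) {
    printInorder(t);
    printf("\n");
}
```

Freeing the children first matters: once a node is freed, its left and right pointers can no longer be read safely.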
+ a makefile...
#include "BSTree.h"
int main(int argc, char* argv[]) {
Tree t = TreeCreate(1);
TreeInsert(t, 2);
TreePrint(t);
TreeInsert(t, 4);
TreePrint(t);
TreeInsert(t, 5);
TreePrint(t);
TreeInsert(t, 3);
TreePrint(t);
TreeDestroy(t);
return 0;
}
main.c
There are 4 different ways to traverse a tree:
Inorder: 2 5 10 12 14 17 20 24 29 30 31 32
Postorder: 2 5 12 17 14 10 29 24 31 32 30 20
preorder
BSTTraverse(tree):
    if tree is empty, return
    print tree's data
    BSTTraverse(tree's left child)
    BSTTraverse(tree's right child)
inorder
BSTTraverse(tree):
    if tree is empty, return
    BSTTraverse(tree's left child)
    print tree's data
    BSTTraverse(tree's right child)
postorder
BSTTraverse(tree):
    if tree is empty, return
    BSTTraverse(tree's left child)
    BSTTraverse(tree's right child)
    print tree's data
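The three recursive traversals can be sketched in C as follows. To make the visiting order easy to inspect, this version appends each value to a caller-supplied array instead of printing it; that choice, and the function names, are assumptions for illustration.

```c
#include <stddef.h>

typedef int Item;
typedef struct Node *Tree;
struct Node {
    Item data;
    Tree left, right;
};

// Each function appends visited values to out[] and advances *n.
// The caller must make out[] large enough for every node.

// Preorder: node, then left subtree, then right subtree.
void TreePreorder(Tree t, Item out[], int *n) {
    if (t == NULL) return;
    out[(*n)++] = t->data;
    TreePreorder(t->left, out, n);
    TreePreorder(t->right, out, n);
}

// Inorder: left subtree, then node, then right subtree.
void TreeInorder(Tree t, Item out[], int *n) {
    if (t == NULL) return;
    TreeInorder(t->left, out, n);
    out[(*n)++] = t->data;
    TreeInorder(t->right, out, n);
}

// Postorder: left subtree, then right subtree, then node.
void TreePostorder(Tree t, Item out[], int *n) {
    if (t == NULL) return;
    TreePostorder(t->left, out, n);
    TreePostorder(t->right, out, n);
    out[(*n)++] = t->data;
}
```

For a BST, the inorder traversal yields the items in ascending order, which is why the Inorder line on the slide is sorted.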
BST traversals are fascinating because all 3 algorithms contain the same steps, just in a different order.
BST Traversal for search is:
BST Traversal for printing is:
How do we join two trees?
t = TreeJoin(t1, t2)
Take two BSTs, join them, and return a single BST that contains all items correctly ordered. This assumes every item in t1 is smaller than every item in t2.
Join does not guarantee that the resulting tree is balanced.
Method:
Pseudocode
TreeJoin(tree1, tree2):
    if tree1 is empty, return tree2
    if tree2 is empty, return tree1
    current = tree2
    parent = NULL
    while current's left child is not empty:
        parent = current
        current = current's left child
    if parent is not NULL:
        parent's left child = current's right child
        current's right child = tree2
    current's left child = tree1
    return current (new root)
BST Join is typically O(m), where m is the height of tree2 (the right-hand tree).
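The join pseudocode can be sketched in C as below. It assumes every item in `t1` is smaller than every item in `t2`; the function does not check this.

```c
#include <stddef.h>

typedef int Item;
typedef struct Node *Tree;
struct Node {
    Item data;
    Tree left, right;
};

// Join two BSTs, assuming max(t1) < min(t2) (not verified here).
// The minimum node of t2 becomes the root of the result.
Tree TreeJoin(Tree t1, Tree t2) {
    if (t1 == NULL) return t2;
    if (t2 == NULL) return t1;

    // Walk down the left spine of t2 to find its minimum node.
    Tree curr = t2;
    Tree parent = NULL;
    while (curr->left != NULL) {
        parent = curr;
        curr = curr->left;
    }

    // If the minimum is not t2's root, detach it and hang the
    // rest of t2 off its right side.
    if (parent != NULL) {
        parent->left = curr->right;
        curr->right = t2;
    }
    curr->left = t1;
    return curr;
}
```

When the minimum of `t2` is already its root, `parent` stays NULL and only the left link is set; re-linking the right child in that case would make the node point to itself.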
Deleting from a binary tree is not as conceptually easy as some other tasks. There are 4 key cases to consider:
| Case | Case for a "node" to delete | Action |
|---|---|---|
| 1 | Empty tree | New tree is also empty |
| 2 | Zero subtrees | Unlink node from parent |
| 3 | One subtree | Replace by child |
| 4 | Two subtrees | Replace by successor, join two subtrees |
Deletion does not guarantee that the tree stays balanced.
Case 1 (empty tree): this is easy, just return NULL.
Case 2 (zero subtrees): this is also easy, just unlink the node from its parent and free the node.
Case 3 (one subtree): a tiny bit harder, replace the node with its child, then free the original node.
Case 4 (two subtrees): simply join the two subtrees that are left after you delete the node.
One way to join them: the node's right child becomes the new root, and the node's left subtree is attached as the left child of the minimum element of the right subtree.
Pseudocode
TreeDelete(t, item):
    if t is not empty:
        if item < data(t):
            left(t) = TreeDelete(left(t), item)
        else if item > data(t):
            right(t) = TreeDelete(right(t), item)
        else:
            if left(t) and right(t) are empty:
                new = empty tree                     // 0 children
            else if left(t) is empty:
                new = right(t)                       // 1 child
            else if right(t) is empty:
                new = left(t)                        // 1 child
            else:
                new = TreeJoin(left(t), right(t))    // 2 children
            free memory allocated for t
            t = new
    return t
BST Deletion is typically O(h), where h is the height of the BST. In general, the time complexity is simply the time it takes to traverse down to the node that needs to be deleted.
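A C sketch of the deletion pseudocode is shown below, assuming nodes were allocated with `malloc`. It reuses the join approach where the minimum of the right subtree becomes the new root; `TreeInsert` is included only so the example is self-contained.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Item;
typedef struct Node *Tree;
struct Node {
    Item data;
    Tree left, right;
};

// Standard recursive BST insert; duplicates are ignored.
Tree TreeInsert(Tree t, Item it) {
    if (t == NULL) {
        t = malloc(sizeof(*t));
        assert(t != NULL); // sketch only: abort on allocation failure
        t->data = it;
        t->left = t->right = NULL;
    } else if (it < t->data) {
        t->left = TreeInsert(t->left, it);
    } else if (it > t->data) {
        t->right = TreeInsert(t->right, it);
    }
    return t;
}

// Join two BSTs, assuming max(t1) < min(t2).
Tree TreeJoin(Tree t1, Tree t2) {
    if (t1 == NULL) return t2;
    if (t2 == NULL) return t1;
    Tree curr = t2, parent = NULL;
    while (curr->left != NULL) {
        parent = curr;
        curr = curr->left;
    }
    if (parent != NULL) {
        parent->left = curr->right;
        curr->right = t2;
    }
    curr->left = t1;
    return curr;
}

// Delete `it` from the BST rooted at `t` and return the new root.
Tree TreeDelete(Tree t, Item it) {
    if (t == NULL) return NULL;                  // case 1: empty tree
    if (it < t->data) {
        t->left = TreeDelete(t->left, it);
    } else if (it > t->data) {
        t->right = TreeDelete(t->right, it);
    } else {
        Tree replacement;
        if (t->left == NULL && t->right == NULL) {
            replacement = NULL;                  // case 2: zero subtrees
        } else if (t->left == NULL) {
            replacement = t->right;              // case 3: one subtree
        } else if (t->right == NULL) {
            replacement = t->left;               // case 3: one subtree
        } else {
            replacement = TreeJoin(t->left, t->right); // case 4
        }
        free(t);
        t = replacement;
    }
    return t;
}
```

As with insertion, callers would write `t = TreeDelete(t, item);` because the root itself may be the node that is removed.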
We can make use of C macros to abstract repeated code out and make our code easier to read.
// a Node contains its data, plus left and right subtrees
typedef struct Node {
int data;
Tree left, right;
} Node;
// some macros that we will use frequently
#define data(node) ((node)->data)
#define left(node) ((node)->left)
#define right(node) ((node)->right)
BSTree.c
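As an illustration of the macros in use, here is a small sketch of an iterative search. `TreeSearch` is a hypothetical function name, and the type definitions are repeated only so the block compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Node *Tree;
typedef struct Node {
    int data;
    Tree left, right;
} Node;

// accessor macros, as defined above
#define data(node) ((node)->data)
#define left(node) ((node)->left)
#define right(node) ((node)->right)

// Return true if `item` is stored somewhere in the BST rooted at `t`.
bool TreeSearch(Tree t, int item) {
    while (t != NULL) {
        if (item < data(t)) {
            t = left(t);      // item can only be in the left subtree
        } else if (item > data(t)) {
            t = right(t);     // item can only be in the right subtree
        } else {
            return true;      // found it
        }
    }
    return false;             // fell off the tree: not present
}
```

The macros read much like the `data(t)`, `left(t)` and `right(t)` notation used in the deletion pseudocode, which makes the C easier to compare against it.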