Tries
David Anderson
Recurse Center W1 '16
Tries
- Name comes from "retrieval"
- Symbol Table (Similar API to Binary Search Trees (BST) and Hash Tables)
- Sometimes called prefix trees
- Easily search for values stored for alphanumeric keys - often without examining the entire key!
R-way Tries
Simplest implementation.
Each node's key is a single character of the alphabet.
Each node has R children, where R is the number of possible values (null nodes not pictured).
Can be costly in space (think Unicode - a 65,536-way trie!)
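A minimal sketch of an R-way trie, assuming an alphabet of R = 256 (extended ASCII) and that `None` is never stored as a value. The names `Node`, `RWayTrie`, and `R` are illustrative, not from the slides.

```python
R = 256  # assumed alphabet size (extended ASCII); keys must stay in this range


class Node:
    def __init__(self):
        self.val = None          # value, set only if a key ends at this node
        self.next = [None] * R   # one link per possible character (mostly empty)


class RWayTrie:
    def __init__(self):
        self.root = Node()

    def put(self, key, val):
        node = self.root
        for ch in key:
            i = ord(ch)
            if node.next[i] is None:
                node.next[i] = Node()   # create the path on demand
            node = node.next[i]
        node.val = val

    def get(self, key):
        node = self.root
        for ch in key:
            node = node.next[ord(ch)]
            if node is None:
                return None   # search miss: stopped before reading the whole key
        return node.val       # None here means the path exists but no key ends on it
```

Every node carries R links even when almost all are empty, which is where the space cost comes from.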
Symbol Table API
Operation | Returns | Description |
---|---|---|
put(key, val) | N/A | add a new value for given key |
get(key) | value | retrieve value paired with key |
delete(key) | N/A | delete key and corresponding value |
keys() | iterable of keys | all keys |
keysWithPrefix(s) | iterable of keys | keys that start with s |
keysThatMatch(s) | iterable of keys | keys that match s (wildcards possible) |
longestPrefixOf(s) | key | longest key that is a prefix of s |
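The two prefix operations in the table could be sketched as below. This is an illustration under assumptions: a dict per node stands in for the R-slot link array, `None` is never stored as a value, and the snake_case names `keys_with_prefix` and `longest_prefix_of` are Python-style stand-ins for the API names above.

```python
class Node:
    def __init__(self):
        self.val = None
        self.next = {}   # char -> Node; a dict stands in for the R-slot array


class Trie:
    def __init__(self):
        self.root = Node()

    def put(self, key, val):
        node = self.root
        for ch in key:
            node = node.next.setdefault(ch, Node())
        node.val = val

    def keys_with_prefix(self, prefix):
        # Walk down to the node for the prefix, then collect every key below it.
        node = self.root
        for ch in prefix:
            node = node.next.get(ch)
            if node is None:
                return []
        found = []
        self._collect(node, prefix, found)
        return found

    def _collect(self, node, path, found):
        if node.val is not None:
            found.append(path)
        for ch in sorted(node.next):   # sorted so results come out in key order
            self._collect(node.next[ch], path + ch, found)

    def longest_prefix_of(self, s):
        # Follow s through the trie, remembering the last node that ended a key.
        node = self.root
        longest = "" if node.val is not None else None
        for i, ch in enumerate(s):
            node = node.next.get(ch)
            if node is None:
                break
            if node.val is not None:
                longest = s[:i + 1]
        return longest
```

With the keys "she", "sells", "sea", "shells", and "shore" stored, `keys_with_prefix("sh")` walks two links and then collects only the subtree below them, and `longest_prefix_of("shellsort")` stops as soon as the path runs out.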
Time Complexity
implementation | search hit | search miss | insert | space (references) |
---|---|---|---|---|
red-black BST | L + c lg^2 N | c lg^2 N | c lg^2 N | 4 N |
hashing (linear probing) | L | L | L | 4 N to 16 N |
R-way trie | L | log_R N | L | (R + 1) N |
TST | L + ln N | ln N | L + ln N | 4 N |
TST w/ R^2 Root | L + ln N | ln N | L + ln N | 4 N + R^2 |
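To make the space column concrete, the formulas can be evaluated for a sample workload. The function name `space_in_references` and the figures of one million keys with R = 256 are assumptions chosen for illustration only.

```python
def space_in_references(n_keys, radix):
    """Evaluate the space column of the table above (counts of references)."""
    return {
        "red-black BST": 4 * n_keys,
        "R-way trie": (radix + 1) * n_keys,
        "TST": 4 * n_keys,
        "TST w/ R^2 root": 4 * n_keys + radix ** 2,
    }


# Assumed workload: one million keys over extended ASCII.
for name, refs in space_in_references(1_000_000, 256).items():
    print(f"{name:>16}: {refs:,}")
```

Under these assumptions the R-way trie needs 257,000,000 references against 4,000,000 for a TST, and the R^2 root adds only 65,536 more.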
Uses
- Word prediction (keyword completion, T9 texting)
- Prefix matching, longest prefix (e.g. computational biology databases (BLAST, FASTA), network search, IP routing, XML search)
Variants - Ternary Search Tries (TST)
- More efficient memory usage
- Each node has 3 children: links for characters less than, equal to, and greater than the node's character.
- Each node's key is a single character.
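A minimal sketch of a ternary search trie with `put` and `get`, assuming non-empty string keys and that `None` is never stored as a value. The class names `TSTNode` and `TST` are illustrative.

```python
class TSTNode:
    def __init__(self, ch):
        self.ch = ch       # the single character this node holds
        self.val = None    # value, set only if a key ends at this node
        self.left = self.mid = self.right = None


class TST:
    def __init__(self):
        self.root = None

    def put(self, key, val):
        if not key:
            raise ValueError("key must be non-empty")
        self.root = self._put(self.root, key, 0, val)

    def _put(self, node, key, d, val):
        ch = key[d]
        if node is None:
            node = TSTNode(ch)
        if ch < node.ch:
            node.left = self._put(node.left, key, d, val)      # smaller character
        elif ch > node.ch:
            node.right = self._put(node.right, key, d, val)    # larger character
        elif d < len(key) - 1:
            node.mid = self._put(node.mid, key, d + 1, val)    # match: next character
        else:
            node.val = val                                     # match on last character
        return node

    def get(self, key):
        if not key:
            return None
        node, d = self.root, 0
        while node is not None:
            ch = key[d]
            if ch < node.ch:
                node = node.left
            elif ch > node.ch:
                node = node.right
            elif d < len(key) - 1:
                node, d = node.mid, d + 1
            else:
                return node.val
        return None   # search miss
```

Only the middle link advances to the next character of the key; left and right links stay on the same character, which is why each node needs 3 links rather than R.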
TST w/ R^2 Branching At Root
- Root node has R^2 children (one for every combination of the first 2 letters of a key)
- Improve memory usage through de-duplication
Advanced Variants
- PATRICIA trie aka crit-bit tree or radix tree
- Practical Algorithm to Retrieve Information Coded in Alphanumeric (phew)
- Remove one-way branching, each node represents sequence of characters
- Suffix Tree
- PATRICIA trie of all suffixes of a string (rather than prefixes).
- Locate substrings quickly, matches for regular expressions, linear time longest common substring.
- Tradeoff on storage.
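The substring lookup idea can be illustrated with a naive, uncompressed suffix trie. This is a deliberate simplification: a real suffix tree removes one-way branching (PATRICIA-style) and can be built in linear time, whereas this sketch takes quadratic time and space. The function names `build_suffix_trie` and `contains` are hypothetical.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a trie of nested dicts (O(n^2) space)."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """A pattern is a substring of the text iff it is a prefix of some suffix."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("banana")
print(contains(trie, "nan"))   # True
print(contains(trie, "nab"))   # False
```

Each lookup costs time proportional to the pattern length, independent of the text length; the price is the extra storage for all the suffixes.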