Trie Data Structure and Algorithms
Introductionβ
A Trie, also known as a prefix tree, is a type of search tree that is used to store a dynamic set of strings. The name "Trie" comes from the word "retrieval" as they are used to retrieve a key in a dataset of strings. Tries are particularly useful for tasks involving a large set of strings, such as dictionary words or contact lists.
Basic Structureβ
A Trie is a tree-like data structure where each node represents a single character of a string. The root node is associated with an empty string. Each subsequent node represents the next character of the inserted string. Nodes that share a common prefix share the same path from the root.
Exampleβ
root
/ | \
t b c
/| \ \
h o a a
/ \ \ \
e r t r
/ \ \
r k t
In the example above, the Trie stores the words "the", "their", "there", "bat", "cat", and "cart".
Operationsβ
Insertionβ
To insert a word into a Trie, we start at the root node and follow the path corresponding to the characters in the word. If a node for a character does not exist, we create a new node. After inserting all characters, we mark the last node as the end of a word.
Searchβ
To search for a word in a Trie, we follow the path corresponding to the characters in the word starting from the root. If we can follow the path for all characters and reach a node marked as the end of a word, the word exists in the Trie.
Deletionβ
To delete a word from a Trie, we first ensure the word exists. If it does, we remove the nodes corresponding to the word's characters, starting from the last character and moving towards the root. If removing a node would break the path for another word, we stop the deletion process for that node.
Prefix Searchβ
A Trie can efficiently find all words with a given prefix. By following the path corresponding to the prefix characters, we can reach a node where all the sub-nodes form words with the given prefix.
Applicationsβ
- Autocomplete: Tries are used to provide suggestions in search engines and text editors.
- Spell Checker: Tries store a large dictionary of words for quick lookup and suggestions.
- IP Routing: In networking, Tries can be used to store routing tables.
- Genome Analysis: Tries can store and search DNA sequences efficiently.
Advantages and Disadvantagesβ
Advantagesβ
- Fast Lookups: O(n) complexity for search, insertion, and deletion where n is the length of the word.
- Prefix Search: Efficiently find all words with a given prefix.
- Memory Efficiency: Space optimization for storing keys with common prefixes.
Disadvantagesβ
- Space Complexity: Requires more memory compared to some other data structures, especially for a sparse dataset.
- Implementation Complexity: More complex to implement than basic data structures like arrays or linked lists.
Complexity Analysisβ
- Insertion: O(n), where n is the length of the word.
- Search: O(n), where n is the length of the word.
- Deletion: O(n), where n is the length of the word.
- Space Complexity: O(ALPHABET*SIZE * n _ m), where n is the number of keys and m is the average key length.
Example Implementation in Pythonβ
class TrieNode:
def __init__(self):
self.children = {}
self.is_end_of_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_end_of_word = True
def search(self, word):
node = self.root
for char in word:
if char not in node.children:
return False
node = node.children[char]
return node.is_end_of_word
def delete(self, word):
def _delete(node, word, depth):
if not node:
return False
if depth == len(word):
if node.is_end_of_word:
node.is_end_of_word = False
return len(node.children) == 0
char = word[depth]
if char in node.children:
should_delete = _delete(node.children[char], word, depth + 1)
if should_delete:
del node.children[char]
return len(node.children) == 0
return False
_delete(self.root, word, 0)
def starts_with(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return False
node = node.children[char]
return True
Referencesβ
- GeeksforGeeks - Trie
- Wikipedia - Trie
- TutorialsPoint - Trie