Trie

Distribute Education like Computer Geek

“Trie,” pronounced “tri” or “tree,” is a data structure used to represent a group of strings. It supports fast pattern matching.

It can be visualized like a tree consist of edges and nodes and uses the prefixes of a string.

Prefix of a string “Tarun” is

‘T’

‘Ta’

‘Tar’

‘Taru’

‘Tarun’

Edward Fredkin was the first man who coined the term “trie” in the 1960s. Another name for trie is “digital tree,” “prefix tree,” or “radix tree.”

A data structure is used for storing and retrieving data. The word “trie” derives from the word “retrieval“.

Some of you may believe that hashing is the best data retrieval technique, but trie is a much faster algorithm than hashing.

Properties of trie

It has the shape of multi-way tree.
The root is always null.
Each internal node stores one letter of the string.
Each internal node is alphabetically ordered.
Each leaf node stores a special symbol ‘$’ that is end of the string.

Types of tries

Standard trie
Compressed trie
Suffix trie

Standard Trie

Standard trie is a set of strings in which

Each internal node has one letter of the string in alphabetical ordering.
The path from the root to the distinct leaf represents a string.

Input – {BARK, BAT, NINE, NOT, NOTE}

If a string is the prefix of another string, then we can add the $ sign after completion of it. {NOT, NOTE}. It is termed as ‘Handling Keys’.

Compressed Trie

Compressed trie is trie that is compressed with the redundant nodes.

Input – {BARK, BAT, NINE, NOT, NOTE}

Suffix Trie

A suffix trie is just like a compressed trie, but it also contains all the unique suffixes of a string. It is a space-efficient data structure.

Input – {ZOOM}

Suffixes are – {ZOOM$, OOM$, OM$, M$, $}

Operations on Trie

Searching
Insertion
Deletion

Searching

If you want to search a string in a trie, then you have to start from the root node.
There will be some internal nodes linking to the root node. The first letter of the string is to be searched in those internal nodes.
If it is found then that node will be called reference node.
Then we will search for second letter of the string linked to the reference node.
While doing this, if we get the complete string, then we will call it a successful search.
If not then we will see the next internal node which is linked to the root node and start searching the string with first letter.
An unsuccessful search occurs when you traverse the full trie without finding the string.

Search – {NOTE$}

In searching,

Time complexity is O(n) where n is the string we search.
Space complexity is O(1).

Insertion

If you want to insert a string in a trie, then you have to start from the root node.
Search the string and if it is successful, then no need to insert duplicate string.
If search is unsuccessful, then make a new node link to the root node and insert the first letter of the string.
While creating a new node, check whether it is according to the dictionary or not.
If string has k letters, create k nodes and insert the string’s letters as you move down.
Finally, make a node for the special symbol $ that connects to the string’s last letter node.

Insert – {GOD$}

In insertion,

Time complexity is O(n) where n is the string we insert.
Space complexity is O(n) because we need ‘n’ more storage to insert.

Deletion

If the string that is to be deleted is not found in the trie, then you can exit.
If you found the string, delete it.
One thing is to note be noted that if the string overlaps another string, then don’t delete the overlapping part.

Delete – {NOT$}

In deletion,

Time complexity is O(n) where n is the string we delete.
Space complexity is O(1).

Application

Longest prefix match or String matching can be identified between many strings.
Trie is a quick search engine that also serves as a spell checker for misspelt words.
The Auto Complete tool suggests whole words when you input something in a chat, message, or search engine. This autocomplete feature is used by Google and WhatsApp. When we search “Cat” in Google, the autocomplete phrases “caterpillar,” “category,” and “catalyst” appear.
Lexicographic sorting can be achieved by trie.

Advantages

Fast insert and retrieve.
There are alphabetically sorted strings like dictionary that can be used in many purposes.
Faster than hashing.

Disadvantages

More memory is used.

Test Yourself

Q1 – Explain what a trie data structure is and how it represents a group of strings.

Answer

A trie is a tree-based data structure used to represent a group of strings. It supports fast pattern matching and is visualized as a tree consisting of edges and nodes. Each internal node stores one letter of the string, and the path from the root to the distinct leaf represents a string. It is also known as a digital tree, prefix tree, or radix tree.

Q2 – Describe the properties of a standard trie and how it handles keys.

Answer

In a standard trie, each internal node contains one letter of the string in alphabetical order. The path from the root to a leaf node represents a string. If a string is the prefix of another string, a special symbol ‘$‘ is added after it to handle keys.

Q3 – What is a compressed trie, and how does it differ from a standard trie?

Answer

A compressed trie is a trie that is compressed with redundant nodes. It optimizes space by removing nodes that only have one child. In contrast, a standard trie retains all nodes even if they have only one child.

Q4 – Explain what a suffix trie is and how it differs from a compressed trie.

Answer

A suffix trie is similar to a compressed trie but also contains all the unique suffixes of a string. It efficiently represents all possible substrings of a string, making it a space-efficient data structure.

Q5 – Describe the operations that can be performed on a trie data structure.

Answer

The operations on a trie include

searching – involves finding a string in the trie
insertion – adds a new string to the trie
deletion – removes a string from the trie.

Q6 – Explain the process of inserting a string into a trie and analyze its time and space complexities.

Answer

To insert a string into a trie, you start from the root and search for the string. If it is not found, you create new nodes for each letter of the string, connecting them to the root. Finally, you add a special symbol ‘$’ as a leaf node to indicate the end of the string. The time complexity of insertion is O(n), where n is the length of the string being inserted. The space complexity is O(n) because ‘n’ additional storage is required to insert the string.

Q7 – Describe the steps involved in deleting a string from a trie and analyze its time and space complexities.

Answer

To delete a string from a trie, you first search for the string. If found, you remove the nodes corresponding to the string’s letters and the ‘$‘ symbol. However, if the string overlaps with another string, you only delete the non-overlapping part.

The time complexity of deletion is O(n), where n is the length of the string being deleted.

The space complexity is O(1).

Q8 – Discuss the applications of trie data structure in real-world scenarios.

Answer

Tries are used for

longest prefix matching,
string matching,
autocomplete suggestions,
spell checking,
lexicographic sorting.

Q9 – What are the advantages and disadvantages of using a trie data structure?

Answer

Advantages of a trie include fast insertion and retrieval, alphabetically sorted strings, and faster performance compared to hashing.

The main disadvantage is that it uses more memory due to its tree-like structure.

Q10- The root of a trie is always:

Empty
Containing a special symbol ‘$’
Containing the first letter of the strings
Null

Answer

Ans – (4)

Explanation – The root of a trie is always null.

Q11 – Which type of trie contains all unique suffixes of a string?

Standard trie
Compressed trie
Suffix trie
Radix trie

Answer

Ans – (3)

Explanation – Suffix trie contains all unique suffixes of a string.

Q12 – What is the primary advantage of a trie over hashing for data retrieval?

Faster search operations
Simpler insertion operations
Lower space complexity
Efficient management of large databases

Answer

Ans – (1)

Explanation – Tries offer faster search operations compared to hashing.

Q13 – What is the space complexity of a compressed trie with ‘n’ strings?

O(n)
O(n log n)
O(log n)
O(1)

Answer

Ans – (1)

Explanation – The space complexity of a compressed trie with ‘n’ strings is O(n).

Trie

Properties of trie

Types of tries

Standard trie

Compressed trie

Suffix trie

Standard Trie

Compressed Trie

Suffix Trie

Operations on Trie

Searching

Insertion

Deletion

Searching

Search – {NOTE$}

Insertion

Deletion

Delete – {NOT$}

Application

Advantages

Disadvantages

Test Yourself

Q1 – Explain what a trie data structure is and how it represents a group of strings.

Q2 – Describe the properties of a standard trie and how it handles keys.

Q3 – What is a compressed trie, and how does it differ from a standard trie?

Q4 – Explain what a suffix trie is and how it differs from a compressed trie.

Q5 – Describe the operations that can be performed on a trie data structure.

Q6 – Explain the process of inserting a string into a trie and analyze its time and space complexities.

Q7 – Describe the steps involved in deleting a string from a trie and analyze its time and space complexities.

Q8 – Discuss the applications of trie data structure in real-world scenarios.

Q9 – What are the advantages and disadvantages of using a trie data structure?

Q10- The root of a trie is always:

Empty

Containing a special symbol ‘$’

Containing the first letter of the strings

Null

Q11 – Which type of trie contains all unique suffixes of a string?

Standard trie

Compressed trie

Suffix trie

Radix trie

Q12 – What is the primary advantage of a trie over hashing for data retrieval?

Faster search operations

Simpler insertion operations

Lower space complexity

Efficient management of large databases

Q13 – What is the space complexity of a compressed trie with ‘n’ strings?

O(n)

O(n log n)

O(log n)

O(1)

BOOKS