Distribute Education like Computer Geek

“Trie,” pronounced “tri” or “tree,” is a data structure used to represent a group of strings. It supports fast pattern matching.

It can be visualized like a tree consist of edges and nodes and uses the prefixes of a string.

Prefix of a string “Tarun” is

‘T’

‘Ta’

‘Tar’

‘Taru’

‘Tarun’

Edward Fredkin was the first man who coined the term “trie” in the 1960s. Another name for trie is “digital tree,” “prefix tree,” or “radix tree.”

A data structure is used for storing and retrieving data. The word “trie” derives from the word “retrieval“.

Some of you may believe that hashing is the best data retrieval technique, but trie is a much faster algorithm than hashing.

.

Properties of trie
  1. It has the shape of multi-way tree.
  2. The root is always null.
  3. Each internal node stores one letter of the string.
  4. Each internal node is alphabetically ordered.
  5. Each leaf node stores a special symbol ‘$’ that is end of the string.

 

Types of tries

  1. Standard trie
  2. Compressed trie
  3. Suffix trie
  1. Standard Trie

Standard trie is a set of strings in which

  • Each internal node has one letter of the string in alphabetical ordering.
  • The path from the root to the distinct leaf represents a string.

Input – {BARK, BAT, NINE, NOT, NOTE}

Standard Trie
Standard Trie

If a string is the prefix of another string, then we can add the $ sign after completion of it. {NOT, NOTE}. It is termed as ‘Handling Keys’.

.

  1. Compressed Trie

Compressed trie is trie that is compressed with the redundant nodes.

Input – {BARK, BAT, NINE, NOT, NOTE}

Compressed Trie
Compressed Trie

.

  1. Suffix Trie

A suffix trie is just like a compressed trie, but it also contains all the unique suffixes of a string. It is a space-efficient data structure.

Input – {ZOOM}

Suffixes are – {ZOOM$, OOM$, OM$, M$, $}

Suffix trie without sorting
Suffix trie without sorting

.

Suffix trie with sorting
Suffix trie with sorting

.

Operations on Trie

  1. Searching
  2. Insertion
  3. Deletion
 
  1. Searching
  • If you want to search a string in a trie, then you have to start from the root node.
  • There will be some internal nodes linking to the root node. The first letter of the string is to be searched in those internal nodes.
  • If it is found then that node will be called reference node.
  • Then we will search for second letter of the string linked to the reference node.
  • While doing this, if we get the complete string, then we will call it a successful search.
  • If not then we will see the next internal node which is linked to the root node and start searching the string with first letter.
  • An unsuccessful search occurs when you traverse the full trie without finding the string.
Search – {NOTE$}
Search {NOTE$} in Trie
Search {NOTE$} in Trie

In searching,

  • Time complexity is O(n) where n is the string we search.
  • Space complexity is O(1).
  1. Insertion
  • If you want to insert a string in a trie, then you have to start from the root node.
  • Search the string and if it is successful, then no need to insert duplicate string.
  • If search is unsuccessful, then make a new node link to the root node and insert the first letter of the string.
  • While creating a new node, check whether it is according to the dictionary or not.
  • If string has k letters, create k nodes and insert the string’s letters as you move down.
  • Finally, make a node for the special symbol $ that connects to the string’s last letter node.

Insert – {GOD$}

Insert {GOD$}
Insert {GOD$}

In insertion,

  • Time complexity is O(n) where n is the string we insert.
  • Space complexity is O(n) because we need ‘n’ more storage to insert.
  1. Deletion
  • If the string that is to be deleted is not found in the trie, then you can exit.
  • If you found the string, delete it.
  • One thing is to note be noted that if the string overlaps another string, then don’t delete the overlapping part.
Delete – {NOT$}
String {NOT$} is deleted
String {NOT$} is deleted in Trie

In deletion,

  • Time complexity is O(n) where n is the string we delete.
  • Space complexity is O(1).

Application

  • Longest prefix match or String matching can be identified between many strings.
  • Trie is a quick search engine that also serves as a spell checker for misspelt words.
  • The Auto Complete tool suggests whole words when you input something in a chat, message, or search engine. This autocomplete feature is used by Google and WhatsApp. When we search “Cat” in Google, the autocomplete phrases “caterpillar,” “category,” and “catalyst” appear.
  • Lexicographic sorting can be achieved by trie.

Advantages

  • Fast insert and retrieve.
  • There are alphabetically sorted strings like dictionary that can be used in many purposes.
  • Faster than hashing.

Disadvantages

  • More memory is used.

Test Yourself

Q1 – Explain what a trie data structure is and how it represents a group of strings.

A trie is a tree-based data structure used to represent a group of strings. It supports fast pattern matching and is visualized as a tree consisting of edges and nodes. Each internal node stores one letter of the string, and the path from the root to the distinct leaf represents a string. It is also known as a digital tree, prefix tree, or radix tree.

Q2 – Describe the properties of a standard trie and how it handles keys.

In a standard trie, each internal node contains one letter of the string in alphabetical order. The path from the root to a leaf node represents a string. If a string is the prefix of another string, a special symbol ‘$‘ is added after it to handle keys.

Q3 – What is a compressed trie, and how does it differ from a standard trie?

A compressed trie is a trie that is compressed with redundant nodes. It optimizes space by removing nodes that only have one child. In contrast, a standard trie retains all nodes even if they have only one child.

Q4 – Explain what a suffix trie is and how it differs from a compressed trie.

A suffix trie is similar to a compressed trie but also contains all the unique suffixes of a string. It efficiently represents all possible substrings of a string, making it a space-efficient data structure.

Q5 – Describe the operations that can be performed on a trie data structure.

The operations on a trie include

  1. searching – involves finding a string in the trie
  2. insertion – adds a new string to the trie
  3. deletion – removes a string from the trie.
Q6 – Explain the process of inserting a string into a trie and analyze its time and space complexities.

To insert a string into a trie, you start from the root and search for the string. If it is not found, you create new nodes for each letter of the string, connecting them to the root. Finally, you add a special symbol ‘$’ as a leaf node to indicate the end of the string. The time complexity of insertion is O(n), where n is the length of the string being inserted. The space complexity is O(n) because ‘n’ additional storage is required to insert the string.

Q7 – Describe the steps involved in deleting a string from a trie and analyze its time and space complexities.

To delete a string from a trie, you first search for the string. If found, you remove the nodes corresponding to the string’s letters and the ‘$‘ symbol. However, if the string overlaps with another string, you only delete the non-overlapping part. 

The time complexity of deletion is O(n), where n is the length of the string being deleted. 

The space complexity is O(1).

Q8 – Discuss the applications of trie data structure in real-world scenarios.

Tries are used for

  1. longest prefix matching,
  2. string matching,
  3. autocomplete suggestions,
  4. spell checking,
  5. lexicographic sorting.
Q9 – What are the advantages and disadvantages of using a trie data structure?

Advantages of a trie include fast insertion and retrieval, alphabetically sorted strings, and faster performance compared to hashing. 

The main disadvantage is that it uses more memory due to its tree-like structure.

Q10- The root of a trie is always:
  1. Empty
  2. Containing a special symbol ‘$’
  3. Containing the first letter of the strings
  4. Null

Ans – (4)

Explanation – The root of a trie is always null.

Q11 – Which type of trie contains all unique suffixes of a string?
  1. Standard trie
  2. Compressed trie
  3. Suffix trie
  4. Radix trie

Ans – (3)

Explanation – Suffix trie contains all unique suffixes of a string.

Q12 – What is the primary advantage of a trie over hashing for data retrieval?
  1. Faster search operations
  2. Simpler insertion operations
  3. Lower space complexity
  4. Efficient management of large databases

Ans – (1)

Explanation – Tries offer faster search operations compared to hashing.

Q13 – What is the space complexity of a compressed trie with ‘n’ strings?
  1. O(n)
  2. O(n log n)
  3. O(log n)
  4. O(1)

Ans – (1)

Explanation – The space complexity of a compressed trie with ‘n’ strings is O(n).

BOOKS

Algorithm T H CoremanAlgorithm by Udit AgarwalAlgorithm Ellis HorowitzData Structure & Algorithm Made Easy Narasimha