Skip List

Distribute Education like Computer Geek

An Efficient Data Structure for Searching and Sorting

In the world of algorithms and data structures, skip lists stand out as an efficient and versatile data structure.

Skip-lists provide a balanced compromise between a sorted list and a linked list offering fast search, insertion, and deletion operations.

William Worthington Pugh Jr. is an American computer scientist who invented the skip list in 1989.

In this post, we will explore the concept of skip lists, their structure, and how they can be utilized in the design and analysis of algorithms.

What is a Skip List?

A skip list is a probabilistic data structure that allows for efficient searching, insertion, and deletion operations in a sorted list.
It consists of multiple layers or levels, with the bottom level representing the original sorted list.
It skips many elements from the whole list, in one upper layer, which is why it is known as a skip list.
The topmost layer acts as an “express lane” because it contains fewer elements than the layers below it.
Each layer below has progressively more elements than the layer above it.

By utilizing these additional layers, skip lists can effectively “skip” over a large number of elements in a single step during searches, reducing the average time complexity.

The skip list achieves this by creating multiple levels of linked lists, where each element in a lower level has a pointer to the next occurrence of the same element in the level above. This structure allows for efficient traversal and search operations by taking advantage of the skip pointers.

Skip List Structure

The elements are arranged in the levels. At the bottom most level, there is a single linked list containing all the elements. As we move up the levels, some elements are “skipped” or bypassed, hence the name “skip list“. Each element on a particular level has pointers to the next element with a higher key value, effectively creating shortcuts to traverse the list more quickly.

Searching

Searching in a skip list is similar to searching in a binary search tree but with the added advantage of faster average-case time complexity.

Starting from the top level, we start from the ‘HEAD’ and compare the target key with the current element. If the target key is greater, we move to the next element on the same level. If it is smaller, we move down to the next level and continue the search. This process continues until we either find the target element or reach the bottom level.

To search for the element 50 in the skip list you provided, we’ll start at the topmost layer and traverse down through the levels until we find the element or determine that it’s not present.

Let’s go step by step:

In figure 1, Start at the topmost layer (LANE 1), which contains fewer elements:

Move from Head to 10. (10 < 50)
Move from 10 to 70 through express lane. (70 > 50)
At this point, we reach the bottom layer (LANE 2) through 10.

Move from 10 to 40. (40 < 50)
Move from 40 to 70. (70 > 50)
At this point, we reach the bottom layer (LANE 3) through 40.

Move from 40 to 60. (60 > 50)
At this point, we reach the bottom layer (LANE 4) through 40.

Move from 40 to 50. (50 = 50)
Our search is Successful.

In this case, the search process required traversing two elements: 10 and 40, before reaching the element 50. The skip list allowed us to skip elements that were not necessary to examine (say 20 and 30), making the search faster compared to a linear search in a regular linked list.

Algorithm for Searching in a skip list

1. Start at the top-left corner of the skip list (the head node).
2. Move in a horizontal manner and compare the key of the node with the target key:

If the current node’s key is equal to the target key, return the node or indicate that the key has been found.
If the key of the current node is less than the target key, move horizontally to the next node on the same lane.
If the key of the current node is greater than the target key, move to its previous node and vertically down to the next lane.

Repeat step 2 until one of the following conditions is met:

You have reached the bottom level and there are no more nodes to explore. In this case, the target key is not present in the skip list means unsuccessful search.
You have found a node with a key equal to the target key. In this case, the key has been found in the skip list, means successful search.

Before inserting and deleting a skip list, I would like to explain two concepts of a skip list.

Perfect Skip List

A perfect skip list is a type of skip list that guarantees a specific structure.
In a perfect skip list, each level has exactly half the number of nodes as the level below it.
This results in a balanced and predictable structure.
The topmost level contains just two nodes (One is starting node and second is last node), and each subsequent level doubles the number of nodes compared to the level below it.
Inserting and deleting operation could not be done in perfect skip list. If we delete 50 from the perfect skip list, the diagram is completely messed up as after deleting LANE 3 has 7 elements and LANE 2 has only 3 elements.

Advantage of a perfect skip list

It provides optimal search performance.
The time complexity for search operations in a perfect skip list is O(log n), just like in a balanced binary search tree. However, maintaining the perfect structure of a skip list during insertions and deletions can be more complex and may require additional operations to rebalance the levels.

Randomized Skip List

A randomized skip list is a type of skip list that introduces randomness into the structure.
The level of each node is determined randomly during the insertion process, following a certain probability distribution.
This randomization creates an unpredictable structure that may vary between different instances of the skip list.

Insertion in Randomized Skip List

The coin randomization process is used to determine the level of each node during the insertion process.

Coin Randomization in Randomized Skip List

The coin flip is based on a probability distribution that determines the node level.

A simple explanation of the coin randomization technique follows:

1. Begin with a new node that must be added to the skip list.

2. Using a coin flip, determine the node’s level:

2.1 New node is placed in the original lane.

2.2 Toss a coin.

3. If head comes, then place new node in upper layer.

4. If tail comes, then stop the promotion in upper layers.

4.1 You have to toss the coin until you reach the maximum lane, if head comes again and again.

Deletion in Randomized Skip List

To maintain the skip list’s structure and order after deletion, we update the pointers at each level accordingly.

Algorithm for deletion in a Skip List

Search the target key using algorithm of search.
If a node with the target key is found:
Remove the node from each level of the skip list by adjusting the pointers of the surrounding nodes:
Update the pointers of the previous node and the next node to bypass the node to be deleted.
Free the memory allocated to the node.

Repeat step 3 if there are duplicate nodes with the target key in the skip list.

Advantage of Randomized Skip list

This makes insertion operations faster and easier to implement. However, the randomness can result in slightly worse search performance compared to a perfect skip list, although the average-case time complexity is still O(log n).

Difference between Perfect and Randomized Skip list

Aspect	Perfect Skip List	Randomized Skip List
Structure	Balanced and predictable structure	Random and less predictable structure
Node Distribution	Each level has exactly half the number of nodes as the level below	Randomly assigned levels based on a probability
Insertion Complexity	More complex insertion process to maintain perfect structure	Simpler insertion process, no need for structure balance
Search Performance	Optimal search performance, O(logn) time complexity	Slightly worse search performance, still O(logn)
Predictability	Predictable structure with a specific number of nodes per level	Unpredictable structure, can vary between instances
Rebalancing Requirement	Requires additional operations to maintain the perfect structure, which takes more time	No need for rebalancing, levels assigned randomly

Time Complexity Analysis

The actual time complexity for inserting a node into the skip list depends on the number of levels it is promoted to. On average, the number of levels a node is promoted to is determined by the probability distribution used.

In perfect or randomized skip list, the average-case time complexity for search, insertion, and deletion operations in a skip list is O(log n), where n is the number of elements in the list, because after each layer the elements are halved in the upper layer.

The Worst-Case time complexity for searching, inserting and deletion is O(n). If we assume a simple 50/50 distribution, where each level has a 50% chance of being promoted due to the coin flipping, the average number of levels a node is promoted to is approximately No. of elements*(1/2).

In this case, the time complexity for a node would be O(n/2), which simplifies to O(n).

This is comparable to a balanced binary search tree.

Space Complexity of Skip List

The average space complexity of a skip list is O(n), where n is the number of elements in the skip list. This is because each element requires a node to store its key and value, and the number of nodes scales linearly with the number of elements.

Advantages of Skip Lists

(1) Efficient Search– Skip lists offer efficient search operations with an average time complexity of O(log n), similar to balanced binary search trees. This makes them suitable for applications that require fast searching.

(2) Simple Implementation– Skip lists have a relatively simple implementation compared to other data structures like balanced trees. They do not require complex balancing operations, making them easier to understand and implement.

(3) Dynamic Structure– Skip lists support dynamic operations such as insertion and deletion with good performance. They can be modified easily without significant structural changes.

(4) Space Efficiency– Skip lists utilize additional levels to create shortcuts, but they still maintain a good space efficiency. The average space complexity of skip lists is O(n), where n is the number of elements.

(5) Balance– Skip lists automatically balance themselves during insertion and deletion operations, ensuring good overall performance.

Disadvantages of Skip Lists

1. Skip lists use additional levels to create shortcuts, resulting in some space overhead compared to simpler data structures like linked lists. However, the overhead is usually reasonable and does not affect the overall performance significantly.

2. Randomized skip lists, in particular, introduce randomness into the structure. While this simplifies the insertion process, it also makes the structure less predictable and may result in slightly worse search performance compared to perfect skip lists.

3. Skip lists are commonly used for in-memory data structures, but they may not be the best choice for all scenarios. For certain specialized applications or when specific requirements such as cache efficiency or strict worst-case time bounds are necessary, other data structures may be more suitable.

4. Skip lists are one of many data structures available, each with its own strengths and weaknesses. The choice of using skip lists should be based on the specific requirements and trade-offs of the application at hand.

Applications of Skip Lists

1. Database indexing and query optimization.
2. Priority queues and scheduling algorithms.
3. File systems and caching mechanisms.
4. Network routing algorithms.
5. Skip graph data structures.

Test Yourself

Q1- How does the structure of a skip list contribute to its efficiency in search operations?

Answer

The structure of a skip list, with multiple levels and skip pointers, allows for efficient search operations. By utilizing additional layers and skip pointers, a skip list can “skip” over a large number of elements in a single step during searches, reducing the average time complexity. This structure creates shortcuts that enable faster traversal and search operations compared to linear searches in regular linked lists.

Q2- What is the difference between a perfect skip list and a randomized skip list?

Answer

A perfect skip list has a specific structure where each level has exactly half the number of nodes as the level below it, resulting in a balanced and predictable structure.

On the other hand, a randomized skip list introduces randomness into the structure, with node levels determined randomly during the insertion process. This randomization creates an unpredictable structure that may vary between different instances of the skip list.

Q3- How does the coin randomization process work in a randomized skip list?

Answer

The coin randomization process in a randomized skip list is used to determine the level of each node during the insertion process. A coin is flipped, and if it lands on heads, the node is placed in the upper layer. If it lands on tails, the promotion to upper layers stops. The coin is flipped repeatedly until the maximum level is reached if heads keep coming up.

Q4- What are the advantages of skip lists compared to other data structures?

Answer

Skip lists offer efficient search, insertion, and deletion operations with an average time complexity of O(log n), similar to balanced binary search trees. They have a relatively simple implementation and do not require complex balancing operations. Skip lists also support dynamic operations such as insertion and deletion without significant structural changes. They maintain a good space efficiency with an average space complexity of O(n), and they automatically balance themselves during insertions and deletions.

Q5- Can perfect skip lists support insertion and deletion operations? Why or why not?

Answer

Perfect skip lists cannot support insertion and deletion operations easily. If an element is inserted or deleted in a perfect skip list, it would disrupt the balanced structure of the skip list. The levels would no longer have exactly half the number of nodes as the level below, leading to an unbalanced and unpredictable structure. Therefore, maintaining the perfect structure during insertions and deletions in a perfect skip list is complex and may require additional operations to rebalance the levels.

Q6- How does the time complexity of skip lists compare to other data structures like linked lists or balanced binary search trees?

Answer

The average-case time complexity of search, insertion, and deletion operations in skip lists is O(log n), where n is the number of elements in the list. This is comparable to balanced binary search trees and provides efficient operations. However, the worst-case time complexity for search, insertion, and deletion in skip lists is O(n), similar to linked lists. It is important to note that the average-case time complexity is typically more relevant and makes skip lists suitable for applications requiring fast searching.

Q7- What are some applications of skip lists in real-world scenarios?

Answer

Skip lists have various applications, including database indexing and query optimization, priority queues and scheduling algorithms, file systems and caching mechanisms, network routing algorithms, and skip graph data structures. They are suitable for scenarios that require efficient searching and dynamic operations.

Q8- How does the structure of a skip list contribute to its space efficiency?

Answer

Skip lists maintain a good space efficiency with an average space complexity of O(n), where n is the number of elements. Although skip lists use additional levels to create shortcuts, the number of nodes scales linearly with the number of elements. This makes skip lists space-efficient compared to some other data structures like balanced trees that may require additional memory for balancing operations.

Q9- What are the trade-offs of using skip lists compared to other data structures?

Answer

While skip lists offer efficient operations and a simple implementation, they do have some trade-offs. They may have slightly worse search performance compared to perfect skip lists due to the randomization in randomized skip lists. Skip lists also have some space overhead compared to simpler data structures like linked lists. Additionally, skip lists may not be the best choice for specialized applications or scenarios that require specific requirements such as cache efficiency or strict worst-case time bounds.

Q10- Can skip lists be used for external storage, such as disk-based storage systems?

Answer

Skip lists are commonly used for in-memory data structures. However, their suitability for external storage systems, such as disk-based storage, may depend on factors such as disk access times and the ability to efficiently handle disk-based operations. Other data structures designed specifically for external storage, like B-trees, may be more suitable in such cases.

Skip List

An Efficient Data Structure for Searching and Sorting

What is a Skip List?

Skip List Structure

Searching

Algorithm for Searching in a skip list

Perfect Skip List

Advantage of a perfect skip list

Randomized Skip List

Insertion in Randomized Skip List

Coin Randomization in Randomized Skip List

Deletion in Randomized Skip List

Algorithm for deletion in a Skip List

Advantage of Randomized Skip list

Difference between Perfect and Randomized Skip list

Time Complexity Analysis

Space Complexity of Skip List

Advantages of Skip Lists

Disadvantages of Skip Lists

Applications of Skip Lists

Test Yourself

Q1- How does the structure of a skip list contribute to its efficiency in search operations?

Q2- What is the difference between a perfect skip list and a randomized skip list?

Q3- How does the coin randomization process work in a randomized skip list?

Q4- What are the advantages of skip lists compared to other data structures?

Q5- Can perfect skip lists support insertion and deletion operations? Why or why not?

Q6- How does the time complexity of skip lists compare to other data structures like linked lists or balanced binary search trees?

Q7- What are some applications of skip lists in real-world scenarios?

Q8- How does the structure of a skip list contribute to its space efficiency?

Q9- What are the trade-offs of using skip lists compared to other data structures?

Q10- Can skip lists be used for external storage, such as disk-based storage systems?

BOOKS