Merge Sort

Distribute Education like Computer Geek

Merge sort is generally called mergesort, and it is a very simple sort because there is no complexity in understanding how it works.

  1. The sort was invented by the American computer scientist John von Neumann in 1945.
  2. It is a comparison-based sort.
  3. It is an efficient and stable sort also.
  4. In the worst case, time complexity is nlogn, so it is the fastest worst case sorting algorithm. But the fastest average case sorting algorithm is quick sort.
  5. It uses divide-&-conquer approach. There are three steps in this approach.
(i) Divide

Suppose you have to find the solution to a problem. If the problem size is very small or the problem is in constant terms, then you can find the answer very easily in constant time. Example – T(1) = 1.

But if the problem is very large or the problem has variable terms, then you have to divide the problem into sub-problems.

Suppose the problem is an array that is unsorted, and you have to sort the unsorted array. Then you will divide.

Merge Sort Divide
Merge Sort Division
(ii) Conquer

Conquer means “to win” or “to succeed.” 

An unsorted problem is there, and we have to conquer it. We have to sort it. That is the meaning of this word, after that, we have to combine the individual solutions and make a sorted array of the unsorted problem.

(iii) Combine

The problem is divided into sub-problems to find the solution. After that, we have to combine the individual solutions to make a solution to the problem.

Example –

Sorting & Combination of array
Sorting & Combination of array

The conquer and combine steps run simultaneously, and the above is a top-to-bottom diagram. On the top, there are 8 indexes, which contain 8 values.

Index 1 and index 2 combine, index 3 and index 4 combine, index 5 and index 6 combine, and index 7 and index 8 combine.

In index 1, the value is 6, and in index 2, the value is 4. Let us say, we have to sort in the non-decreasing order, and 4 < 6. So, index 2 will come on the left hand side and index 1 will come on the right hand side in the second figure, and same goes with the other indexes.

Now, can you name the sort that sorts index 1 and index 2? Can you? This is not a mergesort, as we all very well know.

Sort

You will be surprised to know that this index sorts by the insertion sort. And the same way, in every index sorting, there is an insertion sort. But the whole diagram, the structure, and the idea that the array is divided in half until an individual value is obtained, plus the conquer and combine form, means that the array is doubled in every iteration from top to bottom, which is represented by merge sort.

Insertion sort average case time complexity is n2 if n is larger but in the smaller arrays it becomes very fast. So, no n2 shadow will lie on nlogn.

  1. if left < right

If the left item has an index less than the right item index, then we can perform a merge sort. If left == right happens, then we say that there is only 1 item in the array, and one item is always sorted.

 

  1. int mid <- (left + right)/2

Divide the array in half until we get individual items.

 

  1. MergeSort(Array[], left, mid)

Divide the left array until we get individual items.

 

  1. MergeSort(Array[], mid+1, right)

Divide the right array until we get individual items.

 

  1. Merge(Array[], left, mid, mid+1, right)

Merge all individual arrays into a combined array.

 

Program for merge sort in C

 

Time & Space Complexity of Merge Sort

In best case, worst case and average case, the time complexity is O(nlogn).

Sorting & Combination of array
Sorting & Combination of array

In the conquer and combine diagram, we saw that the iterations number performed is log n. When array is halved or the array is doubled, it means the number of iterations we perform is log n when number of items in array is n.

We perform a number of operations n and we are operating it log n time. So, O(nlogn) is the time complexity.

The merge sort space complexity is O(n) because an auxiliary array is there in the code that helps to sort the original array.

 

Advantage of Merge Sort
  1. This sort is the fastest worst case sort.
  2. Used in external sorting.
  3. In the input we have index 4 and index 6 values same, and obviously index 4 comes before index 6. The output of this sort has index 4 before the index 6, that’s why this sort is a stable sort.
Disadvantage of Merge Sort
  1. In the best case and average cases, it is not the fastest sort.
  2. The space complexity is O(n).

Test Yourself

Q1- Explain the working principle of Merge Sort.

Merge Sort is a divide-and-conquer sorting algorithm. It works by recursively dividing the input array into two halves, sorting each half separately, and then merging the sorted halves back together. This process continues until individual elements are reached, and then the merging process combines these elements to form a sorted array.

Q2- Compare Merge Sort and Quick Sort in terms of time complexity and stability.

Merge Sort and Quick Sort are both efficient sorting algorithms. Merge Sort has a time complexity of O(nlogn) in all cases (best, worst, and average). It is stable, meaning the relative order of equal elements is preserved in the sorted output. 

On the other hand, Quick Sort has an average-case time complexity of O(nlogn), but its worst-case time complexity can be O(n^2) for poorly chosen pivot elements. Quick Sort is not inherently stable.

Q3- Describe the process of the “Merge” step in Merge Sort.

The “Merge” step in Merge Sort is the process of combining two sorted arrays into a single sorted array. It involves comparing elements from both arrays and merging them in non-decreasing order. The two arrays should already be sorted before merging.

Q4- Why is Merge Sort preferred for sorting large datasets in external storage?

Merge Sort is preferred for sorting large datasets in external storage because it minimizes the number of disk I/O operations. External storage (like hard drives) has slower access times compared to internal memory (RAM). Merge Sort’s divide-and-conquer approach allows it to efficiently read and merge small chunks of data at a time, reducing the number of costly disk reads and writes.

Q5- Can Merge Sort be used for sorting data structures other than arrays? If yes, provide an example.

Yes, Merge Sort can be used to sort other data structures besides arrays. For example, it can be adapted to sort linked lists efficiently. In this case, instead of dividing the linked list physically, the algorithm would divide it logically by identifying the middle node and creating two separate linked lists. The merge step would then combine these sorted linked lists.

Q6- Discuss the advantages of Merge Sort over Bubble Sort and Selection Sort.

Merge Sort has two advantages over Bubble Sort and Selection Sort.

  1. Merge Sort has a time complexity of O(nlogn), making it much faster than the O(n^2) time complexity of Bubble Sort and Selection Sort in most cases.
  2. Merge Sort is a stable sort, meaning it preserves the relative order of equal elements, whereas Bubble Sort and Selection Sort are not inherently stable.
Q7- Explain why Merge Sort’s space complexity is considered to be O(n).

Merge Sort’s space complexity is O(n) because it requires an auxiliary array of the same size as the input array during the merging step. This auxiliary array is used to store the merged results temporarily. The space taken by the auxiliary array grows linearly with the size of the input array.

Q8- Is Merge Sort an in-place sorting algorithm? Justify your answer.

No, Merge Sort is not an in-place sorting algorithm. In an in-place sorting algorithm, the sorting is performed using a constant amount of additional memory, regardless of the size of the input. Merge Sort requires additional memory proportional to the size of the input array for the auxiliary array used during the merging step, making it non-in-place.

Q9- Can Merge Sort be easily adapted to sort elements in reverse order?

Yes, Merge Sort can be easily adapted to sort elements in reverse order. The only modification needed is to change the comparison condition during the merging step. Instead of merging elements in non-decreasing order, we would merge them in non-increasing order, effectively achieving a reverse sort.

Q10-Which algorithm is considered the fastest worst-case sorting algorithm?
  1. Quick Sort
  2. Merge Sort
  3. Bubble Sort
  4. Selection Sort

Ans – (2)

Explanation –  Merge Sort is the fastest worst-case sorting algorithm with a time complexity of O(nlogn).

Q11- Merge Sort was invented by:
  1. Alan Turing
  2. John von Neumann
  3. Charles Babbage
  4. Ada Lovelace

Ans – (2)

BOOKS

Algorithm T H CoremanAlgorithm by Udit AgarwalAlgorithm Ellis HorowitzData Structure & Algorithm Made Easy Narasimha