- Why Use Merge Sort?
- How Does Merge Sort Work? (Step-by-Step)
- Merge Sort – Step-by-Step Explanation
- Merge Sort Program in C
- Detailed Explanation of Code
- Key Functions
- 1. mergeSort(data, start, end)
- 2. merge(data, start, middle, end)
- Step-by-Step Execution (with Input: {38, 27, 43, 3, 9, 82, 10})
- Code Explanation
- Base Case of Recursion
- Tree Representation of Calls
- Time and Space Complexity Analysis
- Real-Life Examples of Merge Sort
- Advantages of Merge Sort
- Disadvantages of Merge Sort
- ❓FAQs About Merge Sort
- Q1. Why is Merge Sort better than Quick Sort sometimes?
- Q2. Why is Merge Sort good for linked lists?
- Q3. Is Merge Sort stable?
- Q4. Can Merge Sort be implemented in-place?
- Q5. What is the base case in merge sort recursion?
- Q6. Is Merge Sort used in real programming libraries?
- 📌 Summary Table
Merge Sort is a Divide and Conquer algorithm that divides the array into halves, recursively sorts each half, and then merges the sorted halves.
📖 It was invented by John von Neumann in 1945 and is considered one of the most efficient sorting algorithms for large datasets.
Why Use Merge Sort?
- Stable Sorting: Keeps original order of equal elements. It means If two elements in the array have the same value, Merge Sort keeps their original order in the sorted output.
- Predictable Time Complexity: Always O(n log n), No matter what the input looks like — sorted, reverse-sorted, random — merge sort always takes
O(n log n)
time. - Good for Linked Lists: As it doesn’t require random access, Merge Sort doesn’t require random access (like
arr[i]
), so it works well with data structures like Linked Lists, where you access elements by traversing nodes. - Best for External Sorting: When dealing with large files on disk, Sorting data that doesn’t fit in memory (e.g., huge log files, DB exports) and resides on disk or external storage.
How Does Merge Sort Work? (Step-by-Step)
Merge Sort follows three simple steps:
- Divide: Split the array into two halves
- Conquer: Sort each half using Merge Sort recursively
- Combine: Merge both sorted halves into one sorted array
Merge Sort – Step-by-Step Explanation
Let’s say we want to sort the following array:
int[] arr = {38, 27, 43, 3, 9, 82, 10};
1. Divide: Split the Array into Halves
This is where recursion begins.
- First, we divide the array into two halves (Left and Right)
- Keep dividing until each sub-array has only one element (which is by default sorted)
Steps of division:
[38, 27, 43, 3, 9, 82, 10]
➡ Split into
[38, 27, 43] and [3, 9, 82, 10]
➡ Further split:
[38] [27, 43] and [3, 9] [82, 10]
➡ Further split:
[27] [43] and [3] [9] [82] [10]
You keep dividing recursively until each part has a single element.
2. Conquer: Sort Each Half Using Merge Sort
- Now, we start solving the subproblems (i.e., sorting the smaller arrays)
- Since a single-element array is sorted, we begin merging and sorting these one-element arrays
Merging sorted parts:
Merge [27] and [43] ➡ [27, 43]
Merge [38] and [27, 43] ➡ [27, 38, 43]
Merge [3] and [9] ➡ [3, 9]
Merge [82] and [10] ➡ [10, 82]
Merge [3, 9] and [10, 82] ➡ [3, 9, 10, 82]
We recursively call mergeSort
() on both halves, and then sort them when merging.
3. Combine: Merge Both Sorted Halves
Finally, merge the two large sorted halves:
Left Sorted: [27, 38, 43]
Right Sorted: [3, 9, 10, 82]
Now merge them:
➡ Compare each element and build a sorted array:
[3, 9, 10, 27, 38, 43, 82] ✅
The merge()
function handles this part: comparing elements of both subarrays and combining them in sorted order.
Full Call Stack in Tree Form (Recursive Tree)
mergeSort([38, 27, 43, 3, 9, 82, 10])
|
|-- mergeSort([38, 27, 43])
| |-- mergeSort([38])
| |-- mergeSort([27, 43])
| |-- mergeSort([27])
| |-- mergeSort([43])
| |--> merge([27], [43]) → [27, 43]
| |--> merge([38], [27, 43]) → [27, 38, 43]
|
|-- mergeSort([3, 9, 82, 10])
|-- mergeSort([3, 9])
| |-- mergeSort([3])
| |-- mergeSort([9])
| |--> merge([3], [9]) → [3, 9]
|
|-- mergeSort([82, 10])
|-- mergeSort([82])
|-- mergeSort([10])
|--> merge([82], [10]) → [10, 82]
|--> merge([3, 9], [10, 82]) → [3, 9, 10, 82]
--> Final merge:
merge([27, 38, 43], [3, 9, 10, 82]) → [3, 9, 10, 27, 38, 43, 82]
Merge Sort Program in C
#include <stdio.h>
void merge(int data[], int start, int middle, int end) {
int leftSize = middle - start + 1;
int rightSize = end - middle;
int leftPart[leftSize], rightPart[rightSize];
for (int i = 0; i < leftSize; i++)
leftPart[i] = data[start + i];
for (int j = 0; j < rightSize; j++)
rightPart[j] = data[middle + 1 + j];
int i = 0, j = 0, k = start;
while (i < leftSize && j < rightSize) {
if (leftPart[i] <= rightPart[j])
data[k++] = leftPart[i++];
else
data[k++] = rightPart[j++];
}
while (i < leftSize)
data[k++] = leftPart[i++];
while (j < rightSize)
data[k++] = rightPart[j++];
}
void mergeSort(int data[], int start, int end) {
if (start < end) {
int middle = (start + end) / 2;
mergeSort(data, start, middle);
mergeSort(data, middle + 1, end);
merge(data, start, middle, end);
}
}
int main() {
int numbers[] = {38, 27, 43, 3, 9, 82, 10};
int size = sizeof(numbers) / sizeof(numbers[0]);
mergeSort(numbers, 0, size - 1);
printf("Sorted array: ");
for (int i = 0; i < size; i++)
printf("%d ", numbers[i]);
return 0;
}
Output
For the input:
int numbers[] = {38, 27, 43, 3, 9, 82, 10};
Output will be:
Sorted array: 3 9 10 27 38 43 82
Detailed Explanation of Code
Key Functions
1. mergeSort(data, start, end)
Purpose: Recursively divides the array into halves and calls merge()
to combine them.
2. merge(data, start, middle, end)
Purpose: Merges two sorted subarrays [start,middle]
and [middle+1,end]
into one sorted subarray.
Step-by-Step Execution (with Input: {38, 27, 43, 3, 9, 82, 10}
)
Let’s simulate how the array is divided:
Initial call: mergeSort(arr, 0, 6)
Divide into:
Left: mergeSort(arr, 0, 3)
-> mergeSort(arr, 0, 1)
-> mergeSort(arr, 0, 0) [base case, returns]
-> mergeSort(arr, 1, 1) [base case, returns]
-> merge(arr, 0, 0, 1) => Merges 38, 27 → 27, 38
-> mergeSort(arr, 2, 3)
-> mergeSort(arr, 2, 2) [returns]
-> mergeSort(arr, 3, 3) [returns]
-> merge(arr, 2, 2, 3) => Merges 43, 3 → 3, 43
-> merge(arr, 0, 1, 3) => Merges [27, 38] + [3, 43] → 3, 27, 38, 43
Right: mergeSort(arr, 4, 6)
-> mergeSort(arr, 4, 5)
-> mergeSort(arr, 4, 4) [returns]
-> mergeSort(arr, 5, 5) [returns]
-> merge(arr, 4, 4, 5) => Merges 9, 82 → 9, 82
-> mergeSort(arr, 6, 6) [returns]
-> merge(arr, 4, 5, 6) => Merges [9, 82] + [10] → 9, 10, 82
Final merge: merge(arr, 0, 3, 6)
-> Merges [3, 27, 38, 43] + [9, 10, 82] → Final sorted array
Code Explanation
void merge(int data[], int start, int middle, int end)
- This merges two sorted parts:
data[start,middle]
anddata[middle+1,end]
. leftSize
andrightSize
are sizes of the two halves.- Temporary arrays
leftPart[]
andrightPart[]
are filled with respective data. - Then, elements are compared and merged back to
data[]
.
while (i < leftSize && j < rightSize)
if (leftPart[i] <= rightPart[j])
data[k++] = leftPart[i++];
else
data[k++] = rightPart[j++];
- Compare current elements from both halves.
- Place smaller one in the main array.
- Move the pointer of the chosen element ahead.
while (i < leftSize)
data[k++] = leftPart[i++];
- If any elements left in
leftPart
, add them todata
.
while (j < rightSize)
data[k++] = rightPart[j++];
- If any elements left in
rightPart
, add them todata
.
Base Case of Recursion
if (start < end)
- When
start == end
, it means subarray has one element, which is already sorted. - So, recursion stops here (base case).
- No call to
merge()
is made in this case.
Tree Representation of Calls
mergeSort(0,6)
├── mergeSort(0,3)
│ ├── mergeSort(0,1)
│ │ ├── mergeSort(0,0)
│ │ └── mergeSort(1,1)
│ │ └── merge(0,0,1)
│ ├── mergeSort(2,3)
│ │ ├── mergeSort(2,2)
│ │ └── mergeSort(3,3)
│ │ └── merge(2,2,3)
│ └── merge(0,1,3)
├── mergeSort(4,6)
│ ├── mergeSort(4,5)
│ │ ├── mergeSort(4,4)
│ │ └── mergeSort(5,5)
│ │ └── merge(4,4,5)
│ └── mergeSort(6,6)
│ └── merge(4,5,6)
└── merge(0,3,6)
Time and Space Complexity Analysis
Case | Time Complexity |
---|---|
Best Case | O(n log n) |
Average | O(n log n) |
Worst Case | O(n log n) |
Space Complexity: O(n) – due to temporary arrays used during merge.
Real-Life Examples of Merge Sort
- Sorting Large Files (External Sorting)
When dealing with huge files that don’t fit into memory (like logs or databases), merge sort is used for external sorting since it handles large datasets efficiently using disk. - Merging Sorted Data Streams
Useful when combining data from two or more already sorted streams (like logs from different servers or merging contacts). - Linked List Sorting
Merge Sort is preferred over quick sort in sorting linked lists because it doesn’t require random access. - Used in Inversion Count Algorithm
To count how far (or close) the array is from being sorted. - Multi-threaded Sorting
Since merge sort is naturally recursive, it’s easy to parallelize, making it suitable for concurrent computing environments.
Advantages of Merge Sort
- Stable Sort: Maintains the relative order of equal elements.
- Consistent Time Complexity: Always takes O(n log n) time.
- Good for Linked Lists: Does not need random access, unlike quick sort.
- Best for External Sorting: Works well when data can’t be loaded into memory at once.
Disadvantages of Merge Sort
- Uses Extra Space: Requires O(n) additional space, making it inefficient for memory-constrained environments.
- Slower for Small Arrays: May perform worse than quick sort or insertion sort for small datasets due to overhead.
- Not In-Place: Unlike quicksort, it needs additional arrays during the merge process.
❓FAQs About Merge Sort
Q1. Why is Merge Sort better than Quick Sort sometimes?
Ans: Merge Sort guarantees O(n log n) time in all cases, while Quick Sort may degrade to O(n²) in the worst case. Merge Sort is also stable, unlike Quick Sort.
Q2. Why is Merge Sort good for linked lists?
Ans: Linked lists don’t support random access, which quicksort relies on. Merge Sort uses pointers to traverse, making it more efficient for linked lists.
Q3. Is Merge Sort stable?
Ans: Yes. Merge Sort maintains the original order of equal elements in the sorted output.
Q4. Can Merge Sort be implemented in-place?
Ans: Traditional Merge Sort is not in-place due to additional memory use. However, advanced in-place versions exist but are complex and rarely used in practice.
Q5. What is the base case in merge sort recursion?
Ans: When the subarray has one or zero elements, i.e., when start >= end
. At this point, the array is already sorted.
Q6. Is Merge Sort used in real programming libraries?
Ans: Yes, it’s used in some standard libraries (e.g., Python’s Timsort uses a modified Merge + Insertion sort).
📌 Summary Table
Feature | Merge Sort |
---|---|
Strategy | Divide and Conquer |
Time Complexity | O(n log n) |
Space Complexity | O(n) |
Stable | Yes |
In-place | No |
Best Use | Large data, linked lists, stable sort |