Merging K-Sorted Arrays and Linked Lists using Min Heap

Merging K-Sorted Arrays and Linked Lists using Min Heap

·

9 min read

Let's start off with a basic version of merging two sorted linked lists with the help of 2 pointer heads.

Input: l1 = [1,2,4], l2 = [1,3,4] Output: [1,1,2,3,4,4]

Let's assume the structure of the node as given below.

 // Definition for singly-linked list.
  struct ListNode {
      int val;
      ListNode *next;
      ListNode() : val(0), next(nullptr) {}
      ListNode(int x) : val(x), next(nullptr) {}
      ListNode(int x, ListNode *next) : val(x), next(next) {}
 * };

Following is the solution to merge 2 sorted linked lists.

  ListNode* mergeTwoLists(ListNode* l1, ListNode* l2) {
        ListNode head;
        ListNode *curr = &head;

        while(l1 != NULL and l2 != NULL) {
            if(l1->val <= l2->val) {
                curr->next = l1;
                curr = l1;
                l1 = l1->next;
            }
            else {
                curr->next = l2;
                curr = l2;
                l2 = l2->next;
            }
        }
        // straight away add the remaining elements
        while(l1 != NULL) {
            curr->next = l1;
            curr = l1;
            l1 = l1->next;
        }

        while(l2 != NULL) {
            curr->next = l2;
            curr = l2;
            l2 = l2->next;
        }
        return head.next;
    }

Keep the pointer at the starting head node of both the linked lists. The lesser of the two node values will be stored first. Move the pointer to the next node of the linked list if the current value is processed.

If one of the linked lists is fully processed then simply add the remaining elements of the other into the result.

Merging K-Sorted Linked Lists

The question becomes interesting if instead of 2 linked lists, you have to merge k-sorted linked lists.

Making k such pointers and if-else statements is not an efficient approach. Here's where the power of Min. Heap comes into play.

Algorithm

  1. Make a priority_queue [minHeap] of nodes of custom class ListNode.
  2. Push the first elements of all the linked lists to this minHeap.
  3. One by one , take nodes out of this priority queue and insert the value of val data member into the final sorted linked list.
  4. Also push the next element of the same linked list if it exists.
  5. Repeat this process until the minHeap becomes empty.
class Compare {
    public:
        bool operator()(ListNode *a, ListNode *b) {
            // minHeap implementation, larger elements will settle down
            return a->val > b->val; 
        }
};

   ListNode* mergeKLists(vector<ListNode*>& lists) {
        ListNode head;
        ListNode *curr = &head;

        priority_queue<ListNode*, vector<ListNode*>, Compare> minHeap;

        // insert the first node of all linked lists
        for(ListNode* list : lists) {
            if(list)
                minHeap.push(list);                
        }

        while(!minHeap.empty()) {
            ListNode *temp = minHeap.top();
            minHeap.pop();
            // move to the next element in the current list if present
            if(temp->next) 
                minHeap.push(temp->next);

            // add the popped out min node to the final list
            curr->next = temp;
            curr = temp;
        }
        return head.next; // pointer to the first node in the linked list
    }

When using a custom class, we need to define the priority level of the node values explicitly. This can be easily done by operator overloading of the () to provide the comparisons based on val data member in the node.

Min Heap will always keep the minimum priority node at its top. Initially, we will push all the head nodes in the min Heap as the least among all first nodes elements will the first element(least) in the final merged linked list.

The reason for the if(list) condition is to avoid segmentation fault, as empty linked lists are also allowed.

if(temp->next) 
     minHeap.push(temp->next);

We are implicitly maintaining k-pointers to the linked lists. After popping the minimum node from the top of the heap, we add the next node of the linked list if it exists.

We keep repeating this process until the min Heap becomes empty ie. nodes from all the linked lists are processed.

Hope, you understood this concept. The similar approach works well for arrays.

Merging K-sorted Arrays

Problem Statement: Merge K-sorted arrays each of size n.

Algorithm

  1. Make a priority_queue [minHeap] where each node stores three values - the element itself , its row no. and its col no.
  2. Push the first elements of all the rows into this minHeap with their corresponding row and column info.
  3. One by one , take nodes out of this priority queue and insert their values into the final sorted array.
  4. Also push the next element of the same row and next column using the other information into the minHeap.
  5. Repeat this process until the minHeap becomes empty.

Following is the full implementation of the code.

#include <iostream>
#include <vector>
#include <queue>
using namespace std;

// pair to store element, row_num, col_num
#define ppi pair<int, pair<int, int>> 

vector<int> mergeKSortedArrays(vector<vector<int>> &arr) {
    int k = arr.size();
    int n = arr[0].size();
    vector<int> merged;

    // create a minHeap
    priority_queue<ppi, vector<ppi>, greater<ppi>> minHeap;

    // push the first ie. smallest element of all the k arrays
    for(int i = 0; i < k; i++) 
        minHeap.push({arr[i][0], {i, 0}});

    while(!minHeap.empty()) {
        // pop the smallest element and insert row, col + 1 of that element if col + 1 < n or col < n - 1
        ppi temp = minHeap.top();
        minHeap.pop();
        merged.push_back(temp.first);
        int r = temp.second.first;
        int c = temp.second.second;
        if(c + 1 < n) 
            minHeap.push({arr[r][c + 1], {r, c + 1}});
    }
    return merged;
}

int main() {

    int k, n;
    cin >> k >> n;

    // int arr[n][k];
    vector< vector<int> > arr(k, vector<int>(n));
    for(int i = 0; i < k; i++) {
        for(int j = 0; j < n; j++) 
            cin >> arr[i][j];
    }

    vector<int> merged = mergeKSortedArrays(arr);

    // print the merged array
    for(int x : merged) 
        cout << x << " ";

    return 0;
}

To retrieve information similar to a linked list, we need to store the current row and column numbers.

For storing {value, {row_number, column_number}} we use pair of pair.

#define ppi pair<int, pair<int, int>>

We keep popping the smallest element from the min Heap and insert {row, col + 1} of that element if col + 1 < n or col < n - 1 ie. the next element in that particular array if it exists.

We could now solve many variations of this question, as the logic remains the same, only the implementation part alters.

Hope, you liked this article. As a homework part, try to implementing merging of K-sorted arrays of different sizes.

In case of any queries or suggestions feel free to connect with me on github and linkedin.