Union and Intersection of 2 Sorted Arrays with duplicates

Union and Intersection of 2 Sorted Arrays with duplicates

·

7 min read

Problem Statement: Given 2 sorted arrays of different sizes with duplicates, find the union and intersection of the arrays.

Let's start off with the intersection of 2 sorted arrays. It is a fairly easy problem but one needs to take care of the edge cases.

1. Intersection of 2 Sorted Arrays

vector<int> printIntersection(int arr1[], int arr2[], int N, int M) { 
    vector<int> result;
    int i = 0, j = 0;

    while(i < N && j < M) {
        while(i + 1 < N && arr1[i] == arr1[i + 1])
            i++;
        while(j + 1 < M && arr2[j] == arr2[j + 1])
            j++;

        if(arr1[i] == arr2[j]) {
            result.push_back(arr1[i]);
            i++;
            j++;
        }
        else if(arr1[i] < arr2[j])
            i++;
        else
            j++;

    }
    if(result.empty()) {
        result.push_back(-1);
        return result;
    }
    return result;
}

The efficient solution involves the use of 2 indices initialized at the first element of each array. We will consider the smaller of the 2 pointed values and push it in the resultant array. The index of the array from which the smaller element is selected is then incremented by 1 step.

However, this would lead to the duplication of elements as intersection should only contain unique elements.

while(i + 1 < N && arr1[i] == arr1[i + 1])
            i++;

To avoid this issue, we will increase the index if the adjacent elements are same ie. we will take into account, the last unique occurrence of each array element.

i + 1 < N indicates that this loop continues till the second last element, arr[i + 1] doesn't exist for the last index.

Via this while loop, we could easily handle duplicates efficiently. The duplicates will always occur in adjacent pairs as the array is sorted.

Now, if the elements in both the arrays are equal, then push the element in the resultant vector and increase the indices for both the arrays.

Contrary to this, if the element of the first array is smaller than the element of the second, we push this smaller element to the result vector and increase the index for the array through which the smaller element was extracted, vice-versa.

if(result.empty()) {
        result.push_back(-1);
        return result;
    }

If no element is pushed to the result vector, then return -1. Since, the return type of this function is vector<int>, we can't return an integer directly, hence push -1 onto the vector and then return it.

while(i < N && j < M), here if one of the array is completely processed, then there is no need to move ahead, as we need to just calculate the intersection of the two arrays ie. elements which are common to both the arrays.

We can extend the similar idea in finding out the union of the arrays. Here, we need to include all the elements belonging to either of the arrays.

Note: The result vector should contain unique elements only.

2. Union of 2 Sorted Arrays

vector<int> findUnion(int arr1[], int arr2[], int n, int m) {
    vector<int> result;

    int i = 0, j = 0;

    while(i < n and j < m) {
        while(i + 1 < n and arr1[i] == arr1[i + 1])
            i++;
        while(j + 1 < m and arr2[j] == arr2[j + 1])
            j++;

        if(arr1[i] == arr2[j]) {
            result.push_back(arr1[i++]);
            j++;
        }
        else if(arr1[i] < arr2[j]) 
            result.push_back(arr1[i++]);
        else 
            result.push_back(arr2[j++]);

    }
    while(i < n) {
        if(i + 1 == n or arr1[i] != arr1[i + 1])
            result.push_back(arr1[i]);
        i++;
    } 

    while(j < m) {
        if(j + 1 == m or arr2[j] != arr2[j + 1]) 
            result.push_back(arr2[j]);
        j++;
    }
    return result;
}

The only difference is to push the element if it is smaller than element of the other array. Since, we are calculating the union, we need to include all the elements.

However, the interesting part lies in this condition while(i < n and j < m). If one of the array is fully processed then we need to add all the unique elements of the other array to the final result vector.

Example:
arr = {1, 2, 3, 4};

The last element 4 will be a part of the result vector.

arr = {1, 1, 2, 3, 3, 4, 4, 4};

The values of 4 at indices: 5 and 6 won't be taken into account, because we incremented the index value when adjacent duplicates are encountered.
Hence, the last index element 4 will be a part of the result vector.

Since we are adding the last occurrence of an element in case of duplicates, we will always add the last element.

 while(i < n) {
        if(i + 1 == n or arr1[i] != arr1[i + 1])
            result.push_back(arr1[i]);
        i++;
    }

Hence, we would be taking the unique elements from the remaining portion of the array when the other other is fully processed.

Hope, you liked this article. To practice this concept, try to implement the union and intersection of 2 linked lists with duplicates.

In case of any issues, feel free to connect with me on linkedin and github.