Intersection of Multiple Arrays: Sorted and Unsorted

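This article discusses algorithms for intersecting multiple sorted and unsorted arrays, including the two-pointer method, partial sorting plus binary search, and hash tables, and looks at the time complexity of each approach.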

Let's first talk about just two arrays.

If the arrays are sorted, everything becomes easier:

#include <stdio.h>

/* Prints the intersection of the sorted arrays arr1[] and arr2[].
   m is the number of elements in arr1[]
   n is the number of elements in arr2[] */
void printIntersection(int arr1[], int arr2[], int m, int n)
{
  int i = 0, j = 0;
  while (i < m && j < n)
  {
    if (arr1[i] < arr2[j])
      i++;   /* arr1[i] cannot match anything left in arr2 */
    else if (arr2[j] < arr1[i])
      j++;   /* arr2[j] cannot match anything left in arr1 */
    else /* arr1[i] == arr2[j] */
    {
      printf(" %d ", arr2[j++]);
      i++;
    }
  }
}

Source: http://www.geeksforgeeks.org/union-and-intersection-of-two-sorted-arrays-2/

It's pretty straightforward: if the current element of arr1 is less than the current element of arr2, increment the index of arr1 by 1 and compare again. If the current element of arr2 is smaller, increment its index instead. If the two values are equal, print the value and increment both indices. In this way we gradually iterate through both arrays. The time complexity is O(m + n), the sum of the lengths of the two arrays. If the two arrays are not sorted, the time complexity must also include the O(m log m + n log n) needed to sort them first.
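For the unsorted case mentioned above, a minimal sketch might look like the following. It assumes the inputs can be copied, and the helper name `intersectBySorting` is hypothetical (it does not come from the referenced article). It sorts both copies and then lets `std::set_intersection` do the same two-index walk over the sorted ranges.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper, not from the referenced article: intersects two
// unsorted arrays by sorting copies of both and then merging them.
std::vector<int> intersectBySorting(std::vector<int> a, std::vector<int> b)
{
    // Sorting costs O(m log m + n log n).
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    std::vector<int> result;
    // std::set_intersection walks both sorted ranges once, in O(m + n),
    // and writes the common elements in ascending order.
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(result));
    return result;
}
```

Because the arguments are taken by value, the caller's arrays are left untouched; if modifying them in place is acceptable, the copies could be avoided.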

A second way to solve the problem when the two arrays are not sorted is to sort only one of them and use binary search.

func intersection2(arr1, arr2): // arr2 is assumed to be the shorter array
  intersect = { }
  sort(arr2)
  for i = 0 to arr1.length - 1:
    if binarySearch(arr2, arr1[i]): // returns true if arr1[i] is in arr2
      intersect.add(arr1[i])
  return intersect

This is also quite simple, and it is more efficient when one array is much smaller than the other. Instead of sorting both arrays, we sort only the shorter array and run a binary search on it for each element of the longer, unsorted array. If the shorter array has m elements and the longer one has n, the time complexity is O(m log m + n log m), or O((m + n) log m). When m is significantly smaller than n, this algorithm is more efficient than sorting both arrays.
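The partial-sort idea could be sketched in C++ roughly as follows. The function name `intersectByBinarySearch` is hypothetical, and the sketch picks the shorter array itself rather than relying on the caller's argument order.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: sorts only the shorter array, then binary-searches it
// for every element of the longer one.
std::vector<int> intersectByBinarySearch(std::vector<int> a, std::vector<int> b)
{
    if (a.size() > b.size())
        a.swap(b);                      // make `a` the shorter array

    std::sort(a.begin(), a.end());      // O(m log m), m = a.size()

    std::vector<int> result;
    for (int value : b) {               // n lookups of O(log m) each
        if (std::binary_search(a.begin(), a.end(), value))
            result.push_back(value);
    }
    // Results follow the order of the longer array; a value repeated in the
    // longer array is reported once per occurrence.
    return result;
}
```

If each common value should be reported only once, the result would need an extra deduplication step, which this sketch leaves out.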

Another method is, of course, hashing. Assuming a hash table lookup takes O(1) time on average, storing the elements of one array in a hash table and then looking up each element of the other array takes O(m + n) in total, where m and n are the lengths of the two arrays.
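A minimal sketch of the hashing approach, assuming `std::unordered_set` as the hash table (the name `intersectByHashing` is hypothetical):

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: caches one array in a hash set and probes it with
// the elements of the other array.
std::vector<int> intersectByHashing(const std::vector<int>& a,
                                    const std::vector<int>& b)
{
    // Building the set is O(m) on average.
    std::unordered_set<int> seen(a.begin(), a.end());

    std::vector<int> result;
    for (int value : b) {
        // erase() returns the number of elements removed (0 or 1), so each
        // common value is reported only once, in the order it appears in b.
        if (seen.erase(value) > 0)
            result.push_back(value);
    }
    return result;
}
```

The O(m + n) bound is an average-case figure; it also costs O(m) extra memory for the set, which the sorting-based approaches can avoid.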

Reference: http://www.geeksforgeeks.org/find-union-and-intersection-of-two-unsorted-arrays/

Things get more interesting when we come to multiple arrays.

I found a good article about finding the common elements of three arrays. I wasn't really expecting the first article I would introduce to be from GeeksforGeeks, but I couldn't find a similar question on either LeetCode or LintCode. As usual, you get a more detailed presentation of the question and more discussion of the idea. On the other hand, the time complexity is not discussed very well.

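#include <iostream>
using namespace std;

// Prints the elements common to three sorted arrays
void findCommon(int ar1[], int ar2[], int ar3[], int n1, int n2, int n3)
{
    int i = 0, j = 0, k = 0;
    while (i < n1 && j < n2 && k < n3)
    {
         // If x = y and y = z, print any of them and move ahead
         // in all arrays
         if (ar1[i] == ar2[j] && ar2[j] == ar3[k])
         {   cout << ar1[i] << " ";   i++; j++; k++; }

         // x < y
         else if (ar1[i] < ar2[j])
             i++;

         // y < z
         else if (ar2[j] < ar3[k])
             j++;

         // We reach here when x >= y and y >= z but not all three are
         // equal, i.e., z is the smallest (possibly tied with y)
         else
             k++;
    }
}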

The key part of this solution is how it extends the method for two sorted arrays to three. It advances past the smallest of the three current elements in a very elegant way, without ever computing the minimum explicitly.

If ar1[i] < ar2[j], just increment the index of ar1,

otherwise, if ar2[j] < ar3[k], just increment the index of ar2,

and when neither condition is true (and the three values are not all equal), ar3[k] is the smallest of the three or tied for smallest, so just increment the index of ar3.

The time complexity is clearly O(n1 + n2 + n3), since every iteration advances at least one of the three indices.

Finally, a more general solution for n sorted arrays comes from Stack Overflow. The idea is to intersect two arrays at a time, treat the result as a new sorted array, and then intersect that with the next of the remaining arrays, until none are left.
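The pairwise idea can be sketched as below. This is an illustration under stated assumptions rather than the code from the Stack Overflow answer: every input array is assumed to be sorted in ascending order, and the name `intersectAll` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: intersects any number of sorted arrays by folding
// them together two at a time.
std::vector<int> intersectAll(const std::vector<std::vector<int>>& arrays)
{
    if (arrays.empty())
        return {};

    // Start with the first array as the running intersection.
    std::vector<int> result = arrays.front();

    // Stop early once the running intersection becomes empty.
    for (std::size_t i = 1; i < arrays.size() && !result.empty(); ++i) {
        std::vector<int> next;
        // The intersection of two sorted ranges is itself sorted, so it can
        // serve as the input for the next round.
        std::set_intersection(result.begin(), result.end(),
                              arrays[i].begin(), arrays[i].end(),
                              std::back_inserter(next));
        result.swap(next);
    }
    return result;
}
```

Since the running intersection can never grow, starting with the shortest array keeps the intermediate results small; this sketch simply takes the arrays in the order given.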

Reference: http://stackoverflow.com/questions/5630685/efficient-algorithm-to-produce-the-n-way-intersection-of-sorted-arrays-in-c

http://www.geeksforgeeks.org/find-common-elements-three-sorted-arrays/