Given two sorted arrays, find the k-th smallest element of the union of A and B

License: CC 4.0 BY-SA
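This article discusses the problem of finding the k-th smallest element of two sorted arrays and presents three solutions: a brute-force approach, an improved selection that stops after k elements, and the optimal solution, a method with O(log m + log n) complexity.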

Given two sorted arrays A and B, find the k-th smallest element of their union. You may assume that there are no duplicate elements. The size of A is m and the size of B is n, and both are in ascending order.

First, we can use brute force: allocate a new array of size m+n, merge A and B into it, and then read off the k-th smallest element (or the k smallest elements). Time complexity is O(m+n), and space complexity is O(m+n).
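The brute-force merge described above could be sketched as follows. This is only an illustration: the function name `kth_by_full_merge` and the use of `std::vector` are assumptions made for this sketch, not part of the original post, and k is taken to be 1-based.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for this sketch): merges two ascending
// arrays into a new buffer of size m + n and returns the k-th smallest
// element, where k is 1-based.
int kth_by_full_merge(const std::vector<int>& a, const std::vector<int>& b, std::size_t k) {
    assert(k >= 1 && k <= a.size() + b.size());
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    // Standard two-way merge: always take the smaller head.
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) merged.push_back(a[i++]);
        else merged.push_back(b[j++]);
    }
    // Copy whatever remains once one array is exhausted.
    while (i < a.size()) merged.push_back(a[i++]);
    while (j < b.size()) merged.push_back(b[j++]);
    return merged[k - 1];
}
```

For A = {1, 3, 5, 7, 9} and B = {2, 4, 6, 8, 10}, calling `kth_by_full_merge(A, B, 6)` returns 6. The whole merge is performed even when k is small, which is what the next approach avoids.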

A better way uses two pointers, i and j: i points to the head of A and j points to the head of B. We also allocate a new array of size k. We then compare A[i] and B[j], copy the smaller one into the new array, and advance the pointer of the array it came from. Time complexity is O(k), and space complexity is O(k). Here is the code:

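#include <cstdlib>
#include <iostream>
using namespace std;

// Returns a malloc'ed array holding the k smallest elements of a and b in
// ascending order (the k-th smallest is the last entry), or NULL on bad input.
// The caller must free() the result.
int *get_k_th_minimum(int *a, int len_a, int *b, int len_b, int k) {
	// Validate the input before allocating, so nothing leaks on failure.
	if(a == NULL || b == NULL || k <= 0 || k > (len_a + len_b)) {
		return NULL;
	}
	int *c = (int*)malloc(sizeof(int) * k);
	if(c == NULL) {
		return NULL;
	}
	int p = 0;
	int i = 0;
	int j = 0;
	while(p < k) {
		// Take from a when b is exhausted, or when a's head is the smaller one.
		if(j >= len_b || (i < len_a && a[i] < b[j])) {
			c[p++] = a[i++];
		} else {
			c[p++] = b[j++];
		}
	}
	return c;
}

int main() {
	int a[] = {1, 3, 5, 7, 9};
	int b[] = {2, 4, 6, 8, 10};
	int len_a = sizeof(a) / sizeof(int);
	int len_b = sizeof(b) / sizeof(int);
	int k = 5;
	int *p = get_k_th_minimum(a, len_a, b, len_b, k);
	if(p) {
		for(int i = 0; i < k; i++) {
			cout << p[i] << " ";
		}
		cout << endl;
		free(p);
	}
	return 0;
}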

An even better way takes O(lg m + lg n) time to find the k-th smallest element alone, rather than all k smallest elements. If we also want those k elements, it takes a further O(k) time to read them from A and B. Space complexity is O(1).

We try to approach this tricky problem by comparing middle elements of A and B, which we identify as A_i and B_j. If A_i lies between B_{j-1} and B_j, then A_i is the (i+j+1)-th smallest element (indices start from zero): exactly j elements of B are smaller than A_i (B_0 through B_{j-1}), and exactly i elements of A are smaller than A_i. Therefore, if we choose i and j such that i + j + 1 = k, we have found the k-th smallest element. This is an important invariant that we must maintain for the correctness of this algorithm.

To summarize: maintaining the invariant i + j = k - 1, if B_{j-1} < A_i < B_j, then A_i must be the k-th smallest; if A_{i-1} < B_j < A_i, then B_j must be the k-th smallest.
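The two conditions can be tried out in isolation with a small sketch like the one below. The function name `check_split` and its return convention are hypothetical, chosen only for this illustration; it also assumes that no element equals `INT_MIN` or `INT_MAX`, since those values serve as sentinels at the array ends.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical helper: for a chosen split point i in A and a 1-based k,
// sets j = k - 1 - i and tests the two conditions from the text.
// Returns 0 if A_i is the k-th smallest, 1 if B_j is, -1 if neither holds.
int check_split(const std::vector<int>& A, const std::vector<int>& B, int k, int i) {
    const int m = static_cast<int>(A.size());
    const int n = static_cast<int>(B.size());
    const int j = k - 1 - i;  // invariant: i + j = k - 1
    assert(k >= 1 && k <= m + n);
    assert(i >= 0 && i <= m && j >= 0 && j <= n);
    // Out-of-range neighbours are replaced by sentinels.
    const int Ai_1 = (i == 0) ? INT_MIN : A[i - 1];
    const int Bj_1 = (j == 0) ? INT_MIN : B[j - 1];
    const int Ai = (i == m) ? INT_MAX : A[i];
    const int Bj = (j == n) ? INT_MAX : B[j];
    if (Bj_1 < Ai && Ai < Bj) return 0;  // B_{j-1} < A_i < B_j
    if (Ai_1 < Bj && Bj < Ai) return 1;  // A_{i-1} < B_j < A_i
    return -1;
}
```

With A = {1, 3, 5, 7, 9}, B = {2, 4, 6, 8, 10} and k = 5, the split i = 2 gives j = 2 and B_1 = 4 < A_2 = 5 < B_2 = 6, so the function returns 0 and A_2 = 5 is the 5th smallest. With k = 6 and i = 2, neither condition holds and it returns -1.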

If one of the conditions above is satisfied, we have found the k-th smallest element. If neither is, what should we do next? Since there are no duplicates, either A_i < B_j or A_i > B_j. Suppose A_i < B_j. The condition B_{j-1} < A_i < B_j can fail for two reasons: A_i > B_j or A_i < B_{j-1}. The first is ruled out by our assumption, so we must have A_i < B_{j-1}. Symmetrically, if B_j < A_i, the failure of A_{i-1} < B_j < A_i can only mean B_j < A_{i-1}.

In summary, when neither condition holds: if A_i < B_j, then A_i < B_{j-1}; on the other hand, if B_j < A_i, then B_j < A_{i-1}.

Using the above relationship, we can conclude that A_i and its lower portion can never contain the k-th smallest element. To see why, suppose B_{j-2} < A_i < B_{j-1}. Then A_i is the (i+1+j-1)-th smallest element, i.e. the (k-1)-th, and this is the best case; if A_i < B_{j-2} or A_i < B_{j-3}, its rank is lower still, and the elements below A_i rank lower than A_i itself. So we discard A_i and its lower portion. Why not also discard B_j's lower portion? Because it may still hold the answer: if A_i < B_{j-1} < A_{i+1}, then i+1 elements of A and j-1 elements of B are smaller than B_{j-1}, so B_{j-1} is the k-th smallest element. We do discard B_j and its upper portion. Why? Because A_i < B_j, at least i+1 elements of A and j elements of B are smaller than B_j, which is i+j+1 = k elements in total, so B_j ranks (k+1)-th or higher, and everything above B_j ranks higher still.

On the other hand, the case for A_i > B_j is just the other way around. Easy.

Below is the code. I have inserted lots of assertions (a highly recommended programming style, by the way) to help you understand it. Note that the code is an example of tail recursion, so you could technically convert it to an iterative method in a straightforward manner. However, I leave it as it is, since this is how I derived the solution and it seemed more natural to express it recursively.

Another side note concerns the choice of i and j. The code below subdivides both arrays using their sizes as weights. The reason is that this may find the k-th element more quickly (as long as A and B do not differ in an extreme way, e.g. all elements in A being smaller than those in B). If you are wondering, yes, you could choose i to be A's middle. In theory, you could choose any values for i and j as long as the invariant i + j = k - 1 is satisfied and both indices stay within bounds (0 <= i <= m and 0 <= j <= n).

Here is the code:

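#include <cassert>
#include <climits>
#include <iostream>
using namespace std;

// Records which array the answer came from: 0 for a, 1 for b.
int flag = 0;

// Returns the k-th smallest (k is 1-based) element of the union of
// a[0..m) and b[0..n). Both arrays must be ascending with no duplicates,
// and must not contain INT_MIN or INT_MAX (used as sentinels below).
int get_k_th_minimum(int *a, int m, int *b, int n, int k) {
	assert(m >= 0);
	assert(n >= 0);
	assert(k > 0);
	assert(k <= (m + n));
	// Split k - 1 between the arrays in proportion to their sizes,
	// so that the invariant i + j = k - 1 holds.
	int i = (int)((double)m / (m + n) * (k - 1));
	int j = (k - 1) - i;
	assert(i >= 0);
	assert(j >= 0);
	assert(i <= m);
	assert(j <= n);
	// Sentinels stand in for neighbours that fall outside the arrays.
	int Ai_1 = ((i == 0) ? INT_MIN : a[i-1]);
	int Bj_1 = ((j == 0) ? INT_MIN : b[j-1]);
	int Ai   = ((i == m) ? INT_MAX : a[i]);
	int Bj   = ((j == n) ? INT_MAX : b[j]);
	if(Bj_1 < Ai && Ai < Bj) {
		flag = 0;
		return Ai;
	} else if(Ai_1 < Bj && Bj < Ai) {
		flag = 1;
		return Bj;
	}
	assert((Ai > Bj && Ai_1 > Bj) || (Ai < Bj && Ai < Bj_1));
	if(Ai < Bj) {
		// Discard A_i and everything below it, and B_j and everything above it.
		return get_k_th_minimum(a + i + 1, m - i - 1, b, j, k - i - 1);
	} else {
		// Discard B_j and everything below it, and A_i and everything above it.
		return get_k_th_minimum(a, i, b + j + 1, n - j - 1, k - j - 1);
	}
}


int main() {
	int i = 0;
	int j = 0;
	int a[] = {1, 3, 5, 7, 9};
	int b[] = {2, 4, 6, 8, 10};
	int len_a = sizeof(a) / sizeof(int);
	int len_b = sizeof(b) / sizeof(int);
	int k = 6;
	// Pass the full lengths; the function treats them as element counts.
	int result = get_k_th_minimum(a, len_a, b, len_b, k);
	// Print the k smallest elements (not in sorted order), ending with the k-th.
	if(flag == 0) {
		while(a[i] != result) {
			cout << a[i] << " ";
			i++;
		}
		for(j = 0; j < k - i - 1; j++) {
			cout << b[j] << " ";
		}
		cout << result << endl;
	} else {
		while(b[i] != result) {
			cout << b[i] << " ";
			i++;
		}
		for(j = 0; j < k - i - 1; j++) {
			cout << a[j] << " ";
		}
		cout << result << endl;
	}
	return 0;
}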
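
As noted earlier, the recursion is a tail call, so it can be turned into a loop. The following is one possible iterative sketch under the same assumptions (ascending arrays, no duplicates, no element equal to `INT_MIN` or `INT_MAX`, 1-based k). The name `kth_smallest_iterative` is hypothetical, and the sketch does not track which array the answer came from.

```cpp
#include <cassert>
#include <climits>

// Iterative sketch of the same search: instead of recursing, shrink the
// windows [a, a + m) and [b, b + n) and adjust k in place.
int kth_smallest_iterative(const int* a, int m, const int* b, int n, int k) {
    assert(m >= 0 && n >= 0 && k >= 1 && k <= m + n);
    while (true) {
        // Weighted split that keeps the invariant i + j = k - 1.
        const int i = static_cast<int>(static_cast<double>(m) / (m + n) * (k - 1));
        const int j = (k - 1) - i;
        const int Ai_1 = (i == 0) ? INT_MIN : a[i - 1];
        const int Bj_1 = (j == 0) ? INT_MIN : b[j - 1];
        const int Ai = (i == m) ? INT_MAX : a[i];
        const int Bj = (j == n) ? INT_MAX : b[j];
        if (Bj_1 < Ai && Ai < Bj) return Ai;
        if (Ai_1 < Bj && Bj < Ai) return Bj;
        if (Ai < Bj) {
            // Drop a[0..i] and b[j..n): i + 1 smaller elements are gone.
            a += i + 1;
            m -= i + 1;
            n = j;
            k -= i + 1;
        } else {
            // Drop b[0..j] and a[i..m): j + 1 smaller elements are gone.
            b += j + 1;
            n -= j + 1;
            m = i;
            k -= j + 1;
        }
    }
}
```

Each pass removes at least one element from one of the windows, so the loop terminates. For the arrays {1, 3, 5, 7, 9} and {2, 4, 6, 8, 10} with k = 6, it returns 6, matching the recursive version.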