Given two sorted arrays, find the k-th smallest element of the union of A and B

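This article discusses the problem of finding the k-th smallest element of two sorted arrays and presents three solutions: a brute-force merge, an improved selection of the k smallest elements, and the optimal solution, a method with O(log m + log n) complexity.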

Given two sorted arrays A and B, find the k-th smallest element of their union. You may assume that there are no duplicate elements. The size of A is m and the size of B is n, and both are in ascending order.


First, we can use brute force: allocate a new array of size m+n, merge A and B into it, and then read off the k-th smallest element (or the k smallest elements). The time complexity is O(m+n) and the space complexity is O(m+n).
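The brute-force merge described above can be sketched as follows. This is only a minimal illustration: the function name `kth_by_merge` and the use of `std::vector` are assumptions made for the example, and k is taken to be 1-based.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: merges two ascending arrays into a new one and
// returns the k-th smallest element (k is 1-based).
// Time O(m + n), extra space O(m + n).
int kth_by_merge(const std::vector<int>& a, const std::vector<int>& b, std::size_t k) {
    assert(k >= 1 && k <= a.size() + b.size());
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    // Standard merge step: always copy the smaller head.
    while (i < a.size() && j < b.size()) {
        merged.push_back(a[i] < b[j] ? a[i++] : b[j++]);
    }
    // Copy whatever is left in either array.
    while (i < a.size()) merged.push_back(a[i++]);
    while (j < b.size()) merged.push_back(b[j++]);
    return merged[k - 1];
}
```

Since `merged` is fully sorted, `merged[0..k)` also gives the k smallest elements if those are wanted.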


A better way uses two pointers, i and j: i points to the head of A and j points to the head of B. We also allocate a new array of size k. We then compare A[i] and B[j], copy the smaller of the two into the new array, and advance the pointer of the array it came from. After k steps the new array holds the k smallest elements, and its last entry is the k-th smallest. The time complexity is O(k) and the space complexity is O(k). Here is the code:

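#include<iostream>
#include<cassert>
#include<cstdlib>
using namespace std;

// Returns a malloc'd array holding the k smallest elements of a and b in
// ascending order (the k-th smallest is the last entry), or NULL on bad input.
// The caller must free the result.
int *get_k_th_minimum(int *a, int len_a, int *b, int len_b, int k) {
	int p = 0;
	int i = 0;
	int j = 0;
	// validate before allocating so that nothing leaks on failure
	if(a == NULL || b == NULL || k <= 0 || k > (len_a + len_b)) {
		return NULL;
	}
	int *c = (int*)malloc(sizeof(int) * k);
	if(c == NULL) {
		return NULL;
	}
	while(p < k) {
		// take from a when b is exhausted, or when a's head is the smaller one
		if(j >= len_b || (i < len_a && a[i] < b[j])) {
			c[p++] = a[i++];
		} else {
			c[p++] = b[j++];
		}
	}
	return c;
}

int main() {
	int a[] = {1, 3, 5, 7, 9};
	int b[] = {2, 4, 6, 8, 10};
	int len_a = sizeof(a) / sizeof(int);
	int len_b = sizeof(b) / sizeof(int);
	int k = 5;
	int *p = get_k_th_minimum(a, len_a, b, len_b, k);
	int i = 0;
	if(p) {
		for(i = 0; i < k; i++) {
			cout << p[i] << " ";
		}
		cout << endl;
		free(p);
	}
	return 0;
}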
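If only the k-th element itself is needed, rather than the k smallest elements, the output buffer can be dropped and the same two-pointer walk runs in O(1) extra space. The sketch below illustrates that variant; the name `kth_by_two_pointers` and the `std::vector` interface are assumptions made for the example, and k is 1-based.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: walks both ascending arrays with two indices and
// remembers only the most recently taken element.
// Time O(k), extra space O(1).
int kth_by_two_pointers(const std::vector<int>& a, const std::vector<int>& b, std::size_t k) {
    assert(k >= 1 && k <= a.size() + b.size());
    std::size_t i = 0, j = 0;
    int current = 0;
    for (std::size_t taken = 0; taken < k; ++taken) {
        // Take from a when b is exhausted, or when a's head is the smaller one.
        if (j == b.size() || (i < a.size() && a[i] < b[j])) {
            current = a[i++];
        } else {
            current = b[j++];
        }
    }
    return current;  // the k-th element taken is the k-th smallest
}
```

The precondition `k <= a.size() + b.size()` guarantees that at least one array still has an element on every iteration, so neither index runs past the end.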

An even better way takes O(lg m + lg n) time to find the k-th smallest element itself, rather than all k smallest elements. If we want those as well, it takes another O(k) time to collect them from A and B. The space complexity is O(1), not counting the recursion stack.
We approach this tricky problem by comparing one element from each array, which we call Ai and Bj (indices start from zero). If Ai lies between Bj-1 and Bj, then Ai is the (i+j+1)-th smallest element: exactly j elements of B (B0 through Bj-1) are smaller than Ai, and exactly i elements of A are smaller than Ai. Therefore, if we choose i and j such that i + j + 1 = k, we have found the k-th smallest element. This is an important invariant that we must maintain for the correctness of this algorithm.
To summarize: maintaining the invariant i + j = k - 1, if Bj-1 < Ai < Bj, then Ai must be the k-th smallest; if Ai-1 < Bj < Ai, then Bj must be the k-th smallest.
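The two conditions can be checked for a single split (i, j) as in the sketch below. The names `Probe` and `probe_split` are hypothetical, and the sketch assumes that an index one step outside an array behaves like minus or plus infinity (represented here by INT_MIN and INT_MAX, which therefore must not occur in the data).

```cpp
#include <cassert>
#include <climits>
#include <vector>

enum class Probe { FoundInA, FoundInB, NotFound };

// Hypothetical helper: tests one split with i + j == k - 1 and reports
// whether a[i] or b[j] is the k-th smallest element of the union.
Probe probe_split(const std::vector<int>& a, const std::vector<int>& b, int i, int j) {
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    assert(0 <= i && i <= m && 0 <= j && j <= n);
    // Out-of-range neighbours act as -infinity / +infinity (an assumption).
    const int Ai_1 = (i == 0) ? INT_MIN : a[i - 1];
    const int Bj_1 = (j == 0) ? INT_MIN : b[j - 1];
    const int Ai = (i == m) ? INT_MAX : a[i];
    const int Bj = (j == n) ? INT_MAX : b[j];
    if (Bj_1 < Ai && Ai < Bj) return Probe::FoundInA;  // Bj-1 < Ai < Bj
    if (Ai_1 < Bj && Bj < Ai) return Probe::FoundInB;  // Ai-1 < Bj < Ai
    return Probe::NotFound;
}
```

For A = {1, 3, 5, 7, 9}, B = {2, 4, 6, 8, 10} and k = 6, the split i = 3, j = 2 satisfies A2 < B2 < A3 (5 < 6 < 7), so B2 = 6 is the 6th smallest, while the split i = 2, j = 3 satisfies neither condition.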
If one of the conditions above is satisfied, we have the k-th smallest element. If neither is, what should we do next? Suppose the condition Bj-1 < Ai < Bj is not satisfied. There are two possible reasons for the failure: either Ai > Bj, or Ai < Bj-1. Suppose the condition Ai-1 < Bj < Ai is not satisfied either. Now take the case Ai < Bj: the first condition can only have failed because Ai < Bj-1, and this is what makes neither Ai nor Bj the k-th smallest element.
To summarize: when Ai < Bj, we must have Ai < Bj-1; on the other hand, when Bj < Ai, we must have Bj < Ai-1.
Using this relationship, we can see that Ai and its lower portion can never contain the k-th smallest element. Let me explain. Take the case Ai < Bj-1. In the best situation Bj-2 < Ai < Bj-1, and then Ai is the (i+1+j-1)-th smallest element, i.e. the (k-1)-th. If Ai is smaller still (Ai < Bj-2, Ai < Bj-3, and so on), its rank only gets lower, and every element below Ai ranks lower than Ai. So we discard Ai and its lower portion. Why not discard Bj and its lower portion as well? Consider the possibility that Ai < Bj-1 < Ai+1: if this holds, then exactly i+1 elements of A and j-1 elements of B are smaller than Bj-1, so Bj-1 is the k-th smallest element, and the lower portion of B must be kept. We do discard Bj and its upper portion. Why? Because A0 through Ai and B0 through Bj-1 are all smaller than Bj, at least i + 1 + j = k elements are smaller than Bj, so Bj is at best the (k+1)-th smallest, and Bj+1 and above rank even higher.
The case Ai > Bj is just the other way around. Easy.
Below is the code. I have inserted lots of assertions (a highly recommended programming style, by the way) to help you understand it. Note that the code below is an example of tail recursion, so you could convert it to an iterative method in a straightforward manner. However, I will leave it as it is, since this is how I derived the solution and it seemed more natural to express it recursively.

Another side note concerns the choice of i and j. The code below subdivides both arrays using their sizes as weights. The reason is that this may find the k-th element more quickly (as long as A and B do not differ in an extreme way, e.g. all elements of A being smaller than those of B). If you are wondering: yes, you could choose i to be A's middle. In theory, you could choose any values for i and j as long as 0 <= i <= m, 0 <= j <= n and the invariant i + j = k - 1 are satisfied.

Here is the code:

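#include<iostream>
#include<cassert>
#include<climits>
using namespace std;

// set by get_k_th_minimum: 0 if the result came from a, 1 if it came from b
int flag = 0;

// Returns the k-th smallest element (k is 1-based) of the union of
// a[0..m) and b[0..n); both arrays are ascending and have no duplicates.
int get_k_th_minimum(int *a, int m, int *b, int n, int k) {
	assert(m >= 0);
	assert(n >= 0);
	assert(k > 0);
	assert(k <= (m + n));
	// split k - 1 between the two arrays in proportion to their sizes
	int i = (int)((double)m / (m + n) * (k - 1));
	int j = (k - 1) - i;
	assert(i >= 0);
	assert(j >= 0);
	assert(i <= m);
	assert(j <= n);
	// invariant: i + j = k - 1
	// a[-1] and b[-1] count as -infinity, a[m] and b[n] as +infinity
	int Ai_1 = ((i == 0) ? INT_MIN : a[i-1]);
	int Bj_1 = ((j == 0) ? INT_MIN : b[j-1]);
	int Ai   = ((i == m) ? INT_MAX : a[i]);
	int Bj   = ((j == n) ? INT_MAX : b[j]);
	if(Bj_1 < Ai && Ai < Bj) {
		flag = 0;
		return Ai;
	} else if(Ai_1 < Bj && Bj < Ai) {
		flag = 1;
		return Bj;
	}
	assert((Ai > Bj && Ai_1 > Bj) || (Ai < Bj && Ai < Bj_1));
	if(Ai < Bj) {
		// discard Ai and its lower portion, Bj and its upper portion
		return get_k_th_minimum(a + i + 1, m - i - 1, b, j, k - i - 1);
	} else {
		// discard Bj and its lower portion, Ai and its upper portion
		return get_k_th_minimum(a, i, b + j + 1, n - j - 1, k - j - 1);
	}
}


int main() {
	int i = 0;
	int j = 0;
	int a[] = {1, 3, 5, 7, 9};
	int b[] = {2, 4, 6, 8, 10};
	int len_a = sizeof(a) / sizeof(int);
	int len_b = sizeof(b) / sizeof(int);
	int k = 6;
	int result = get_k_th_minimum(a, len_a, b, len_b, k);
	// print the k smallest elements in O(k) time: first those from the array
	// that holds the result, then those from the other array, then the result
	if(flag == 0) {
		while(a[i] != result) {
			cout << a[i] << " ";
			i++;
		}
		for(j = 0; j < k - i - 1; j++) {
			cout << b[j] << " ";
		}
		cout << result << endl;
	} else {
		while(b[i] != result) {
			cout << b[i] << " ";
			i++;
		}
		for(j = 0; j < k - i - 1; j++) {
			cout << a[j] << " ";
		}
		cout << result << endl;
	}
	return 0;
}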
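As noted earlier, the recursion is a tail call, so it can be unrolled into a loop. The following is one possible sketch of that conversion, not the original author's code: the name `kth_smallest_iterative` and the `std::vector` interface are assumptions, the split index is computed with integer arithmetic instead of `double`, and INT_MIN / INT_MAX are assumed not to occur in the input.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical iterative form of the search; k is 1-based, and both
// arrays are assumed ascending with no duplicates.
int kth_smallest_iterative(const std::vector<int>& A, const std::vector<int>& B, int k) {
    const int* a = A.data();
    const int* b = B.data();
    int m = static_cast<int>(A.size());
    int n = static_cast<int>(B.size());
    assert(k >= 1 && k <= m + n);
    while (true) {
        // Weighted split that keeps the invariant i + j == k - 1.
        int i = static_cast<int>(static_cast<long long>(m) * (k - 1) / (m + n));
        int j = (k - 1) - i;
        int Ai_1 = (i == 0) ? INT_MIN : a[i - 1];
        int Bj_1 = (j == 0) ? INT_MIN : b[j - 1];
        int Ai = (i == m) ? INT_MAX : a[i];
        int Bj = (j == n) ? INT_MAX : b[j];
        if (Bj_1 < Ai && Ai < Bj) return Ai;
        if (Ai_1 < Bj && Bj < Ai) return Bj;
        if (Ai < Bj) {
            // Drop a[0..i] and b[j..]; i + 1 smaller elements are gone.
            a += i + 1; m -= i + 1; n = j; k -= i + 1;
        } else {
            // Drop b[0..j] and a[i..]; j + 1 smaller elements are gone.
            b += j + 1; n -= j + 1; m = i; k -= j + 1;
        }
    }
}
```

Each pass shrinks m + n, so the loop terminates, and the loop form makes the O(1) space bound explicit.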


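To list the k-th quantiles of an n-element set in $O(n \lg k)$ time, we can use a divide-and-conquer strategy. The basic idea is to find the middle quantile recursively, splitting the problem into ever smaller subproblems. Below is a Python implementation of this algorithm:

```python
import random


def kth_quantiles(arr, k):
    n = len(arr)
    if k <= 1 or n == 0:
        return []
    if k == 2:
        median = select(arr, n // 2)
        return [median]
    # 0-based ranks of the quantiles to find
    positions = [i * n // k for i in range(1, k)]
    return find_quantiles(arr, positions)


def find_quantiles(arr, positions):
    if not positions:
        return []
    mid_pos = len(positions) // 2
    mid_index = positions[mid_pos]
    # find the element of rank mid_index
    mid_value = select(arr, mid_index)
    # partition the array around it
    left = [x for x in arr if x < mid_value]
    right = [x for x in arr if x > mid_value]
    equal = len(arr) - len(left) - len(right)  # copies of mid_value
    # recurse on both sides; ranks that land on a copy of mid_value are
    # answered directly, so duplicates do not shift the ranks on the right
    left_positions = [pos for pos in positions if pos < len(left)]
    right_positions = [pos - len(left) - equal
                       for pos in positions if pos >= len(left) + equal]
    hits = len(positions) - len(left_positions) - len(right_positions)
    left_quantiles = find_quantiles(left, left_positions)
    right_quantiles = find_quantiles(right, right_positions)
    return left_quantiles + [mid_value] * hits + right_quantiles


def select(arr, k):
    """Return the element of 0-based rank k (randomized quickselect)."""
    if len(arr) == 1:
        return arr[0]
    # pick a random pivot
    pivot = random.choice(arr)
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    if k < len(left):
        return select(left, k)
    elif k < len(left) + len(middle):
        return pivot
    else:
        return select(right, k - len(left) - len(middle))


# test code
arr = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
k = 4
result = kth_quantiles(arr, k)
print(result)
```

### Code explanation:

1. **`kth_quantiles`**: the main function, which takes the input array `arr` and the number of parts `k`. If `k` is 1 it returns an empty list; if `k` is 2 it finds the median directly. Otherwise it computes the ranks of the quantiles to find and calls `find_quantiles`.
2. **`find_quantiles`**: the recursive function that finds the quantiles at the given ranks. It first finds the quantile at the middle rank, then partitions the array into a left and a right part and handles the two subproblems recursively.
3. **`select`**: finds the element of 0-based rank `k` in the array using a randomly chosen pivot; its average time complexity is $O(n)$.

### Complexity analysis:

- **Time complexity**: each level of recursion halves the number of quantiles still to be found, so the recursion depth is $O(\lg k)$. Each level takes $O(n)$ expected time in total to select and partition, so the total expected time complexity is $O(n \lg k)$.
- **Space complexity**: the recursion stack of `find_quantiles` is $O(\lg k)$ deep; the lists copied while partitioning take $O(n)$ additional space.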