Given two sorted arrays nums1 and nums2 of sizes m and n, find the median of the two sorted arrays. The run time complexity must be O(log(m + n)).
You may assume that nums1 and nums2 are not both empty.
Example 1:
nums1 = [1, 3], nums2 = [2]. The median is 2.0.
Example 2:
nums1 = [1, 2], nums2 = [3, 4]. The median is (2 + 3)/2 = 2.5.
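Approach: this post is a translation of the top-ranked answer on the discuss board.
To solve this problem, we need to understand what a median is. In statistics, the median divides a set into two subsets of equal length such that every element of one subset is greater than or equal to every element of the other.
Below, A stands for nums1 and B for nums2. First, cut A at an arbitrary position i into two parts: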
left_A | right_A
A[0], A[1], ..., A[i-1] | A[i], A[i+1], ..., A[m-1]
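A has m elements, so there are m+1 ways to cut it (i = 0 ~ m), and we know that len(left_A) = i and len(right_A) = m - i. When i = 0, left_A is empty; when i = m, right_A is empty.
Similarly, cut B at an arbitrary position j into two parts: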
left_B | right_B
B[0], B[1], ..., B[j-1] | B[j], B[j+1], ..., B[n-1]
Now put left_A and left_B into one set and right_A and right_B into another (A is cut at position i, B at position j):
left_part | right_part
A[0], A[1], ..., A[i-1] | A[i], A[i+1], ..., A[m-1]
B[0], B[1], ..., B[j-1] | B[j], B[j+1], ..., B[n-1]
Suppose we can ensure that:
1) len(left_part) == len(right_part)
2) max(left_part) <= min(right_part)
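Then all elements of A and B are divided into two parts of equal length, and no element of the left part is greater than any element of the right part. In that case the median is: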
median = (max(left_part) + min(right_part))/2.
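The two conditions can be checked mechanically for any candidate pair of cuts. The following is a minimal sketch, not part of the original answer: `isValidPartition` is a hypothetical helper name, and it assumes both arrays are sorted in ascending order and that m + n is even, so the two sides can match in length exactly.

```cpp
#include <vector>

// Hypothetical helper: returns true when cutting A at i and B at j
// satisfies both conditions. Assumes A and B are sorted ascending,
// 0 <= i <= A.size() and 0 <= j <= B.size().
bool isValidPartition(const std::vector<int>& A, const std::vector<int>& B,
                      int i, int j) {
    const int m = static_cast<int>(A.size());
    const int n = static_cast<int>(B.size());
    // Condition 1: len(left_part) == len(right_part).
    if (i + j != (m - i) + (n - j)) return false;
    // Condition 2: max(left_part) <= min(right_part).
    // A[i-1] <= A[i] and B[j-1] <= B[j] already hold because the arrays
    // are sorted, so only the two cross comparisons need checking.
    if (i > 0 && j < n && A[i - 1] > B[j]) return false;
    if (j > 0 && i < m && B[j - 1] > A[i]) return false;
    return true;
}
```

For A = [1, 2] and B = [3, 4], the only pair of cuts that passes is i = 2, j = 0, which gives left_part = {1, 2}, right_part = {3, 4} and a median of (2 + 3)/2 = 2.5.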
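To ensure these two conditions, we only need the following constraints:
(1) i + j == m - i + n - j (or: m - i + n - j + 1 when m + n is odd)
if n >= m, we just need to set: i = 0 ~ m, j = (m + n + 1)/2 - i
(n >= m guarantees that j is never negative)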
(2) B[j-1] <= A[i] and A[i-1] <= B[j]
(We set the edge cases aside for now.)
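Constraint (1) means that j is fully determined by i. The sketch below illustrates this with integer division; `partnerCut` is a hypothetical name introduced here, and the helper assumes m <= n and 0 <= i <= m.

```cpp
// Hypothetical helper: the cut position in B implied by a cut at i in A.
// With integer division, (m + n + 1) / 2 is the size of the left part:
// exactly half when m + n is even, one more than the right part when odd.
// Assumes m <= n and 0 <= i <= m, which keeps the result within [0, n].
int partnerCut(int m, int n, int i) {
    return (m + n + 1) / 2 - i;
}
```

For example, with m = 2 and n = 3 the cuts i = 0, 1, 2 map to j = 3, 2, 1. If the longer array were searched instead (say m = 3, n = 1, i = 3), the formula would give j = -1, which is why the search is run over the shorter array.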
The binary search then proceeds as follows:
<1> Set imin = 0, imax = m, then start searching in [imin, imax]
<2> Set i = (imin + imax)/2, j = (m + n + 1)/2 - i
<3> Now we have len(left_part)==len(right_part). And there are only 3 situations
that we may encounter:
<a> B[j-1] <= A[i] and A[i-1] <= B[j]
Means we have found the target `i`, so stop searching.
<b> B[j-1] > A[i]
Means A[i] is too small. We must `adjust` i to get `B[j-1] <= A[i]`.
Can we `increase` i?
Yes. Because when i is increased, j will be decreased.
So B[j-1] is decreased and A[i] is increased, and `B[j-1] <= A[i]` may
be satisfied.
Can we `decrease` i?
`No!` Because when i is decreased, j will be increased.
So B[j-1] is increased and A[i] is decreased, and B[j-1] <= A[i] will
never be satisfied.
So we must `increase` i. That is, we must adjust the searching range to
[i+1, imax]. So, set imin = i+1, and goto <2>.
<c> A[i-1] > B[j]
Means A[i-1] is too big. And we must `decrease` i to get `A[i-1]<=B[j]`.
That is, we must adjust the searching range to [imin, i-1].
So, set imax = i-1, and goto <2>.
Once the target i has been found, the median is:
max(A[i-1], B[j-1]) (when m + n is odd)
or (max(A[i-1], B[j-1]) + min(A[i], B[j]))/2 (when m + n is even)
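Now let's handle the edge cases. When i = 0, i = m, j = 0 or j = n, the elements A[i-1], A[i], B[j-1] or B[j] may not exist, and the comparison that involves a missing element can simply be skipped: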
<a> (j == 0 or i == m or B[j-1] <= A[i]) and
(i == 0 or j == n or A[i-1] <= B[j])
Means i is perfect, we can stop searching.
<b> j > 0 and i < m and B[j - 1] > A[i]
Means i is too small, we must increase it.
<c> i > 0 and j < n and A[i - 1] > B[j]
Means i is too big, we must decrease it.
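The three boundary-aware cases can be expressed as a small classifier that tells the search which way to move. This is only an illustrative sketch: `Step` and `classifyCut` are hypothetical names introduced here, and the function assumes A is the shorter array (m <= n), both arrays are sorted ascending, and 0 <= i <= m.

```cpp
#include <vector>

// Hypothetical result type: what the binary search should do next.
enum class Step { Found, IncreaseI, DecreaseI };

// Hypothetical helper: classifies a candidate cut i in A into case <a>,
// <b> or <c>. Assumes A.size() <= B.size() and 0 <= i <= A.size().
Step classifyCut(const std::vector<int>& A, const std::vector<int>& B, int i) {
    const int m = static_cast<int>(A.size());
    const int n = static_cast<int>(B.size());
    const int j = (m + n + 1) / 2 - i;  // cut in B implied by i
    // <b> i is too small: only compared when B[j-1] and A[i] both exist.
    if (j > 0 && i < m && B[j - 1] > A[i]) return Step::IncreaseI;
    // <c> i is too big: only compared when A[i-1] and B[j] both exist.
    if (i > 0 && j < n && A[i - 1] > B[j]) return Step::DecreaseI;
    // <a> neither violation applies, so i is a valid cut.
    return Step::Found;
}
```

With A = [1, 2] and B = [3, 4], the cut i = 1 is classified as too small (B[0] = 3 > A[1] = 2), and i = 2 is accepted because A[i] does not exist and the first comparison is skipped.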
The reference code is as follows:
#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        // Search over the shorter array so that j = half - i stays in [0, n].
        // Swapping the arguments avoids copying, which would cost O(m + n).
        if (nums1.size() > nums2.size())
            return findMedianSortedArrays(nums2, nums1);
        int m = nums1.size(), n = nums2.size();
        int lmin = 0, lmax = m, half = (m + n + 1) / 2;
        while (lmin <= lmax) {
            int i = lmin + (lmax - lmin) / 2;  // cut position in nums1
            int j = half - i;                  // matching cut position in nums2
            if (i > 0 && j < n && nums1[i - 1] > nums2[j]) {
                lmax = i - 1;  // i is too big
            }
            else if (j > 0 && i < m && nums2[j - 1] > nums1[i]) {
                lmin = i + 1;  // i is too small
            }
            else {
                // i is perfect: take the largest element of the left part.
                int max_of_left;
                if (i == 0)
                    max_of_left = nums2[j - 1];
                else if (j == 0)
                    max_of_left = nums1[i - 1];
                else
                    max_of_left = max(nums1[i - 1], nums2[j - 1]);
                if ((m + n) % 2 == 1)
                    return max_of_left;
                // Even total: also need the smallest element of the right part.
                int min_of_right;
                if (i == m)
                    min_of_right = nums2[j];
                else if (j == n)
                    min_of_right = nums1[i];
                else
                    min_of_right = min(nums2[j], nums1[i]);
                return (max_of_left + min_of_right) / 2.0;
            }
        }
        return 0.0;  // not reached when the inputs are sorted and not both empty
    }
};
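One way to sanity-check a binary-search solution like this is to compare its results against a straightforward merge-based baseline. The sketch below is such a baseline, not the method described above: `medianByMerge` is a hypothetical name, it runs in O(m + n) time and space, and it assumes both inputs are sorted ascending and not both empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical reference implementation for testing only: merges the two
// sorted arrays and reads the median from the middle of the result.
// Assumes a and b are sorted ascending and not both empty.
double medianByMerge(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(merged));
    const std::size_t total = merged.size();
    if (total % 2 == 1)
        return merged[total / 2];  // odd: the single middle element
    // even: average of the two middle elements, computed in double
    return (static_cast<double>(merged[total / 2 - 1]) + merged[total / 2]) / 2.0;
}
```

On the two examples from the problem statement this baseline returns 2.0 for [1, 3] and [2], and 2.5 for [1, 2] and [3, 4], so it can serve as an oracle when testing the logarithmic version on random inputs.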