Binary Search Comprehension

Binary search is a popular topic in coding assessments. However, it comes in many variations, which can be really confusing. Today, I want to clarify several of these variations and the principles behind them.

Closed interval vs Left closed right open interval

P.S. The code can be verified on LeetCode.

There are mainly two styles of implementing binary search: the closed interval, i.e. [le, ri], and the left-closed, right-open interval, i.e. [le, ri).

For the closed interval, a classic Python implementation of binary search is:

def search(self, nums, target):
    le = 0
    ri = len(nums)-1
    mid = 0
    while le <= ri:
        mid = (le+ri)//2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            le = mid+1
        else:
            ri = mid-1
    return -1

There are 3 possible cases regarding the relationship between the middle value and the target. If the middle value equals the target, we’ve found it and return its index. If the middle value is less than the target, the target must be in the right half of the current interval. Since the interval is closed and the middle value should no longer be included in it, we set le to mid+1 to exclude the middle value and keep only the right part. If the middle value is larger than the target, the process is symmetric, and we set ri to mid-1.
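
To make the three cases concrete, here is a small sketch that records (le, ri, mid) at every iteration of the closed-interval search. The helper name search_with_trace and the sample list are my own assumptions for illustration, not part of the LeetCode solution.

```python
def search_with_trace(nums, target):
    """Closed-interval binary search that also returns the visited (le, ri, mid) triples."""
    steps = []
    le, ri = 0, len(nums) - 1
    while le <= ri:
        mid = (le + ri) // 2
        steps.append((le, ri, mid))
        if nums[mid] == target:
            return mid, steps
        if nums[mid] < target:
            le = mid + 1      # target can only be to the right of mid
        else:
            ri = mid - 1      # target can only be to the left of mid
    return -1, steps


index, steps = search_with_trace([1, 3, 5, 7, 9, 11], 9)
print(index)   # 4
print(steps)   # [(0, 5, 2), (3, 5, 4)]
```

The first step sees nums[2] = 5 < 9 and moves le to 3; the second step lands exactly on the target.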

How about the left-closed, right-open version? Technically, this version just replaces ri with ri+1, so we modify the code according to the new value.
For ri = len(nums)-1, it should be rewritten to ri = len(nums). (obvious)
For le <= ri, it should be rewritten to le < ri. (obvious)
For ri = mid-1, it should be rewritten to ri = mid, because now the right endpoint itself is excluded from the interval.

Now, the left closed right open interval binary search works properly.

def search(self, nums, target):
    le = 0
    ri = len(nums)
    mid = 0
    while le < ri:
        mid = (le+ri)//2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            le = mid+1
        else:
            ri = mid
    return -1
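
If you do not want to rely on LeetCode alone, a quick randomized cross-check is possible. The sketch below restates both listings as plain functions (the names search_closed, search_half_open and check_agreement are assumptions of mine) and compares them with list.index on sorted lists of distinct values.

```python
import random


def search_closed(nums, target):
    """Closed interval [le, ri], as in the first listing."""
    le, ri = 0, len(nums) - 1
    while le <= ri:
        mid = (le + ri) // 2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            le = mid + 1
        else:
            ri = mid - 1
    return -1


def search_half_open(nums, target):
    """Left-closed, right-open interval [le, ri), as in the second listing."""
    le, ri = 0, len(nums)
    while le < ri:
        mid = (le + ri) // 2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            le = mid + 1
        else:
            ri = mid
    return -1


def check_agreement(trials=200, seed=0):
    """Return True if both versions match list.index on random sorted lists."""
    rng = random.Random(seed)
    for _ in range(trials):
        size = rng.randint(0, 12)
        nums = sorted(rng.sample(range(30), size))  # distinct values only
        for target in range(-1, 31):
            expected = nums.index(target) if target in nums else -1
            if search_closed(nums, target) != expected:
                return False
            if search_half_open(nums, target) != expected:
                return False
    return True


print(check_agreement())  # True
```

Distinct values are used on purpose: with duplicates, an exact-match search may return any of the matching indices.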

Is something missing? Yes! We did not rewrite mid = (le+ri)//2 as mid = (le+ri-1)//2 (though the search also works with that change; you can try it on LeetCode). That’s because both (le+ri)//2 and (le+ri-1)//2 lie (approximately) in the middle of the interval [le, ri). When le+ri is odd, they are the same; when le+ri is even, (le+ri)//2 = (le+ri-1)//2 + 1. For example, when le=2 and ri=4, (le+ri)//2=3 and (le+ri-1)//2=2. Both fall into the interval [le, ri).
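
The claim about the two midpoints can be checked by brute force over small intervals. This is only a sanity check under my own naming (mids_in_half_open is a hypothetical helper), not a proof.

```python
def mids_in_half_open(le, ri):
    """Return both candidate midpoints for a non-empty interval [le, ri)."""
    return (le + ri) // 2, (le + ri - 1) // 2


for le in range(0, 6):
    for ri in range(le + 1, 8):          # le < ri, so [le, ri) is non-empty
        hi_mid, lo_mid = mids_in_half_open(le, ri)
        # both candidates stay inside [le, ri)
        assert le <= lo_mid <= hi_mid < ri
        # they differ by one when le+ri is even and coincide when it is odd
        assert hi_mid - lo_mid == (1 if (le + ri) % 2 == 0 else 0)

print(mids_in_half_open(2, 4))  # (3, 2)
```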

Floor vs Ceiling

After discussing the replacement of ri with ri+1, here comes another question. When computing mid, both versions above use mid=(le+ri)//2, which means we floor the result. Can we take the ceiling instead? Of course! The code below also works:

class Solution(object):
    def search(self, nums, target):
        le = 0
        ri = len(nums)-1
        mid = 0
        while le <= ri:
            mid = (le+ri+1)//2
            if nums[mid] == target:
                return mid
            if nums[mid] < target:
                le = mid+1
            else:
                ri = mid-1
        return -1

Whether mid is floored or ceiled, it still falls into the interval [le, ri]. However, for the interval [le, ri), the ceiling is not applicable, because the ceiled mid can be equal to ri (this happens when ri = le+1), and ri is not in the interval.
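
A small enumeration shows where the ceiling goes wrong. In this sketch, ceil_mid is a hypothetical helper of mine; it lists every small half-open interval whose ceiled midpoint escapes the interval.

```python
def ceil_mid(le, ri):
    """Midpoint of le and ri, rounded up."""
    return (le + ri + 1) // 2


# Closed interval [le, ri]: the ceiled midpoint is always inside.
for le in range(6):
    for ri in range(le, 8):
        assert le <= ceil_mid(le, ri) <= ri

# Half-open interval [le, ri): collect the cases where it lands on ri or beyond.
escapes = [
    (le, ri)
    for le in range(6)
    for ri in range(le + 1, 8)
    if ceil_mid(le, ri) >= ri
]
print(escapes)  # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
```

Every escaping case is a one-element interval [le, le+1). There, mid equals ri, so nums[mid] reads an element outside the interval, and when ri == len(nums) it raises an IndexError.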

The key principle is: mid has to fall inside the interval, and the endpoints of the interval must move so as to exclude the last mid.

Therefore, comparing the two styles, we can view the left-closed, right-open binary search as the ceiling version of the closed-interval one: its right endpoint is ri+1 in closed-interval terms, so its mid = (le+(ri+1))//2 is exactly the ceiling of (le+ri)/2.

So, in the rest of this post, we will only discuss binary search on a closed interval.

Lower Bound & Upper Bound

All the code above can only find the position of an exact value. But what if we want to find the first number that is greater than or equal to x (lower bound), or the first number that is greater than x (upper bound)?

Take lower bound as an example. To figure it out, let’s think step by step.

  1. If nums[mid] < target, then nums[mid] and every number before it are too small and will be discarded, so we set le = mid+1.
  2. If nums[mid] >= target, then mid is a potential answer, so we save it. After that, mid and every number after it are discarded, so we set ri = mid-1 and look for an earlier candidate.

def lower_bound(nums, target):
    le = 0
    ri = len(nums)-1
    pos = -1
    while le <= ri:
        mid = (le+ri)//2
        if nums[mid] < target:
            le = mid+1
        else:
            pos = mid
            ri = mid-1
    return pos

For the upper bound, all we need to do is change the condition from nums[mid] < target to nums[mid] <= target.

def upper_bound(nums, target):
    le = 0
    ri = len(nums)-1
    pos = -1
    while le <= ri:
        mid = (le+ri)//2
        if nums[mid] <= target:
            le = mid+1
        else:
            pos = mid
            ri = mid-1
    return pos
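
As a sanity check, both functions can be compared with bisect_left and bisect_right from Python's standard bisect module. The only difference is the "not found" convention: bisect returns len(nums), while the versions above return -1. The helper to_article_convention and the sample list are assumptions I added for the comparison.

```python
from bisect import bisect_left, bisect_right


def lower_bound(nums, target):
    """First index with nums[i] >= target, or -1 (same logic as the listing above)."""
    le, ri, pos = 0, len(nums) - 1, -1
    while le <= ri:
        mid = (le + ri) // 2
        if nums[mid] < target:
            le = mid + 1
        else:
            pos = mid
            ri = mid - 1
    return pos


def upper_bound(nums, target):
    """First index with nums[i] > target, or -1 (same logic as the listing above)."""
    le, ri, pos = 0, len(nums) - 1, -1
    while le <= ri:
        mid = (le + ri) // 2
        if nums[mid] <= target:
            le = mid + 1
        else:
            pos = mid
            ri = mid - 1
    return pos


def to_article_convention(i, n):
    """Map bisect's 'nothing qualifies' result (n) to the -1 used in this post."""
    return i if i < n else -1


nums = [1, 2, 2, 2, 5, 7]
for target in range(0, 9):
    n = len(nums)
    assert lower_bound(nums, target) == to_article_convention(bisect_left(nums, target), n)
    assert upper_bound(nums, target) == to_article_convention(bisect_right(nums, target), n)

print(lower_bound(nums, 2), upper_bound(nums, 2))  # 1 4
print(lower_bound(nums, 8), upper_bound(nums, 7))  # -1 -1
```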

General binary search

So, let’s look at all these binary searches and find a general structure!

def general_binary_search(nums, target):
    # pseudocode: the parenthesized (pos = mid) is recorded in only one of the two branches
    le = 0
    ri = len(nums)-1
    pos = -1
    while le <= ri:
        mid = (le+ri)//2
        if target seems to be in the right part of the interval (or at mid):
            (pos = mid)
            le = mid+1
        else:  # target seems to be in the left part of the interval (or at mid)
            (pos = mid)
            ri = mid-1
    return pos

The condition captures the monotonicity of the sequence and depends on the specific property we are searching on.

Where we record the answer determines what happens when there are multiple valid values. Recording when moving the left endpoint means we keep searching the right part, trying to find a valid value with the greatest index; recording when moving the right endpoint means we keep searching the left part, trying to find a valid value with the least index.
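
One way to turn this structure into runnable code is to pass the condition in as a function. The sketch below is my own assumption of how that could look (last_true, first_true and pred are hypothetical names); it only works when the condition is monotone over the list, as noted in the docstrings.

```python
def last_true(nums, pred):
    """Greatest index i with pred(nums[i]) true, or -1.

    Assumes pred is true on a prefix of nums and false afterwards.
    """
    le, ri, pos = 0, len(nums) - 1, -1
    while le <= ri:
        mid = (le + ri) // 2
        if pred(nums[mid]):
            pos = mid          # record while moving the left endpoint
            le = mid + 1
        else:
            ri = mid - 1
    return pos


def first_true(nums, pred):
    """Least index i with pred(nums[i]) true, or -1.

    Assumes pred is false on a prefix of nums and true afterwards.
    """
    le, ri, pos = 0, len(nums) - 1, -1
    while le <= ri:
        mid = (le + ri) // 2
        if pred(nums[mid]):
            pos = mid          # record while moving the right endpoint
            ri = mid - 1
        else:
            le = mid + 1
    return pos


nums = [1, 2, 2, 2, 5, 7]
print(first_true(nums, lambda x: x >= 2))  # 1
print(last_true(nums, lambda x: x <= 2))   # 3
```

With x >= target as the condition, first_true behaves like the lower bound; with x > target, it behaves like the upper bound.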

Now, guess what the code below does:

def bs3(nums, target):
    le = 0
    ri = len(nums)-1
    pos = -1
    while le <= ri:
        mid = (le+ri)//2
        if nums[mid] <= target:
            pos = mid
            le = mid+1
        else:
            ri = mid-1
    return pos

Because we record the answer under the condition nums[mid] <= target and move the left endpoint afterwards, the function returns the greatest index whose number is less than or equal to target (or -1 if there is none).

What if we change nums[mid] <= target into nums[mid] < target? Then we get the greatest index whose number is strictly less than target.

def bs4(nums, target):
    le = 0
    ri = len(nums)-1
    pos = -1
    while le <= ri:
        mid = (le+ri)//2
        if nums[mid] < target:
            pos = mid
            le = mid+1
        else:
            ri = mid-1
    return pos
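
To double-check both answers, bs3 and bs4 can be compared with the standard bisect module: the greatest index with a value <= target is bisect_right - 1, and the greatest index with a value < target is bisect_left - 1. The sample list is an assumption of mine; the two functions repeat the listings above in compact form.

```python
from bisect import bisect_left, bisect_right


def bs3(nums, target):
    """Greatest index with nums[i] <= target, or -1."""
    le, ri, pos = 0, len(nums) - 1, -1
    while le <= ri:
        mid = (le + ri) // 2
        if nums[mid] <= target:
            pos = mid
            le = mid + 1
        else:
            ri = mid - 1
    return pos


def bs4(nums, target):
    """Greatest index with nums[i] < target, or -1."""
    le, ri, pos = 0, len(nums) - 1, -1
    while le <= ri:
        mid = (le + ri) // 2
        if nums[mid] < target:
            pos = mid
            le = mid + 1
        else:
            ri = mid - 1
    return pos


nums = [1, 2, 2, 2, 5, 7]
for target in range(0, 9):
    assert bs3(nums, target) == bisect_right(nums, target) - 1
    assert bs4(nums, target) == bisect_left(nums, target) - 1

print(bs3(nums, 2), bs4(nums, 2))  # 3 0
print(bs3(nums, 0), bs4(nums, 1))  # -1 -1
```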

END

ShadyPi
