Python standard library binary search: bisect

煲饭酱

Article link: https://blog.youkuaiyun.com/weixin_40040404/article/details/115057306

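The bisect module:

bisect(a, x, lo=0, hi=len(a)):
Returns the index at which x should be inserted into the sorted list a. If x is already in the list, the index just to the right of the rightmost x is returned.
bisect_right(a, x, lo=0, hi=len(a)): same as bisect.
bisect_left(a, x, lo=0, hi=len(a)):
Returns the index at which x should be inserted into the sorted list a. If x is already in the list, the index of the leftmost x is returned.
insort(a, x, lo=0, hi=len(a)):
Returns nothing; inserts x directly into a, keeping it sorted. If x is already present, the new item goes to the right of the existing ones.
insort_right(a, x, lo=0, hi=len(a)): same as insort.
insort_left(a, x, lo=0, hi=len(a)):
Returns nothing; inserts x directly into a, keeping it sorted. If x is already present, the new item goes to the left of the existing ones.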
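The difference between the left and right variants only shows up when the list contains duplicates. A minimal sketch, with a made-up list named scores used purely for illustration:

```python
import bisect

scores = [10, 20, 20, 20, 30]

left = bisect.bisect_left(scores, 20)    # index of the leftmost 20
right = bisect.bisect_right(scores, 20)  # index just past the rightmost 20
missing = bisect.bisect(scores, 25)      # 25 is absent, so left and right variants agree

print(left, right, missing)  # 1 4 4

bisect.insort(scores, 25)    # inserts in place and returns None
print(scores)                # [10, 20, 20, 20, 25, 30]
```

For a value that is not in the list, bisect_left and bisect_right return the same index, so the choice between them matters only for ties.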

Source code

"""Bisection algorithms."""

def insort_right(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.

    If x is already in a, insert it to the right of the rightmost x.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]: hi = mid
        else: lo = mid+1
    a.insert(lo, x)

insort = insort_right   # backward compatibility

def bisect_right(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e <= x, and all e in
    a[i:] have e > x.  So if x already appears in the list, a.insert(i, x) will
    insert just after the rightmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]: hi = mid
        else: lo = mid+1
    return lo

bisect = bisect_right   # backward compatibility

def insort_left(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.

    If x is already in a, insert it to the left of the leftmost x.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    a.insert(lo, x)


def bisect_left(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e < x, and all e in
    a[i:] have e >= x.  So if x already appears in the list, a.insert(i, x) will
    insert just before the leftmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo

# Overwrite above definitions with a fast C implementation
try:
    from _bisect import *
except ImportError:
    pass
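
The optional lo and hi arguments described in the docstrings above restrict the search to a window of the list, and the lo check raises ValueError for negative values. A small sketch of both behaviours, using an illustrative list named data:

```python
import bisect

data = [1, 3, 5, 7, 9, 11]

# Search only data[2:5] == [5, 7, 9]; the result is still an index into data.
pos = bisect.bisect_left(data, 7, lo=2, hi=5)
print(pos)  # 3

# A value smaller than everything in the window lands at lo, not at 0.
clamped = bisect.bisect_left(data, 0, lo=2, hi=5)
print(clamped)  # 2

# A negative lo is rejected instead of being treated as a Python-style
# negative index.
try:
    bisect.bisect_left(data, 7, lo=-1)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```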

Examples

1. bisect - binary search

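Given a list, we want to find a target element and return its index in the list.

The first thing that comes to mind is the list's index method. Build an ascending list of length 10000 and write a search function that looks up every element with index: the average running time is 437 ms.
Now use bisect_left from the bisect module, which is the familiar binary search. The equivalent fast_search function averages 3.94 ms, roughly 110 times faster. Binary search relies on the list being sorted, whereas index scans linearly and needs no ordering. Note that the script below uses a list of length 20000, so the timings in its output differ from the figures quoted here.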

# Look up the index of every element in the list
import bisect
import time

def search(nums):
    # list.index scans linearly: O(n) per lookup
    for x in nums:
        nums.index(x)

def fast_search(nums):
    # bisect_left is a binary search: O(log n) per lookup, nums must be sorted
    for x in nums:
        bisect.bisect_left(nums, x)

arr = list(range(20000))

t1 = time.time() * 1000
search(arr)
print('TIME CONSUMING FOR FUNC- [search]: {}ms'.format(time.time() * 1000 - t1))
t1 = time.time() * 1000
fast_search(arr)
print('TIME CONSUMING FOR FUNC- [fast_search]: {}ms'.format(time.time() * 1000 - t1))
# TIME CONSUMING FOR FUNC- [search]: 2698.0048828125ms
# TIME CONSUMING FOR FUNC- [fast_search]: 8.979248046875ms
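
One caveat with replacing index by bisect_left: bisect_left returns an insertion point, so it gives a valid-looking index even when the element is absent. A minimal sketch of a drop-in lookup that keeps the ValueError behaviour of list.index, assuming the input is sorted (the name sorted_index is hypothetical, not part of the bisect module):

```python
from bisect import bisect_left

def sorted_index(a, x):
    """Return the index of the leftmost x in sorted list a, or raise ValueError."""
    i = bisect_left(a, x)
    # The insertion point only counts as a hit if x is actually stored there.
    if i != len(a) and a[i] == x:
        return i
    raise ValueError('{!r} is not in list'.format(x))

nums = [2, 4, 4, 8]
print(sorted_index(nums, 4))  # 1

try:
    sorted_index(nums, 5)
except ValueError as exc:
    print('not found:', exc)
```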

Length of the longest increasing subsequence

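# Binary search approach, O(n log n)
from bisect import bisect_left

def lengthOfLIS(nums):
    if not nums: return 0
    # dp[k] is the smallest tail of any increasing subsequence of length k + 1
    dp = []
    for item in nums:
        pos = bisect_left(dp, item)
        # Replaces dp[pos], or appends when pos == len(dp)
        dp[pos:pos + 1] = [item]
    return len(dp)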

# Conventional DP approach, O(n^2)
def lengthOfLIS(nums):
    if not nums:
        return 0
    n = len(nums)
    dp = [1] * n
    for i in range(n):
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] = max(dp[j] + 1, dp[i])
    res = max(dp)
    return res
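
To see why the binary search version works, it helps to watch the helper list evolve. The sketch below is a tracing variant written for illustration (the names lis_length_trace and tails are assumptions, not from the original); it prints the list after each element:

```python
from bisect import bisect_left

def lis_length_trace(nums, verbose=True):
    # tails[k] = smallest possible tail of an increasing subsequence of length k + 1
    tails = []
    for item in nums:
        pos = bisect_left(tails, item)
        if pos == len(tails):
            tails.append(item)   # item extends the longest subsequence found so far
        else:
            tails[pos] = item    # item is a smaller tail for length pos + 1
        if verbose:
            print(item, tails)
    return len(tails)

length = lis_length_trace([10, 9, 2, 5, 3, 7, 101, 18])
print(length)  # 4
```

Only the length of tails is meaningful: for an input such as [3, 4, 1] the list ends as [1, 4], which is not a subsequence of the input. Using bisect_left gives a strictly increasing subsequence; swapping in bisect_right would allow equal neighbours.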

2. Efficient counting of list elements

from collections import Counter
def fast_count(nums):
    return Counter(nums)
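
As a brief usage sketch (the words list is made up for illustration), the Counter returned by fast_count behaves like a dictionary of counts, with a few extras:

```python
from collections import Counter

words = ["a", "b", "a", "c", "b", "a"]
counts = Counter(words)

print(counts["a"])            # 3
print(counts["z"])            # 0, missing keys count as zero instead of raising KeyError
print(counts.most_common(2))  # [('a', 3), ('b', 2)]
```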

3. Top-n of a list

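Given a list, return its 3 smallest elements.

Create a list of length 10000 and shuffle its elements. The top_3 function sorts the list and returns the first 3 elements; the average running time is 2.03 ms.
Using the heapq module, i.e. the familiar heap, the fast_top_3 function averages 296 microseconds, about 6.8 times faster. Note that the script below uses a list of length 100000, so the timings in its output differ from the figures quoted here.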

import heapq
import time
from random import shuffle

def top_3(nums):
    # Sorts the whole list just to keep 3 elements
    return sorted(nums)[:3]

def fast_top_3(nums):
    # Only tracks the 3 smallest elements while scanning once
    return heapq.nsmallest(3, nums)

nums = list(range(100000))
shuffle(nums)

t1 = time.time() * 1000
top_3(nums)
print('TIME CONSUMING FOR FUNC- [top_3]: {}ms'.format(time.time() * 1000 - t1))
t1 = time.time() * 1000
fast_top_3(nums)
print('TIME CONSUMING FOR FUNC- [fast_top_3]: {}ms'.format(time.time() * 1000 - t1))
# TIME CONSUMING FOR FUNC- [top_3]: 39.939208984375ms
# TIME CONSUMING FOR FUNC- [fast_top_3]: 3.94384765625ms
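
The same idea carries over to the largest elements and to records ranked by a field. A short sketch, assuming a made-up list of (name, score) tuples called records:

```python
import heapq

records = [("ann", 82), ("bob", 95), ("cat", 67), ("dan", 90)]

# key works like the key argument of sorted(): rank records by score.
top2 = heapq.nlargest(2, records, key=lambda r: r[1])
low2 = heapq.nsmallest(2, records, key=lambda r: r[1])

print(top2)  # [('bob', 95), ('dan', 90)]
print(low2)  # [('cat', 67), ('ann', 82)]

# Both functions return results in sorted order, like the sliced sort they replace.
values = [5, 1, 4, 2, 3]
print(heapq.nsmallest(3, values) == sorted(values)[:3])  # True
```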

4. itemgetter - fetching multiple items at once

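Given a dictionary and a list containing one or more of the dictionary's keys, return the corresponding values.

Create a dictionary with 100000 entries and randomly sample its keys to form a list of length 10000. The get_items function averages 1.12 ms.
Reading the same items in bulk with itemgetter, the fast_get_items function averages 836 microseconds, about 1.3 times the original performance.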

import time
from operator import itemgetter
from random import choices

def get_items(data, keys):
    # One dictionary lookup per key, returns a list
    return [data[x] for x in keys]

def fast_get_items(data, keys):
    # A single itemgetter call fetches all keys, returns a tuple
    return itemgetter(*keys)(data)

data = dict(enumerate(range(100000)))
keys = choices(list(data.keys()), k=10000)

t1 = time.time() * 1000
get_items(data, keys)
print('TIME CONSUMING FOR FUN- [get_items]: {}ms'.format(time.time() * 1000 - t1))
t1 = time.time() * 1000
fast_get_items(data, keys)
print('TIME CONSUMING FOR FUN- [fast_get_items]: {}ms'.format(time.time() * 1000 - t1))
# TIME CONSUMING FOR FUN- [get_items]: 2.03466796875ms
# TIME CONSUMING FOR FUN- [fast_get_items]: 0.0ms
# (0.0ms means the elapsed time was too short for time.time() to resolve in this run)
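
The two functions are not exact drop-in replacements for each other, because the return shape of itemgetter depends on how many keys it receives. A minimal sketch of the differences, with a wrapper named get_many that is hypothetical and only meant to illustrate one way of smoothing them over:

```python
from operator import itemgetter

data = {"a": 1, "b": 2, "c": 3}

print(itemgetter("a", "c")(data))  # (1, 3), several keys give a tuple
print(itemgetter("a")(data))       # 1, a single key gives the bare value

def get_many(data, keys):
    """Always return a tuple of values, whatever the number of keys."""
    if not keys:
        return ()                    # itemgetter() with no arguments raises TypeError
    if len(keys) == 1:
        return (data[keys[0]],)      # wrap the single value for a uniform shape
    return itemgetter(*keys)(data)

print(get_many(data, ["b"]))       # (2,)
print(get_many(data, []))          # ()
```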

[Partially reposted from: 别再说Python慢(标准库篇) ("Stop Saying Python Is Slow (Standard Library Edition)")]