Binary search in the Python standard library: bisect

The bisect module:

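  • bisect(a, x[, lo[, hi]]):
    Returns the index at which x should be inserted into the sorted list a. If x is already in the list, returns the index just to the right of the rightmost equal entry.
  • bisect_right(a, x[, lo[, hi]]): same as above (bisect is an alias of bisect_right)
  • bisect_left(a, x[, lo[, hi]])
    Returns the index at which x should be inserted into the sorted list a. If x is already in the list, returns the index of the leftmost equal entry.
  • insort(a, x[, lo[, hi]])
    Returns no index and inserts x directly. If equal items already exist, x is inserted to their right.
  • insort_right(a, x[, lo[, hi]]): same as above (insort is an alias of insort_right)
  • insort_left(a, x[, lo[, hi]])
    Returns no index and inserts x directly. If equal items already exist, x is inserted to their left.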
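
The left/right distinction in the list above is easiest to see on a list with duplicates. The following is a small sketch; the sample lists and variable names are made up for illustration. Since 2.0 == 2 in Python, inserting a float makes it visible where the new item lands relative to the equal items already present.

```python
from bisect import bisect, bisect_left, bisect_right, insort_left, insort_right

a = [1, 2, 2, 2, 3]

left = bisect_left(a, 2)    # index of the leftmost 2
right = bisect_right(a, 2)  # index just past the rightmost 2
print(left, right, bisect(a, 2))  # 1 4 4
print(right - left)               # 3, the number of 2s in the list

# 2.0 compares equal to 2, so it shows on which side of the
# existing equal item each insort variant places the new one.
after_right = [1, 2, 3]
insort_right(after_right, 2.0)
after_left = [1, 2, 3]
insort_left(after_left, 2.0)
print(after_right)  # [1, 2, 2.0, 3]
print(after_left)   # [1, 2.0, 2, 3]
```

A side effect worth noting: bisect_right minus bisect_left gives the number of occurrences of a value in a sorted list without scanning it.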

Source code

"""Bisection algorithms."""

def insort_right(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.

    If x is already in a, insert it to the right of the rightmost x.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]: hi = mid
        else: lo = mid+1
    a.insert(lo, x)

insort = insort_right   # backward compatibility

def bisect_right(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e <= x, and all e in
    a[i:] have e > x.  So if x already appears in the list, a.insert(i, x) will
    insert just after the rightmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]: hi = mid
        else: lo = mid+1
    return lo

bisect = bisect_right   # backward compatibility

def insort_left(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.

    If x is already in a, insert it to the left of the leftmost x.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    a.insert(lo, x)


def bisect_left(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e < x, and all e in
    a[i:] have e >= x.  So if x already appears in the list, a.insert(i, x) will
    insert just before the leftmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """

    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo

# Overwrite above definitions with a fast C implementation
try:
    from _bisect import *
except ImportError:
    pass
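
The two loops in the source above differ only in the comparison (x < a[mid] versus a[mid] < x), and that is what decides which side of a run of equal items the result falls on. As a rough sanity check, the sketch below re-types both loops under the hypothetical names py_bisect_left and py_bisect_right and compares them against the module on random data. The seed and sizes are arbitrary choices.

```python
import bisect
import random

def py_bisect_left(a, x):
    """Same loop as bisect_left above: stops before the leftmost x."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo

def py_bisect_right(a, x):
    """Same loop as bisect_right above: stops after the rightmost x."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

random.seed(0)
data = sorted(random.randrange(50) for _ in range(200))  # many duplicates

mismatches = 0
for x in range(-1, 51):  # includes values below, inside and above the data
    if py_bisect_left(data, x) != bisect.bisect_left(data, x):
        mismatches += 1
    if py_bisect_right(data, x) != bisect.bisect_right(data, x):
        mismatches += 1
print(mismatches)  # 0
```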

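Examples

1. bisect - binary search

Given a list, we want to look up a target element and return its index in the list.

The first thing that comes to mind is the list's index method. Build an ascending list of length 10,000 and write a search function that looks up every element with index: the average running time is 437 ms. index scans the list linearly, so each lookup is O(n).
bisect_left from the bisect module is the familiar binary search. It needs only O(log n) comparisons per lookup but requires the list to be sorted. A fast_search function written with it averages 3.94 ms, roughly 110 times faster. (These figures are for 10,000 elements; the script below uses 20,000, so its recorded output differs.)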

# Look up the index of every element in the list
import bisect
import time

def search(nums):
    for x in nums:
        nums.index(x)

def fast_search(nums):
    for x in nums:
        bisect.bisect_left(nums, x)

arr = list(range(20000))

t1 = time.time() * 1000
search(arr)
print('TIME CONSUMING FOR FUNC- [search]: {}ms'.format(time.time() * 1000 - t1))
t1 = time.time() * 1000
fast_search(arr)
print('TIME CONSUMING FOR FUNC- [fast_search]: {}ms'.format(time.time() * 1000 - t1))
# TIME CONSUMING FOR FUNC- [search]: 2698.0048828125ms
# TIME CONSUMING FOR FUNC- [fast_search]: 8.979248046875ms
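
One caveat about the benchmark above: bisect_left returns an insertion point, not a "found" position, so it returns a valid-looking index even when the element is absent. A drop-in replacement for index on a sorted list therefore needs one extra check. The helper name index_sorted below is hypothetical, and returning -1 for a miss is an arbitrary convention chosen for this sketch (list.index raises ValueError instead).

```python
from bisect import bisect_left

def index_sorted(a, x):
    """Return the index of the leftmost x in sorted list a, or -1 if absent."""
    i = bisect_left(a, x)
    # i may equal len(a), or point at an element greater than x
    if i < len(a) and a[i] == x:
        return i
    return -1

nums = [1, 3, 3, 7, 10]
print(index_sorted(nums, 3))   # 1
print(index_sorted(nums, 4))   # -1
print(index_sorted(nums, 11))  # -1
```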
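  • Length of the longest increasing subsequence
    # Binary search approach, O(n log n)
    from bisect import bisect_left

    def lengthOfLIS(nums):
        if not nums: return 0
        # dp[k] is the smallest possible tail of an increasing subsequence of length k+1
        dp = []
        for item in nums:
            pos = bisect_left(dp, item)
            # Overwrites dp[pos] if pos < len(dp), appends if pos == len(dp)
            dp[pos:pos + 1] = [item]
        return len(dp)
    
    # Conventional DP approach, O(n^2)
    def lengthOfLIS(nums):
        if not nums:
            return 0
        n = len(nums)
        # dp[i] is the length of the longest increasing subsequence ending at nums[i]
        dp = [1] * n
        for i in range(n):
            for j in range(i):
                if nums[i] > nums[j]:
                    dp[i] = max(dp[j] + 1, dp[i])
        res = max(dp)
        return res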
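
To make the binary search approach above more concrete, here is a sketch that spells out the append-or-overwrite step and adds a switch between strictly increasing and non-decreasing subsequences. The function name lis_length and the strict parameter are assumptions made for this example, not part of the original solutions. The only difference between the two modes is bisect_left versus bisect_right.

```python
from bisect import bisect_left, bisect_right

def lis_length(nums, strict=True):
    """Length of the longest increasing (or non-decreasing) subsequence."""
    # bisect_left makes an equal item overwrite its twin (strict mode);
    # bisect_right lets an equal item extend the subsequence instead.
    find = bisect_left if strict else bisect_right
    tails = []  # tails[k]: smallest tail of a subsequence of length k + 1
    for item in nums:
        pos = find(tails, item)
        if pos == len(tails):
            tails.append(item)   # item extends the longest subsequence so far
        else:
            tails[pos] = item    # item is a smaller tail for length pos + 1
    return len(tails)

# tails evolves as [10], [9], [2], [2, 5], [2, 3], [2, 3, 7],
# [2, 3, 7, 101], [2, 3, 7, 18]
print(lis_length([10, 9, 2, 5, 3, 7, 101, 18]))  # 4
print(lis_length([2, 2, 2]))                     # 1
print(lis_length([2, 2, 2], strict=False))       # 3
```

Note that tails is not itself a longest increasing subsequence in general; only its length is meaningful.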
    
2. Efficient counting of list elements
from collections import Counter
def fast_count(nums):
    return Counter(nums)
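
A short usage sketch for the Counter returned by fast_count above; the sample data is made up. A Counter behaves like a dict that maps each element to its count, returns 0 for missing keys, and can list the most frequent elements.

```python
from collections import Counter

words = ["a", "b", "a", "c", "a", "b"]
counts = Counter(words)

print(counts["a"])            # 3
print(counts["z"])            # 0, missing keys count as zero
print(counts.most_common(2))  # [('a', 3), ('b', 2)]
```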
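3. Top-n of a list

Given a list, return its 3 smallest elements.

Create a list of length 10,000 and shuffle it. A top_3 function that sorts the list and returns the first 3 elements averages 2.03 ms.
Using the heapq module, i.e. the familiar heap, a fast_top_3 function averages 296 microseconds, about 6.8 times faster. (These figures are for 10,000 elements; the script below uses 100,000, so its recorded output differs.)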

import heapq
import time
from random import shuffle

def top_3(nums):
    return sorted(nums)[:3]

def fast_top_3(nums):
    return heapq.nsmallest(3, nums)

nums = list(range(100000))
shuffle(nums)

t1 = time.time() * 1000
top_3(nums)
print('TIME CONSUMING FOR FUNC- [top_3]: {}ms'.format(time.time() * 1000 - t1))
t1 = time.time() * 1000
fast_top_3(nums)
print('TIME CONSUMING FOR FUNC- [fast_top_3]: {}ms'.format(time.time() * 1000 - t1))
# TIME CONSUMING FOR FUNC- [top_3]: 39.939208984375ms
# TIME CONSUMING FOR FUNC- [fast_top_3]: 3.94384765625ms
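
Beyond the benchmark above, heapq also offers nlargest, and both functions accept a key argument. The sketch below uses made-up sample data. Both return their results already sorted, smallest first for nsmallest and largest first for nlargest.

```python
import heapq

nums = [5, 1, 9, 3, 7, 2]
print(heapq.nsmallest(3, nums))  # [1, 2, 3]
print(heapq.nlargest(2, nums))   # [9, 7]

# key works like the key argument of sorted()
records = [("alice", 31), ("bob", 25), ("carol", 47), ("dave", 19)]
youngest = heapq.nsmallest(2, records, key=lambda r: r[1])
print(youngest)  # [('dave', 19), ('bob', 25)]
```

These functions pay off when n is small relative to the list. As n approaches the length of the list, sorted(nums)[:n] is usually the better choice.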
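4. itemgetter - fetching items in bulk

Given a dictionary and a list containing one or more of the dictionary's keys, return the corresponding values.

Create a dictionary with 100,000 entries and randomly sample 10,000 of its keys (with replacement) to form a list. A get_items function averages 1.12 ms.
Reading the same elements in bulk with itemgetter in a fast_get_items function averages 836 microseconds, about 1.3 times as fast. Note that itemgetter returns a tuple rather than a list when given more than one key.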

import time
from operator import itemgetter
from random import choices

def get_items(data, keys):
    return [data[x] for x in keys]

def fast_get_items(data, keys):
    return itemgetter(*keys)(data)

data = dict(enumerate(range(100000)))
keys = choices(list(data.keys()), k=10000)

t1 = time.time() * 1000
get_items(data, keys)
print('TIME CONSUMING FOR FUN- [get_items]: {}ms'.format(time.time() * 1000 - t1))
t1 = time.time() * 1000
fast_get_items(data, keys)
print('TIME CONSUMING FOR FUN- [fast_get_items]: {}ms'.format(time.time() * 1000 - t1))
# TIME CONSUMING FOR FUN- [get_items]: 2.03466796875ms
# TIME CONSUMING FOR FUN- [fast_get_items]: 0.0ms
# (0.0ms most likely means the call finished within the resolution of time.time())
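
Two details of the itemgetter example above are easy to miss: the shape of the return value depends on the number of keys, and a single time.time() measurement is too coarse for calls this short. The sketch below illustrates both, using timeit from the standard library for the timing. The data, the repeat count of 50 and the choice to build the getter once outside the timed call are assumptions of this sketch, so the numbers are not comparable to those quoted above and no timing output is shown.

```python
import timeit
from operator import itemgetter

small = {"a": 1, "b": 2, "c": 3}
print(itemgetter("a", "c")(small))  # (1, 3), a tuple for several keys
print(itemgetter("a")(small))       # 1, a bare value for a single key

big = dict(enumerate(range(100000)))
keys = list(range(0, 100000, 10))   # 10,000 keys
getter = itemgetter(*keys)          # built once, reused for every call

# Both approaches return the same values, as a list and a tuple respectively
same = list(getter(big)) == [big[k] for k in keys]
print(same)  # True

runs = 50
t_loop = timeit.timeit(lambda: [big[k] for k in keys], number=runs)
t_getter = timeit.timeit(lambda: getter(big), number=runs)
print(f"list comprehension: {t_loop * 1000 / runs:.3f} ms per call")
print(f"itemgetter:         {t_getter * 1000 / runs:.3f} ms per call")
```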

[Partly reposted from: 别再说Python慢(标准库篇) ("Stop Saying Python Is Slow: Standard Library Edition")]
