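The bisect module:
- bisect(list, item[, lo[, hi]]):
  Returns the index at which item should be inserted to keep the list sorted. If item is already in the list, returns the index just to the right of the rightmost matching entry.
- bisect_right(list, item[, lo[, hi]]): same as above.
- bisect_left(list, item[, lo[, hi]]):
  Returns the index at which item should be inserted to keep the list sorted. If item is already in the list, returns the index of the leftmost matching entry.
- insort(list, item[, lo[, hi]]):
  Returns no index; inserts item directly. If the list already contains equal items, the new one is inserted to their right.
- insort_right(list, item[, lo[, hi]]): same as above.
- insort_left(list, item[, lo[, hi]]):
  Returns no index; inserts item directly. If the list already contains equal items, the new one is inserted to their left.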
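The left/right distinction described above is easiest to see on a list with duplicates. The following is a minimal sketch; the sample list `scores` is made up for illustration.

```python
import bisect

scores = [10, 20, 20, 20, 30]  # the list must already be sorted

# Insertion points for a value that is already present.
left = bisect.bisect_left(scores, 20)    # index of the first 20
right = bisect.bisect_right(scores, 20)  # index just past the last 20
print(left, right)  # 1 4

# bisect() is an alias for bisect_right().
assert bisect.bisect(scores, 20) == right

# For a value that is absent, both functions return the same index.
print(bisect.bisect_left(scores, 25), bisect.bisect_right(scores, 25))  # 4 4

# The insort_* functions modify the list in place and return None.
bisect.insort_left(scores, 5)
bisect.insort_right(scores, 35)
print(scores)  # [5, 10, 20, 20, 20, 30, 35]
```

As a side effect, `right - left` gives the number of occurrences of the value in a sorted list.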
Source code
"""Bisection algorithms."""

def insort_right(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.

    If x is already in a, insert it to the right of the rightmost x.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]: hi = mid
        else: lo = mid+1
    a.insert(lo, x)

insort = insort_right   # backward compatibility

def bisect_right(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e <= x, and all e in
    a[i:] have e > x.  So if x already appears in the list, a.insert(i, x) will
    insert just after the rightmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if x < a[mid]: hi = mid
        else: lo = mid+1
    return lo

bisect = bisect_right   # backward compatibility

def insort_left(a, x, lo=0, hi=None):
    """Insert item x in list a, and keep it sorted assuming a is sorted.

    If x is already in a, insert it to the left of the leftmost x.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    a.insert(lo, x)

def bisect_left(a, x, lo=0, hi=None):
    """Return the index where to insert item x in list a, assuming a is sorted.

    The return value i is such that all e in a[:i] have e < x, and all e in
    a[i:] have e >= x.  So if x already appears in the list, a.insert(i, x) will
    insert just before the leftmost x already there.

    Optional args lo (default 0) and hi (default len(a)) bound the
    slice of a to be searched.
    """
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo+hi)//2
        if a[mid] < x: lo = mid+1
        else: hi = mid
    return lo

# Overwrite above definitions with a fast C implementation
try:
    from _bisect import *
except ImportError:
    pass
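
The docstrings above mention the optional lo and hi arguments without showing them in use. The sketch below, on a made-up list `data`, illustrates that the search is limited to a[lo:hi] while the returned index still refers to the whole list, and that a negative lo raises ValueError as in the source.

```python
import bisect

data = [1, 3, 5, 7, 9, 11]

# Search only within data[2:5] == [5, 7, 9]. The returned index is
# relative to the whole list, not to the slice.
idx = bisect.bisect_left(data, 7, lo=2, hi=5)
print(idx)  # 3

# A value smaller than everything in the window lands at lo.
low_idx = bisect.bisect_left(data, 0, lo=2, hi=5)
print(low_idx)  # 2

# A negative lo is rejected with ValueError.
error_message = None
try:
    bisect.bisect_left(data, 7, lo=-1)
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```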
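Examples
1. bisect - binary search
Given a list and a target element, we want to find the target and return its index in the list.
The first thing that comes to mind is the list's index method. Build an ascending list of length 10,000 and write a search function that looks up every element with index; the average running time is 437 ms.
Now use bisect_left from the bisect module, which is the familiar binary search. Writing a fast_search function this way gives an average running time of 3.94 ms, roughly 110 times faster. The gain comes from replacing a linear scan per lookup with an O(log n) search, and it relies on the list being sorted. Note that the script below uses a list of 20,000 elements, so its sample output differs from the figures quoted here.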
# Look up the index of every element in the list
import bisect
import time

def search(nums):
    # Linear scan: list.index is O(n) per lookup
    for x in nums:
        nums.index(x)

def fast_search(nums):
    # Binary search: O(log n) per lookup, requires nums to be sorted
    for x in nums:
        bisect.bisect_left(nums, x)

arr = list(range(20000))

t1 = time.time() * 1000
search(arr)
print('TIME CONSUMING FOR FUNC- [search]: {}ms'.format(time.time() * 1000 - t1))

t1 = time.time() * 1000
fast_search(arr)
print('TIME CONSUMING FOR FUNC- [fast_search]: {}ms'.format(time.time() * 1000 - t1))

# TIME CONSUMING FOR FUNC- [search]: 2698.0048828125ms
# TIME CONSUMING FOR FUNC- [fast_search]: 8.979248046875ms
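
One caveat with the benchmark above: bisect_left returns an insertion point, so it yields an index even when the target is absent, whereas list.index raises ValueError. A minimal sketch of an exact lookup on a sorted list follows; the helper name `index_sorted` and the choice of -1 as the "not found" value are assumptions made for this example.

```python
import bisect

def index_sorted(nums, x):
    """Return the index of the leftmost x in the sorted list nums, or -1."""
    i = bisect.bisect_left(nums, x)
    # bisect_left only gives the insertion point, so confirm the match.
    if i < len(nums) and nums[i] == x:
        return i
    return -1

arr = list(range(0, 20, 2))  # [0, 2, 4, ..., 18]
print(index_sorted(arr, 8))    # 4
print(index_sorted(arr, 9))    # -1 (would be inserted at index 5)
print(index_sorted(arr, 100))  # -1 (insertion point is past the end)
```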
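- Length of the longest increasing subsequence
# Binary-search approach, O(n log n)
from bisect import bisect_left

def lengthOfLIS(nums):
    if not nums:
        return 0
    # dp[k] holds the smallest tail of any increasing subsequence of length k + 1
    dp = []
    for item in nums:
        pos = bisect_left(dp, item)
        # Replaces dp[pos] if it exists; appends when pos == len(dp)
        dp[pos:pos + 1] = [item]
    return len(dp)

# Conventional DP approach, O(n^2)
def lengthOfLIS_dp(nums):
    if not nums:
        return 0
    n = len(nums)
    # dp[i] is the length of the longest increasing subsequence ending at nums[i]
    dp = [1] * n
    for i in range(n):
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] = max(dp[j] + 1, dp[i])
    return max(dp)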
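
To make the binary-search approach easier to follow, the sketch below records the helper list after every element. The function name `lis_trace`, the variable `tails`, and the input sequence are chosen for this illustration only. Each entry tails[k] is the smallest tail seen so far for an increasing subsequence of length k + 1, so the list stays sorted and bisect_left can be applied to it.

```python
from bisect import bisect_left

def lis_trace(nums):
    """Return the LIS length and the state of `tails` after each element."""
    tails = []
    history = []
    for item in nums:
        pos = bisect_left(tails, item)
        if pos == len(tails):
            tails.append(item)   # item extends the longest subsequence so far
        else:
            tails[pos] = item    # item is a smaller tail for length pos + 1
        history.append(list(tails))
    return len(tails), history

length, history = lis_trace([10, 9, 2, 5, 3, 7, 101, 18])
print(length)  # 4
for state in history:
    print(state)
# [10]
# [9]
# [2]
# [2, 5]
# [2, 3]
# [2, 3, 7]
# [2, 3, 7, 101]
# [2, 3, 7, 18]
```

The final `tails` list has the right length but is not guaranteed to be an actual subsequence of the input; only its length is meaningful.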
2. Efficient counting of list elements
from collections import Counter

def fast_count(nums):
    # Counts every distinct element in a single pass
    return Counter(nums)
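
A short usage sketch for Counter, using a made-up list `nums`. It also compares the result with a dictionary built from list.count, which rescans the whole list once per distinct value.

```python
from collections import Counter

nums = [3, 1, 3, 2, 3, 1]

counts = Counter(nums)          # single pass over the list
print(counts[3])                # 3
print(counts[42])               # 0 (missing keys count as zero)
print(counts.most_common(2))    # [(3, 3), (1, 2)]

# Same result via list.count, at the cost of one full scan per distinct value.
slow_counts = {x: nums.count(x) for x in set(nums)}
assert slow_counts == dict(counts)
```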
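3. Top-n elements of a list
Given a list, return its 3 smallest elements.
Create a list of length 10,000 and shuffle its elements randomly. Write a top_3 function that sorts the list and returns the first 3 elements; the average running time is 2.03 ms.
Using the heapq module, which is the familiar heap, write a fast_top_3 function. The average running time is 296 microseconds, about 6.8 times faster. Note that the script below uses a list of 100,000 elements, so its sample output differs from the figures quoted here.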
import heapq
import time
from random import shuffle

def top_3(nums):
    # Sorts the entire list, then keeps only the first 3 elements
    return sorted(nums)[:3]

def fast_top_3(nums):
    # Selects the 3 smallest elements without sorting the whole list
    return heapq.nsmallest(3, nums)

nums = list(range(100000))
shuffle(nums)

t1 = time.time() * 1000
top_3(nums)
print('TIME CONSUMING FOR FUNC- [top_3]: {}ms'.format(time.time() * 1000 - t1))

t1 = time.time() * 1000
fast_top_3(nums)
print('TIME CONSUMING FOR FUNC- [fast_top_3]: {}ms'.format(time.time() * 1000 - t1))

# TIME CONSUMING FOR FUNC- [top_3]: 39.939208984375ms
# TIME CONSUMING FOR FUNC- [fast_top_3]: 3.94384765625ms
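
heapq.nsmallest and its counterpart heapq.nlargest also accept a key function, which is handy when the elements are records rather than plain numbers. The sketch below uses hypothetical records with an invented `latency_ms` field.

```python
import heapq

# Hypothetical sample records, invented for this illustration.
records = [
    {"name": "a", "latency_ms": 42},
    {"name": "b", "latency_ms": 7},
    {"name": "c", "latency_ms": 19},
    {"name": "d", "latency_ms": 88},
    {"name": "e", "latency_ms": 3},
]

fastest = heapq.nsmallest(2, records, key=lambda r: r["latency_ms"])
slowest = heapq.nlargest(2, records, key=lambda r: r["latency_ms"])

print([r["name"] for r in fastest])  # ['e', 'b']
print([r["name"] for r in slowest])  # ['d', 'a']

# Same result as sorting everything and slicing.
assert fastest == sorted(records, key=lambda r: r["latency_ms"])[:2]
```

The heap-based functions pay off when n is small relative to the list; when n approaches the list length, sorting and slicing is usually the better choice.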
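4. itemgetter - fetching elements in bulk
Given a dictionary and a list containing one or more of the dictionary's keys, return the corresponding values.
Create a dictionary with 100,000 entries and randomly sample its keys to form a list of length 10,000. Write a get_items function; the average running time is 1.12 ms.
Use itemgetter to read these elements in bulk and write a fast_get_items function. The average running time is 836 microseconds, about 1.3 times the original performance.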
import time
from operator import itemgetter
from random import choices

def get_items(data, keys):
    # One dictionary lookup per key, collected into a list
    return [data[x] for x in keys]

def fast_get_items(data, keys):
    # A single itemgetter call fetches all keys and returns a tuple
    return itemgetter(*keys)(data)

data = dict(enumerate(range(100000)))
keys = choices(list(data.keys()), k=10000)

t1 = time.time() * 1000
get_items(data, keys)
print('TIME CONSUMING FOR FUN- [get_items]: {}ms'.format(time.time() * 1000 - t1))

t1 = time.time() * 1000
fast_get_items(data, keys)
print('TIME CONSUMING FOR FUN- [fast_get_items]: {}ms'.format(time.time() * 1000 - t1))

# TIME CONSUMING FOR FUN- [get_items]: 2.03466796875ms
# TIME CONSUMING FOR FUN- [fast_get_items]: 0.0ms
# (a reading of 0.0ms means the run was shorter than the timer could resolve)
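
The two functions above are not drop-in replacements for each other: the list comprehension always returns a list, while itemgetter returns a tuple for several keys, a bare value for a single key, and cannot be built with no keys at all. The sketch below shows the difference; the wrapper `get_items_tuple` is a hypothetical helper written for this example, not part of the operator module.

```python
from operator import itemgetter

data = {"a": 1, "b": 2, "c": 3}

many = itemgetter("a", "c")(data)
print(many)  # (1, 3): a tuple, not a list

one = itemgetter("b")(data)
print(one)   # 2: a bare value, not a 1-tuple

def get_items_tuple(data, keys):
    """Hypothetical wrapper that returns a tuple for any number of keys."""
    if not keys:
        return ()                    # itemgetter() with no arguments raises TypeError
    if len(keys) == 1:
        return (data[keys[0]],)      # normalize the single-key case
    return itemgetter(*keys)(data)

print(get_items_tuple(data, []))          # ()
print(get_items_tuple(data, ["b"]))       # (2,)
print(get_items_tuple(data, ["a", "b"]))  # (1, 2)
```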
[Partially reposted from: 别再说Python慢(标准库篇) (Stop Saying Python Is Slow: Standard Library Edition)]