I wrote a compiler cache for MSVC (much like ccache for gcc). One of the things it has to do is remove the oldest object files from the cache directory to trim the cache to a user-defined size.
Right now, I basically have a list of tuples, each holding a file's last access time and its size:
# First tuple element is the access time, second tuple element is file size
items = [ (1, 42341),
          (3, 22),
          (0, 3234),
          (2, 42342),
          (4, 123) ]
Now I'd like to do a partial sort on this list so that only the first N elements are sorted, where N is the smallest number of the oldest elements whose sizes sum to more than 45000. The result should look basically like this:
# Partially sorted list; only first two elements are sorted because the sum of
# their second field is larger than 45000.
items = [ (0, 3234),
          (1, 42341),
          (3, 22),
          (2, 42342),
          (4, 123) ]
I don't really care about the order of the unsorted entries; I just need the N oldest items in the list whose cumulative size exceeds a certain value.
Solution
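You could use the heapq module. Call heapify() on the list, followed by heappop() until your condition is met. heapify() is linear and each heappop() is logarithmic, so extracting the k oldest entries costs O(n + k log n) rather than the O(n log n) of a full sort, which is likely as fast as you can get. Note that heappop() removes each entry from the list, so afterwards items holds only the entries that were not popped.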
import heapq

heapq.heapify(items)  # O(n); tuples compare by access time first
size = 0
while items and size < 45000:
    item = heapq.heappop(items)  # removes and returns the oldest entry
    size += item[1]
    print(item)
Output:
(0, 3234)
(1, 42341)
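If the original list should survive the operation, the same idea can be wrapped in a small helper that works on a copy. This is only a sketch: the function name oldest_until and its limit parameter are made up for illustration and are not part of heapq.

```python
import heapq

def oldest_until(items, limit):
    """Return the oldest (access_time, size) tuples, in order of age,
    stopping as soon as their sizes sum to at least `limit`."""
    heap = list(items)       # shallow copy, so the caller's list is untouched
    heapq.heapify(heap)      # O(n); tuples compare by access time first
    picked = []
    total = 0
    while heap and total < limit:
        entry = heapq.heappop(heap)   # O(log n) per pop, oldest entry first
        picked.append(entry)
        total += entry[1]
    return picked

items = [(1, 42341), (3, 22), (0, 3234), (2, 42342), (4, 123)]
print(oldest_until(items, 45000))   # [(0, 3234), (1, 42341)]
```

The copy costs an extra O(n) in time and memory, which does not change the overall O(n + k log n) bound. If the limit is larger than the total size of all files, the helper simply returns every entry, sorted by access time.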