剑指Offer Programming Problem 63: Median of a Data Stream (Heap)

Latest recommended article published on 2021-09-12 15:42:53

励志学好数据结构

License: CC 4.0 BY-SA

Category: 剑指offer series

Article link: https://blog.youkuaiyun.com/mengmengdajuanjuan/article/details/82690324

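This article describes an efficient algorithm for computing the median of a data stream in real time. With a max-heap and a min-heap, the median stays available as data keeps arriving: insertion takes O(log n) time and reading the median takes O(1).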

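Problem: How do you find the median of a data stream? If an odd number of values has been read from the stream, the median is the middle value after sorting all of them. If an even number of values has been read, the median is the average of the two middle values after sorting. The Insert() method reads a value from the stream, and the GetMedian() method returns the median of the values read so far.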
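
To make the definition concrete, here is a minimal reference sketch that computes the median by sorting everything read so far. The name `median_of` is hypothetical and not part of the original problem; sorting costs O(n log n) per query, which is the cost the heap-based approach avoids.

```python
def median_of(values):
    """Return the median of a non-empty list by sorting a copy of it."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: the average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2.0


print(median_of([5, 2, 3]))     # 3
print(median_of([5, 2, 3, 4]))  # 3.5
```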

牛客网 (Nowcoder): link

Reference: click here

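Approach: If every value in the left half of a container is no greater than every value in the right half, the median can be read from the largest value on the left and the smallest value on the right, even when neither half is sorted internally. A max-heap gives the largest value of the left half immediately, because it sits at the top of the heap; a min-heap does the same for the smallest value of the right half. The data must be split evenly between the two heaps, so their sizes may differ by at most 1. When the two heaps hold an even number of values in total, the new number goes into the max-heap, which is then re-heapified. If the top of the max-heap is now greater than the top of the min-heap, the two tops are swapped and both heaps are adjusted. The total is now odd, and the top of the max-heap is the median. When the two heaps hold an odd number of values in total, the new number goes into the min-heap, which is re-heapified, followed by the same swap check. The total is now even, and the median is the average of the two tops. With standard heap operations, insertion takes O(log n) time and reading the median takes O(1).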
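
The two-heap idea can also be sketched with Python's standard `heapq` module. This is an illustrative variant, not the author's code: the class name `MedianFinder` and its method names are hypothetical, the max-heap is simulated by storing negated values (since `heapq` only provides a min-heap), and instead of swapping the two tops it routes each new number through the opposite heap with `heapq.heappushpop`, which keeps the same invariant.

```python
import heapq


class MedianFinder:
    def __init__(self):
        self.lower = []  # smaller half, as a max-heap of negated values
        self.upper = []  # larger half, as a min-heap

    def insert(self, num):
        if len(self.lower) == len(self.upper):
            # Even total: the lower half must grow by one.
            # Pass num through upper so the smallest candidate moves down.
            moved = heapq.heappushpop(self.upper, num)
            heapq.heappush(self.lower, -moved)
        else:
            # Odd total: the upper half must grow by one.
            # Pass num through lower so the largest candidate moves up.
            moved = -heapq.heappushpop(self.lower, -num)
            heapq.heappush(self.upper, moved)

    def get_median(self):
        if not self.lower:
            raise ValueError("no values have been inserted")
        if len(self.lower) > len(self.upper):
            return -self.lower[0]
        return (-self.lower[0] + self.upper[0]) / 2.0


finder = MedianFinder()
for value in [5, 2, 3, 4]:
    finder.insert(value)
print(finder.get_median())  # 3.5
```

Each `insert` performs a constant number of heap operations, so it runs in O(log n), and `get_median` only inspects the heap tops.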

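When an element is appended to the end of the list, this code restores the heap by running the sift-down over every internal node, from the last parent up to the root, because a sift-down routine cannot move a new leaf upward. That pass costs O(n) per insertion; reaching the O(log n) bound requires sifting the new leaf up instead. When only the first element of the list has changed, a single sift-down from the root is enough.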
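
As a sketch of the O(log n) alternative for appended elements, a sift-up compares the new leaf only with its ancestors. The helper names `sift_up_max` and `push_max` are hypothetical and do not appear in the original code; a min-heap version would flip the comparison.

```python
def sift_up_max(heap, child):
    """Move heap[child] up until its parent is no smaller (max-heap order)."""
    temp = heap[child]
    while child > 0:
        parent = (child - 1) // 2
        if heap[parent] >= temp:
            break
        # Pull the smaller parent down and continue one level up.
        heap[child] = heap[parent]
        child = parent
    heap[child] = temp


def push_max(heap, num):
    """Append num and restore the max-heap with a single sift-up."""
    heap.append(num)
    sift_up_max(heap, len(heap) - 1)


heap = []
for x in [5, 2, 3, 9, 1]:
    push_max(heap, x)
print(heap[0])  # 9
```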

# -*- coding:utf-8 -*-
class Solution:
    def __init__(self):
        self.count = 0
        self.max_heap = []
        self.min_heap = []

    def Head_max_adjust(self, input_list, parent, length):
        # Sift input_list[parent] down until the max-heap property holds below it.
        temp = input_list[parent]
        child = 2 * parent + 1
        while child < length:
            if child + 1 < length and input_list[child+1] > input_list[child]:
                child += 1
            if temp >= input_list[child]:
                break
            input_list[parent] = input_list[child]
            parent = child
            child = 2 * parent + 1
        input_list[parent] = temp

    def Head_min_adjust(self, input_list, parent, length):
        # Sift input_list[parent] down until the min-heap property holds below it.
        temp = input_list[parent]
        child = 2 * parent + 1
        while child < length:
            if child + 1 < length and input_list[child+1] < input_list[child]:
                child += 1
            if temp <= input_list[child]:
                break
            input_list[parent] = input_list[child]
            parent = child
            child = 2 * parent + 1
        input_list[parent] = temp

    def Insert(self, num):
        # write code here
        self.count += 1
        if self.count & 0x1:
            self.max_heap.append(num)
            length = len(self.max_heap)
            # After an append, re-run the sift-down over every internal node.
            for i in range(length//2)[::-1]:
                self.Head_max_adjust(self.max_heap, i, len(self.max_heap))
        else:
            self.min_heap.append(num)
            length = len(self.min_heap)
            for i in range(length//2)[::-1]:
                self.Head_min_adjust(self.min_heap, i, len(self.min_heap))
        while self.max_heap and self.min_heap and self.max_heap[0] > self.min_heap[0]:
            self.max_heap[0], self.min_heap[0] = self.min_heap[0], self.max_heap[0]
            # Only the roots changed, so one sift-down from each root is enough.
            self.Head_max_adjust(self.max_heap, 0, len(self.max_heap))
            self.Head_min_adjust(self.min_heap, 0, len(self.min_heap))

    def GetMedian(self, nothing=None):
        # `nothing` is unused; the default value lets callers omit it.
        if self.count & 0x1:
            return self.max_heap[0]
        else:
            # Divide by 2.0 so the result is a float.
            return (self.max_heap[0]+self.min_heap[0])/2.0

if __name__ == '__main__':
    a = Solution()
    a.Insert(5)
    a.Insert(2)
    a.Insert(3)
    a.Insert(4)
    a.Insert(1)
    a.Insert(6)
    a.Insert(7)
    a.Insert(0)
    a.Insert(8)  
    print(a.GetMedian())