295. Find Median from Data Stream

Finding the median in a data stream

License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/yrbbest/p/5044819.html

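This post describes how to find the median of a data stream using a max-heap and a min-heap. By maintaining the two heaps, the median of the current data set can be found quickly after each new element is added. A Java implementation is provided, and the time complexity of the approaches is discussed.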

Problem:

The median is the middle value in an ordered integer list. If the size of the list is even, there is no single middle value, so the median is the mean of the two middle values.

Examples:

[2,3,4] , the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5
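The definition above can be written down directly as a small reference method. This is only a sketch to make the definition concrete, not part of the original solution; the class name MedianOfList and the method name median are made up for illustration. It sorts a copy of the input, so it costs O(n log n) per call and is far too slow to call repeatedly on a stream.

```java
import java.util.Arrays;

class MedianOfList {
    // Returns the median of the given values; the input does not need to be sorted.
    static double median(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("median of an empty list is undefined");
        }
        int[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];          // odd size: the single middle value
        }
        // Even size: mean of the two middle values, halved separately to avoid int overflow.
        return sorted[mid - 1] / 2.0 + sorted[mid] / 2.0;
    }
}
```

Calling MedianOfList.median(new int[]{2, 3, 4}) and MedianOfList.median(new int[]{2, 3}) reproduces the two examples above.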

Design a data structure that supports the following two operations:

void addNum(int num) - Add an integer number from the data stream to the data structure.
double findMedian() - Return the median of all elements so far.

For example:

addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2

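Link: http://leetcode.com/problems/find-median-from-data-stream/

Solution:

Find the median in a data stream. This is a classic application of heaps: we maintain a max-heap and a min-heap at the same time, where the max-heap holds the smaller half of the numbers and the min-heap holds the larger half, and we keep their sizes within one of each other. The median is then either the top of the larger heap or the mean of the two tops. This problem usually comes up when studying the heap data structure; it is a classic.

Time Complexity: addNum - O(logn), findMedian - O(1); Space Complexity - O(n)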

import java.util.Comparator;
import java.util.PriorityQueue;

class MedianFinder {
    private PriorityQueue<Integer> maxOrientedHeap;   // smaller half, largest value on top
    private PriorityQueue<Integer> minOrientedHeap;   // larger half, smallest value on top

    public MedianFinder() {
        this.minOrientedHeap = new PriorityQueue<Integer>();
        this.maxOrientedHeap = new PriorityQueue<Integer>(10, new Comparator<Integer>() {
                public int compare(Integer i1, Integer i2) {
                    return i2.compareTo(i1);    // reversed order; avoids the int overflow of i2 - i1
                }
            });
    }

    // Adds a number into the data structure.
    public void addNum(int num) {
        maxOrientedHeap.add(num);                               // O(logn)
        minOrientedHeap.add(maxOrientedHeap.poll());            // O(logn)
        if (maxOrientedHeap.size() < minOrientedHeap.size()) {  // keep the extra element in the max-heap
            maxOrientedHeap.add(minOrientedHeap.poll());        // O(logn)
        }
    }

    // Returns the median of current data stream
    public double findMedian() {                    // O(1)
        if (maxOrientedHeap.size() == minOrientedHeap.size())
            // widen to long so the sum of the two tops cannot overflow
            return ((long) maxOrientedHeap.peek() + minOrientedHeap.peek()) / 2.0;
        else
            return maxOrientedHeap.peek();
    }
}

// Your MedianFinder object will be instantiated and called as such:
// MedianFinder mf = new MedianFinder();
// mf.addNum(1);
// mf.findMedian();

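Second pass:

Again two PriorityQueues are used, a maxPQ and a minPQ. The min-heap stores the larger half of the data and the max-heap stores the smaller half. When the number of values received so far is odd, we return minPQ.peek() directly; otherwise we return (maxPQ.peek() + minPQ.peek()) / 2.0. The running time of this version was not ideal, perhaps because the comparator built from the lambda expression for maxPQ does not get optimized; switching to an ordinary comparator made it at least twice as fast. Collections.reverseOrder() can also be used as the comparator for maxPQ.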
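As a rough sketch of the Collections.reverseOrder() variant mentioned above: the class name ReverseOrderMedianFinder is made up here to keep it apart from the submitted MedianFinder, and no claim is made about how fast it runs on the judge. The balancing logic is the same; only the comparator of maxPQ changes.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class ReverseOrderMedianFinder {
    // Min-heap: larger half of the data, smallest of them on top.
    private final PriorityQueue<Integer> minPQ = new PriorityQueue<>();
    // Max-heap: smaller half of the data, largest of them on top.
    private final PriorityQueue<Integer> maxPQ = new PriorityQueue<>(Collections.reverseOrder());

    // Adds a number into the data structure.
    public void addNum(int num) {
        minPQ.offer(num);
        maxPQ.offer(minPQ.poll());              // pass the smallest of the larger half down
        if (minPQ.size() < maxPQ.size()) {
            minPQ.offer(maxPQ.poll());          // keep the extra element in minPQ
        }
    }

    // Returns the median of the numbers added so far (at least one must have been added).
    public double findMedian() {
        if (minPQ.size() == maxPQ.size()) {
            // Halve each top separately so the sum cannot overflow int.
            return minPQ.peek() / 2.0 + maxPQ.peek() / 2.0;
        }
        return minPQ.peek();
    }
}
```

Collections.reverseOrder() compares through compareTo, so it also sidesteps the overflow that a subtraction-based comparator such as i2 - i1 can hit when the two values have opposite signs.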

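I have seen some people solve this by building a BST instead, which saves a few O(logn) operations and should run faster. I will look into it later.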
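For reference, here is one way a BST-based version might look. This is not the code those solutions use; it is a minimal sketch under my own assumptions: a plain unbalanced BST whose nodes store their subtree size, with the hypothetical names BstMedianFinder, Node and select. addNum does a single descent, and findMedian selects the middle element(s) by rank, so both cost O(h) for tree height h. Without rebalancing, h can degrade to n on sorted input, so whether this beats the two heaps depends on the data.

```java
class BstMedianFinder {
    private static final class Node {
        final int val;
        int size = 1;          // number of nodes in the subtree rooted at this node
        Node left, right;
        Node(int val) { this.val = val; }
    }

    private Node root;

    // Inserts num with one root-to-leaf descent, updating subtree sizes on the way down.
    public void addNum(int num) {
        if (root == null) {
            root = new Node(num);
            return;
        }
        Node cur = root;
        while (true) {
            cur.size++;
            if (num < cur.val) {
                if (cur.left == null) { cur.left = new Node(num); return; }
                cur = cur.left;
            } else {           // duplicates go to the right subtree
                if (cur.right == null) { cur.right = new Node(num); return; }
                cur = cur.right;
            }
        }
    }

    // Returns the k-th smallest value (0-based) using the stored subtree sizes.
    private int select(int k) {
        Node cur = root;
        while (true) {
            int leftSize = (cur.left == null) ? 0 : cur.left.size;
            if (k < leftSize) {
                cur = cur.left;
            } else if (k == leftSize) {
                return cur.val;
            } else {
                k -= leftSize + 1;   // skip the left subtree and the current node
                cur = cur.right;
            }
        }
    }

    // Returns the median of the numbers added so far.
    public double findMedian() {
        if (root == null) {
            throw new IllegalStateException("no numbers added yet");
        }
        int n = root.size;
        if (n % 2 == 1) {
            return select(n / 2);
        }
        // Even count: mean of the two middle values, halved separately to avoid int overflow.
        return select(n / 2 - 1) / 2.0 + select(n / 2) / 2.0;
    }
}
```

A balanced tree (for example a red-black tree with size fields) would bound the height at O(logn), at the cost of noticeably more code.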

Java:

Min and Max Heap

Time Complexity: addNum - O(logn), findMedian - O(1); Space Complexity - O(n)

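import java.util.PriorityQueue;
import java.util.Queue;

class MedianFinder {
    // Min-heap: larger half of the data, smallest of them on top.
    Queue<Integer> minPQ = new PriorityQueue<>();
    // Max-heap: smaller half of the data. Integer.compare avoids the int overflow of i2 - i1.
    Queue<Integer> maxPQ = new PriorityQueue<>(10, (Integer i1, Integer i2) -> Integer.compare(i2, i1));

    // Adds a number into the data structure.
    public void addNum(int num) {
        minPQ.offer(num);
        maxPQ.offer(minPQ.poll());
        // Keep the extra element in minPQ when the count is odd.
        if (minPQ.size() < maxPQ.size()) minPQ.offer(maxPQ.poll());
    }

    // Returns the median of current data stream
    public double findMedian() {
        // Widen to long so the sum of the two tops cannot overflow.
        if (minPQ.size() == maxPQ.size()) return ((long) minPQ.peek() + maxPQ.peek()) / 2.0;
        return minPQ.peek();
    }
}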

// Your MedianFinder object will be instantiated and called as such:
// MedianFinder mf = new MedianFinder();
// mf.addNum(1);
// mf.findMedian();