LeetCode Notes: Find Median from Data Stream

Date: 2016-01-15 00:00:16

I. Problem Description

The median is the middle value in an ordered integer list. If the size of the list is even, there is no single middle value, so the median is the mean of the two middle values.

Examples:

[2,3,4], the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5

Design a data structure that supports the following two operations:

void addNum(int num) - Add an integer from the data stream to the data structure.
double findMedian() - Return the median of all elements so far.

For example:

addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2
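
As a reference point for the definition above, here is a minimal brute-force sketch that sorts a copy of the numbers and reads off the middle. The function name medianBySorting is an assumption made for illustration; it is not part of the problem's interface and is far too slow to call after every insertion on a long stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: computes the median straight from the definition.
// Takes the vector by value so the caller's data is left untouched.
double medianBySorting(std::vector<int> values) {
    assert(!values.empty());  // the median of an empty list is undefined
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    if (n % 2 == 1) return values[n / 2];  // odd size: the single middle value
    // even size: mean of the two middle values (convert first to avoid int overflow)
    return (static_cast<double>(values[n / 2 - 1]) + values[n / 2]) / 2.0;
}
```

Calling it on {2, 3, 4} and {2, 3} reproduces the two examples from the problem statement.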

II. Analysis

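In short, the task is to report the median of the numbers seen so far in a data stream. The median is the middle value of the sorted list, or the mean of the two middle values when the list has an even length. The data structure has to support two operations: addNum(int num), which adds an integer from the stream, and findMedian(), which returns the median of all elements added so far.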

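The classic approach is to maintain a max-heap and a min-heap. The max-heap holds the smaller half of the numbers seen so far and the min-heap holds the larger half, so the median can only be one of the two heap tops or the average of both.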

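The trick in maintaining the two heaps is to compare the new number with the heap tops. Because the two heaps split the numbers evenly between them, their sizes also have to be taken into account; here we require that the sizes differ by at most 1. First compare the new number with the heap tops. There are three cases: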

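If the new number is greater than the top of the min-heap, it belongs to the upper half of the numbers;
If the new number is less than the top of the max-heap, it belongs to the lower half of the numbers;
If the new number lies between the top of the max-heap and the top of the min-heap, it will become one of the two heap tops, that is, it sits in the middle (a tie with either top can be treated the same way).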
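
The three cases above can be sketched as a small classification function. The names MaxHeap, MinHeap, Position and classify are assumptions introduced for this illustration. The sketch also assumes that an empty heap places no constraint on the new number, which the text does not spell out, and it treats ties with a heap top as the middle case.

```cpp
#include <functional>
#include <queue>
#include <vector>

using MaxHeap = std::priority_queue<int>;  // holds the smaller half
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;  // holds the larger half

enum class Position { LowerHalf, UpperHalf, Middle };

// Hypothetical helper: decides which of the three cases a new number falls into.
Position classify(const MaxHeap& lower, const MinHeap& upper, int num) {
    if (!upper.empty() && num > upper.top()) return Position::UpperHalf;  // case 1
    if (!lower.empty() && num < lower.top()) return Position::LowerHalf;  // case 2
    return Position::Middle;  // case 3: between the two tops, a tie, or an empty heap
}
```

With the lower half {1, 2} and the upper half {5, 6}, the number 7 is classified as UpperHalf, 0 as LowerHalf, and 3 as Middle.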

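Next compare the sizes of the two heaps. If the new number falls under one of the first two cases, it goes into the corresponding target heap, in one of two ways:

If the target heap is not larger than the other heap, insert the new number into the target heap;
If the target heap is larger than the other heap, first move the top of the target heap to the other heap, then insert the new number into the target heap.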

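If the new number falls under the third case, that is, in the middle, simply insert it into the smaller of the two heaps (either one if they are the same size). After each insertion, if the two heaps are the same size, the median is the average of the two heap tops; otherwise it is the top of the larger heap.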
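
Read literally, the insertion rules described above could look like the following sketch. It is an illustration under stated assumptions, not the sample code of this post: the class name TwoHeapMedian and the helper pushBalanced are hypothetical, ties with a heap top are handled as the middle case, and an empty heap is assumed to impose no constraint on the new number.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical class: a direct transcription of the three-case rules.
class TwoHeapMedian {
    std::priority_queue<int> lower;  // max-heap: smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;  // min-heap: larger half

    // Insert num into target. If target is already the larger heap,
    // hand its top over to the other heap first so sizes differ by at most 1.
    template <class Target, class Other>
    static void pushBalanced(Target& target, Other& other, int num) {
        if (target.size() > other.size()) {
            other.push(target.top());
            target.pop();
        }
        target.push(num);
    }

public:
    void addNum(int num) {
        if (!upper.empty() && num > upper.top()) {
            pushBalanced(upper, lower, num);  // case 1: upper half
        } else if (!lower.empty() && num < lower.top()) {
            pushBalanced(lower, upper, num);  // case 2: lower half
        } else if (lower.size() <= upper.size()) {
            lower.push(num);                  // case 3: middle, goes to the smaller heap
        } else {
            upper.push(num);
        }
    }

    double findMedian() const {
        assert(!lower.empty() || !upper.empty());  // needs at least one number
        if (lower.size() == upper.size())
            return (static_cast<double>(lower.top()) + upper.top()) / 2.0;
        return lower.size() > upper.size() ? lower.top() : upper.top();
    }
};
```

In this version either heap may end up holding the extra element, which is why findMedian has to check which heap is larger.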

Building the two heaps by hand takes fairly long code, while using priority queues makes the implementation much simpler.

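priority_queue: a priority queue is a queue in which every element carries a priority and the elements are ordered by it. There are two kinds, the max priority queue and the min priority queue; the element taken from the front is the largest one in the former and the smallest one in the latter. In C++, std::priority_queue is a max priority queue by default.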

To use it, include the headers <queue> and <functional>.

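<functional> provides the predefined comparators such as std::greater (it is not needed if you define the ordering yourself). For more on using priority queues, see: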

http://www.cnblogs.com/summerRQ/articles/2470130.html
http://blog.csdn.net/zhang20072844/article/details/10286997
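
To make the difference between the two kinds of priority queue concrete, here is a small sketch that pushes the same values into each kind and pops them back out. The function names largestFirst and smallestFirst are assumptions chosen for this illustration.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: pops everything from a default (max) priority queue.
std::vector<int> largestFirst(const std::vector<int>& values) {
    std::priority_queue<int> q;  // default comparator std::less: top() is the largest
    for (int v : values) q.push(v);
    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.top());
        q.pop();
    }
    return out;
}

// Hypothetical helper: same idea with a min priority queue.
std::vector<int> smallestFirst(const std::vector<int>& values) {
    // std::greater from <functional> flips the order: top() is the smallest
    std::priority_queue<int, std::vector<int>, std::greater<int>> q;
    for (int v : values) q.push(v);
    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.top());
        q.pop();
    }
    return out;
}
```

For the input {3, 1, 4, 1, 5}, largestFirst returns the values in descending order and smallestFirst returns them in ascending order.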

III. Sample Code

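#include <functional> // std::greater
#include <queue>      // std::priority_queue
#include <vector>

using std::priority_queue;

class MedianFinder 
{
private: 
    priority_queue<int, std::vector<int>, std::greater<int>> q1; // min-heap (smaller value = higher priority); holds the larger half
    priority_queue<int> q2; // max-heap (larger value = higher priority); holds the smaller half

public:

    void addNum(int num) 
    { 
        // The first number always goes into the max-heap.
        if(q2.empty())
        {
            q2.push(num);
            return;
        }
        if(num <= q2.top()) // num belongs to the smaller half
        {
            if(q2.size() <= q1.size()) q2.push(num);
            else
            {
                // q2 already holds the extra element: move its top to q1 first
                q1.push(q2.top());
                q2.pop();
                q2.push(num);
            }
        } 
        else
        {
            if(q2.size() <= q1.size())
            {
                // Equal sizes: q2 must end up holding the extra element
                if(num <= q1.top()) q2.push(num); // num lies between the two tops
                else
                {
                    q2.push(q1.top());
                    q1.pop();
                    q1.push(num);
                }
            }
            else
            {
                q1.push(num);
            }
        }
    }

    double findMedian() 
    {
        // Assumes at least one number has been added.
        // Convert to double before adding so two large ints cannot overflow.
        if(q1.size() == q2.size()) return (double(q1.top()) + double(q2.top())) / 2.0;
        return double(q2.top()); // q2 holds the extra element when the count is odd
    }
};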

// Your MedianFinder object will be instantiated and called as such:
// MedianFinder mf;
// mf.addNum(1);
// mf.findMedian();
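
For comparison, a shorter formulation of the same two-heap idea is sketched below. This is a different technique from the sample code above, not the author's method: every number is first pushed through the max-heap and the heaps are rebalanced afterwards, at the cost of a few extra heap operations per insertion. The class name CompactMedianFinder is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical class: "push through, then rebalance" variant.
class CompactMedianFinder {
    std::priority_queue<int> lower;  // max-heap: smaller half, never smaller than upper
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;  // min-heap: larger half

public:
    void addNum(int num) {
        lower.push(num);           // let the max-heap pick the largest "small" value
        upper.push(lower.top());   // and pass that value to the larger half
        lower.pop();
        if (upper.size() > lower.size()) {  // keep lower.size() >= upper.size()
            lower.push(upper.top());
            upper.pop();
        }
    }

    double findMedian() const {
        assert(!lower.empty());  // needs at least one number
        if (lower.size() > upper.size()) return lower.top();
        return (static_cast<double>(lower.top()) + upper.top()) / 2.0;
    }
};
```

It avoids the case analysis entirely, so it is easier to get right, while the sample code above does less heap work per call.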

IV. Summary

Studying priority queues carefully again for this problem was very rewarding.

Original post: http://blog.csdn.net/liyuefeilong/article/details/50520464
