A "Median Maintenance" Problem

Date: 2015-08-16 13:41:57

Problem statement:

Download the text file here. (Right click and save link as).

The goal of this problem is to implement a variant of the 2-SUM algorithm (covered in the Week 6 lecture on hash table applications).

The file contains 1 million integers, both positive and negative (there might be some repetitions!). This is your array of integers, with the ith row of the file specifying the ith entry of the array.

Your task is to compute the number of target values t in the interval [-10000,10000] (inclusive) such that there are distinct numbers x,y in the input file that satisfy x+y=t. (NOTE: ensuring distinctness requires a one-line addition to the algorithm from lecture.)

Write your numeric answer (an integer between 0 and 20001) in the space provided.

OPTIONAL CHALLENGE: If this problem is too easy for you, try implementing your own hash table for it. For example, you could compare performance under the chaining and open addressing approaches to resolving collisions.

 

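Roughly speaking: there are one million numbers (positive and negative, possibly repeated). The goal is to find pairs of numbers x and y such that x + y falls in the range [-10000, 10000] and x and y are distinct (x ≠ y), and finally to output how many different sums t can be formed this way.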

The data in the file looks roughly like this:

...

6195
2303
5685
1354
4292
7600
6447
4479
9046
7293
5147
1260
1386
6193
4135
3611
8583

...


Approach:

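The task solved by the code in this article is median maintenance: the numbers in Median.txt are read one at a time, the median of all numbers read so far is recorded after each one (for an even count, the lower of the two middle values), and the program prints the sum of these medians modulo 10000. This can of course be done by brute force, sorting the array after every number is read and then recording the median, but there is clearly a better way. Using the heap data structure, the cost per number drops from a full O(n log n) re-sort to O(log n) heap operations. The idea is as follows: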

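  • Create two heaps: a max-heap and a min-heap (in a max-heap every parent is greater than or equal to its children; a min-heap is the reverse);
  • After reading a number, compare it with the roots of the max-heap and the min-heap: if it is greater than the max-heap's root, insert it into the min-heap; otherwise, insert it into the max-heap. As a result, every number larger than the two roots sits below the min-heap's root, and every number smaller than the two roots sits below the max-heap's root. In other words, the max-heap holds the smaller half of the numbers and the min-heap holds the larger half;
  • Even so, we cannot yet guarantee that the median is one of the two roots, because the two heaps may differ greatly in size. So after each number is read and inserted into the appropriate heap, we check the sizes of the two heaps and rebalance them (the median is one of the two roots only when the sizes differ by at most 1);
  • To rebalance: if the sizes differ by more than 1, pop the root of the larger heap and insert it into the smaller heap;
  • Finally, compute the median. Because the min-heap's root is never smaller than the max-heap's root, if the min-heap is larger than the max-heap by 1, the median is the min-heap's root; if the two heaps are the same size, or the max-heap is larger by 1, the median is the max-heap's root.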
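
The steps above can also be sketched with the standard library's std::priority_queue instead of a hand-written heap. This is only a minimal sketch, not the implementation used in this article: the class name RunningMedian and its Add method are hypothetical, and for simplicity it always keeps the max-heap either the same size as the min-heap or one element larger, so the (lower) median is always the max-heap's root.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper (not from the original article): running lower median
// of a stream of integers, maintained with two standard-library heaps.
class RunningMedian {
public:
    // Add one number and return the median of everything seen so far.
    // For an even count, the lower of the two middle values is returned.
    int Add(int x) {
        // Numbers not larger than the current median go to the smaller half.
        if (low.empty() || x <= low.top()) {
            low.push(x);
        } else {
            high.push(x);
        }
        // Rebalance so that low.size() == high.size() or high.size() + 1.
        if (low.size() > high.size() + 1) {
            high.push(low.top());
            low.pop();
        } else if (high.size() > low.size()) {
            low.push(high.top());
            high.pop();
        }
        return low.top();
    }

private:
    // Max-heap holding the smaller half of the numbers.
    std::priority_queue<int> low;
    // Min-heap holding the larger half of the numbers.
    std::priority_queue<int, std::vector<int>, std::greater<int>> high;
};
```

Calling Add once per input number and summing the returned values gives the same kind of running total that the program in this article accumulates in sum.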

Implementation:

With the approach above in hand, I implemented it in C++. The code is as follows:

#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include <limits>
using namespace std;

class MinMaxHeap {
public:
    MinMaxHeap(bool is_min);
    ~MinMaxHeap();
    int Top();
    int Size();
    void Insert(int num);
    void Pop();
private:
    void swap(int index1, int index2);
    int size;
    int *element;
    bool is_min;
};

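// No default argument here: adding one in an out-of-class definition would
// turn this into a default constructor, which the C++ standard forbids.
MinMaxHeap::MinMaxHeap(bool is_min) {
    // fixed capacity with no bounds check in Insert;
    // 5010 slots are enough for this problem's input
    this->element = new int[5010];
    this->size = 0;
    this->is_min = is_min;
}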

MinMaxHeap::~MinMaxHeap() {
    delete[] this->element;
}

int MinMaxHeap::Top() {
    return this->element[0];
}

int MinMaxHeap::Size() {
    return this->size;
}

// The position of each element (the numbers are indices into the array)
//           0
//         /   \
//       1       2
//      / \     / \
//     3   4   5   6

void MinMaxHeap::Insert(int num) {
    int pos = size;
    element[size++] = num;
    if (is_min) {
        while (pos > 0) {
            int parent = (pos - 1) >> 1; // same as (pos - 1) / 2
            if (element[parent] <= element[pos]) {
                break;
            }
            swap(parent, pos);
            pos = parent;
        }
    }
    else {
        while (pos > 0) {
            int parent = (pos - 1) >> 1; // same as (pos - 1) / 2
            if (element[parent] >= element[pos]) {
                break;
            }
            swap(parent, pos);
            pos = parent;
        }
    }
}

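void MinMaxHeap::Pop() {
    if (size == 0) {
        return; // nothing to remove
    }
    // move the last element to the root, then sift it down
    element[0] = element[--size];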
    int pos = 0;

    if (is_min) {
        while (pos < (size >> 1)) // if pos >= (size / 2), then element[pos] must be a leaf
        {
            int left_child = pos * 2 + 1;
            int right_child = left_child + 1;
            int smallest_child;

            if (right_child < size && element[left_child] > element[right_child]) {
                smallest_child = right_child;
            }
            else {
                smallest_child = left_child;
            }

            if (element[pos] < element[smallest_child]) {
                break;
            }
                
            swap(pos, smallest_child);
            pos = smallest_child;
        }
    }
    else {
        while (pos < (size >> 1)) // if pos >= (size / 2), then element[pos] must be a leaf
        {
            int left_child = pos * 2 + 1;
            int right_child = left_child + 1;
            int biggest_child;

            if (right_child < size && element[left_child] < element[right_child]) {
                biggest_child = right_child;
            }
            else {
                biggest_child = left_child;
            }

            if (element[pos] > element[biggest_child]) {
                break;
            }

            swap(pos, biggest_child);
            pos = biggest_child;
        }
    }
}

void MinMaxHeap::swap(int index1, int index2) {
    int tmp = element[index1];
    element[index1] = element[index2];
    element[index2] = tmp;
}

int main() {
    ifstream fin("Median.txt");
    if (!fin) {
        cerr << "cannot open Median.txt" << endl;
        return 1;
    }

    MinMaxHeap MinHeap(true);
    MinMaxHeap MaxHeap(false);
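    // Sentinels: seed each heap with one extreme value so that Top() is
    // always valid. Each heap gets exactly one sentinel, so the size
    // comparisons below are unaffected, and a sentinel only reaches a
    // root when its heap holds no real numbers.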
    MinHeap.Insert(numeric_limits<int>::max());
    MaxHeap.Insert(numeric_limits<int>::min());

    int input, sum = 0, min_top, max_top;
    // read the integers directly; unlike getline + atoi, this does not
    // turn a blank line into a spurious 0
    while (fin >> input) {
        min_top = MinHeap.Top();
        max_top = MaxHeap.Top();
        if (input < max_top) {
            MaxHeap.Insert(input);
        }
        else {
            MinHeap.Insert(input);
        }
        // balance
        if (MaxHeap.Size() > MinHeap.Size() + 1) {
            max_top = MaxHeap.Top();
            MaxHeap.Pop();
            MinHeap.Insert(max_top);
        }
        if (MinHeap.Size() > MaxHeap.Size() + 1) {
            min_top = MinHeap.Top();
            MinHeap.Pop();
            MaxHeap.Insert(min_top);
        }
        //find the median
        if (MinHeap.Size() == MaxHeap.Size() + 1) {
            sum += MinHeap.Top();
        }
        else {
            sum += MaxHeap.Top();
        }
    }

    cout << sum % 10000 << endl;
    fin.close();
    return 0;
}

 

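With this approach, each new number costs only O(log n) heap operations instead of a full re-sort, so the program runs much faster.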
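
For checking the heap version on small inputs, a simpler baseline can be handy. The sketch below is an assumption of mine rather than code from the article: the function name SumOfMediansSorted is hypothetical, and instead of re-sorting after every number it keeps a sorted vector and inserts each new number at the position found by std::lower_bound, which costs O(n) per insertion.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical baseline (not from the original article): keep every number
// seen so far in sorted order and read the lower median directly.
long long SumOfMediansSorted(const std::vector<int>& stream) {
    std::vector<int> sorted;
    long long sum = 0;
    for (int x : stream) {
        // Insert x at its sorted position; shifting elements makes this O(n).
        sorted.insert(std::lower_bound(sorted.begin(), sorted.end(), x), x);
        // With k numbers seen, the lower median is at index (k - 1) / 2.
        sum += sorted[(sorted.size() - 1) / 2];
    }
    return sum;
}
```

Taking the result modulo 10000 should give the same value that the heap-based program prints for the same input, which makes this a convenient cross-check.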

Original article: http://www.cnblogs.com/jdneo/p/4734021.html
