Methods for looking up data in a vector

Date: 2014-10-27 21:21:52

Tags: stl find hash_map binsearch binary search

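Programs written with the STL often store their data in a vector container. When the data in the container is sorted, there are two ways to look up a value:

(1) use the find function from <algorithm>;

(2) use binary search.

The data can also be stored in a hash_map and looked up there. The tests below compare the running time of these three approaches.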
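The three approaches can be sketched with standard components as follows. This is only an illustration, not the code measured in this post: the helper names containsLinear, containsBinary and containsHash are made up here, and std::unordered_set stands in for the non-standard hash_map.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Linear scan: works on any vector, sorted or not. O(n) per lookup.
bool containsLinear(const std::vector<int>& data, int key) {
    return std::find(data.begin(), data.end(), key) != data.end();
}

// Binary search: `sortedData` must be in ascending order. O(log n) per lookup.
bool containsBinary(const std::vector<int>& sortedData, int key) {
    return std::binary_search(sortedData.begin(), sortedData.end(), key);
}

// Hash lookup: O(1) on average, no ordering requirement.
// std::unordered_set is an assumption; the post itself uses hash_map.
bool containsHash(const std::unordered_set<int>& table, int key) {
    return table.find(key) != table.end();
}
```

All three answer the same yes/no question, so they can be swapped freely as long as the vector passed to containsBinary really is sorted.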

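1. Test data set

All primes smaller than 99999 are generated as the data set to search in; the lookups then query every number between 2 and 99999.

Let array A hold all numbers from 2 to 99999. The primes are generated as follows:

(1) find the smallest number min that has not been processed yet;

(2) remove every multiple of min (other than min itself).

Repeat these two steps until every number in A has been processed. The numbers that remain are all the primes between 2 and 99999.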
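The two steps above are the sieve of Eratosthenes. A minimal sketch, assuming a helper named primesBelow (a name invented for this illustration) that returns the primes in ascending order:

```cpp
#include <cstddef>
#include <vector>

// Returns all primes p with 2 <= p < limit, in ascending order.
std::vector<int> primesBelow(int limit) {
    std::vector<int> primes;
    if (limit <= 2) return primes;  // no primes below 2

    // isCandidate[i] stays true until i is crossed out as a multiple.
    std::vector<bool> isCandidate(static_cast<std::size_t>(limit), true);
    for (int i = 2; i < limit; ++i) {
        if (!isCandidate[i]) continue;  // i is a multiple of a smaller prime
        primes.push_back(i);            // step (1): smallest remaining number
        // step (2): cross out 2i, 3i, 4i, ... (long long avoids overflow)
        for (long long m = 2LL * i; m < limit; m += i) {
            isCandidate[static_cast<std::size_t>(m)] = false;
        }
    }
    return primes;
}
```

Because the outer loop visits the numbers in increasing order, the result is already sorted, which is what binary search needs.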

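2. Efficiency comparison

With the data set above, looking up with find takes 2745 ms, while binary search and hash_map both take 0 ms, that is, less than the timer can resolve.

When the upper bound is raised to 999999, binary search takes 63 ms and hash_map takes 31 ms.

When the upper bound is raised to 9999999, binary search takes 577 ms and hash_map takes 499 ms.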
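A result of 0 ms most likely means the run finished within one tick of clock(). A finer-grained measurement could be taken with std::chrono (C++11). The helper below, elapsedMicroseconds, is a hypothetical name used only for this sketch and is not how the numbers above were obtained.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed time in microseconds.
// steady_clock is monotonic, so the difference is never negative.
template <typename Work>
long long elapsedMicroseconds(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

It would be called with a lambda that wraps one of the test loops, for example elapsedMicroseconds([&] { com.binSearch(); }) for an object com of the class shown in section 4.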

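Note: I could not find a way to set the initial number of buckets of the hash_map, which slows the hashing down. (If you know how to initialize it, please let me know.)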
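hash_map is a non-standard extension. Its standard successor, std::unordered_map (C++11), does let the bucket count be chosen up front, either through the constructor argument or through reserve(). The sketch below assumes that switching containers is acceptable; buildTable is a name invented for the example.

```cpp
#include <unordered_map>
#include <vector>

// Builds a lookup table whose bucket array is sized before any insertion,
// so no rehash should be needed while the keys are added.
std::unordered_map<int, int> buildTable(const std::vector<int>& keys) {
    std::unordered_map<int, int> table;
    table.reserve(keys.size());  // room for keys.size() elements
    for (int key : keys) {
        table[key] = 1;          // same convention as dataHash[i] = 1
    }
    return table;
}
```

Writing std::unordered_map<int, int> table(n); instead requests at least n buckets at construction time.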

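3. Analysis

A problem met in practice: while processing large-scale graph data, a vector was able to hold the whole graph but a hash_map was not. In other words, a vector could store a larger data set than a hash_map, presumably because a hash table carries extra per-element overhead.

Binary search only works on sorted data, whereas find has no such requirement.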
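The sorted-data requirement can be made explicit in code. The sketch below uses std::lower_bound when the input is sorted and falls back to std::find otherwise; indexOf is a hypothetical helper written for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the index of `key` in `data`, or -1 if it is absent.
// Note: std::is_sorted is itself O(n); real code would track sortedness
// instead of re-checking on every call. It is checked here only to make
// the precondition of binary search visible.
std::ptrdiff_t indexOf(const std::vector<int>& data, int key) {
    if (std::is_sorted(data.begin(), data.end())) {
        // Binary search: first element that is not less than key.
        auto it = std::lower_bound(data.begin(), data.end(), key);
        return (it != data.end() && *it == key) ? it - data.begin() : -1;
    }
    // Unsorted input: only a linear scan gives a correct answer.
    auto it = std::find(data.begin(), data.end(), key);
    return it != data.end() ? it - data.begin() : -1;
}
```

Unlike a plain yes/no test, std::lower_bound also yields the position of the match, which is what the hand-written loop in section 4 computes in middle.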

4. Reference code

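#include <ctime>
#include <algorithm>
#include <vector>
#include <iostream>
// hash_map is a non-standard extension shipped with some compilers (for example older MSVC);
// std::unordered_map from <unordered_map> is the standard C++11 counterpart.
#include <hash_map>
using namespace std;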


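class compare
{
	vector<int> dataVector;      // primes in ascending order: the data being searched
	vector<int> findData;        // every number that will be looked up
	hash_map<int, int> dataHash; // the same primes stored as hash keys
public:
	compare();
	~compare(void);
	void generalPrime();         // fills the three containers
	void findTest();             // times std::find
	void binSearch();            // times the hand-written binary search
	void hashTest();             // times hash_map::find
};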

compare::compare()
{
	generalPrime();
}


compare::~compare(void)
{
	findData.clear();
	dataVector.clear();
}

void compare::findTest()
{
	clock_t startTime = clock();
	vector<int>::iterator result;
	int exist = 0;
	for (vector<int>::iterator it = findData.begin(); it != findData.end(); ++it)
	{
		// linear scan over the primes
		result = find(dataVector.begin(), dataVector.end(), *it);
		if (result != dataVector.end())
		{
			// found
			exist++;
		}
	}
	clock_t endTime = clock();
	cout << "exist num: " << exist << " find time: " << (double)(endTime - startTime)/CLOCKS_PER_SEC*1000 << "ms" << endl;
}

void compare::binSearch()
{
	int start;
	int end;
	int middle;
	int exist = 0;
	clock_t startTime = clock();
	for (vector<int>::iterator it = findData.begin(); it != findData.end(); ++it)
	{
		// search the closed index range [start, end] of the sorted primes
		start = 0;
		end = static_cast<int>(dataVector.size()) - 1;
		while (start <= end)
		{
			middle = start + (end - start) / 2; // avoids overflow of start + end
			if (*it < dataVector[middle])
			{
				end = middle - 1;
			}
			else if (*it > dataVector[middle])
			{
				start = middle + 1;
			}
			else
			{
				break;
			}
		}
		// the range is still non-empty only if the loop left through break
		if (start <= end)
		{
			exist++;
		}
	}
	clock_t endTime = clock();
	cout << "exist num: " << exist << " binsearch time: " << (double)(endTime - startTime)/CLOCKS_PER_SEC * 1000 << "ms" << endl;
}

void compare::generalPrime()
{
	int maxPrime = 99999;
	// visited[i] stays true until i is crossed out as a multiple of a smaller prime
	vector<bool> visited(maxPrime, true);
	for (int i = 2; i < maxPrime; ++i)
	{
		findData.push_back(i);
		if (visited[i])
		{
			// i is prime: record it, then cross out 2i, 3i, 4i, ...
			dataVector.push_back(i);
			dataHash[i] = 1;
			for (int multiple = 2 * i; multiple < maxPrime; multiple += i)
			{
				visited[multiple] = false;
			}
		}
	}
}

void compare::hashTest()
{
	clock_t startTime = clock();
	int exist = 0;
	for (vector<int>::iterator it = findData.begin(); it != findData.end(); ++it)
	{
		// a key is present if find does not return end()
		if (dataHash.find(*it) != dataHash.end())
		{
			exist++;
		}
	}
	clock_t endTime = clock();
	cout << "exist num: " << exist << " hash time: " << (double)(endTime - startTime)/CLOCKS_PER_SEC*1000 << "ms" << endl;
}
int main()
{
	compare com;
	com.findTest();
	com.binSearch();
	com.hashTest();
	return 0; // 0 reports success to the caller
}

Original article: http://blog.csdn.net/woniu317/article/details/40513589
