AprioriTID algorithm

Posted: 2014-10-22 21:57:42

What is AprioriTID?

AprioriTID is an algorithm for discovering frequent itemsets (groups of items that appear together frequently) in a transaction database. It was proposed by Agrawal & Srikant (1994).

AprioriTID is a variation of the Apriori algorithm, proposed in the same article as an alternative implementation. It produces the same output as Apriori but uses a different mechanism for counting the support of itemsets.

Apriori and AprioriTID compare as follows.

Data structures:

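In the Apriori algorithm, a HashMap<Integer,Integer> first stores the mapping from each item to the number of times it occurs. The frequent items are then extracted into a List named frequent1, which is the input for generating candidates at k=2.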
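The k=1 counting step can be sketched as follows. This is a minimal illustration, not the code the post describes: the class name ItemCountSketch, the method frequentItems, and the use of an absolute count for minsup are assumptions made for the example, and each transaction is assumed to list an item at most once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the k=1 step: count items, keep the frequent ones.
class ItemCountSketch {

    // Returns the items whose occurrence count is at least minsup, in ascending order.
    // minsup is assumed to be an absolute count, not a ratio.
    static List<Integer> frequentItems(List<List<Integer>> transactions, int minsup) {
        // item -> number of transactions containing it
        Map<Integer, Integer> counts = new HashMap<>();
        for (List<Integer> transaction : transactions) {
            for (Integer item : transaction) {
                counts.merge(item, 1, Integer::sum);
            }
        }

        // Keep only the items that reach the minimum support.
        List<Integer> frequent1 = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minsup) {
                frequent1.add(entry.getKey());
            }
        }
        Collections.sort(frequent1);
        return frequent1;
    }

    public static void main(String[] args) {
        List<List<Integer>> db = List.of(
                List.of(1, 2, 3),
                List.of(1, 2),
                List.of(2, 4));
        System.out.println(frequentItems(db, 2));
    }
}
```

Sorting frequent1 is a choice made here so that a later join step can compare last items in a fixed order.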

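For every k other than 1, each candidate is stored in an object of the Itemset class, and all candidates are held in the List<Itemset> named candidates. The Itemset class provides functions for storing and retrieving a candidate's items and for storing its support. All frequent itemset objects are stored in the List<Itemset> named level, which in turn is the input to the candidate-generation function for k>2.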

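In the AprioriTID algorithm, a HashMap<Integer,Set<Integer>> stores the mapping from each item to the IDs of the transactions in which it occurs. Starting at k=1, frequent itemsets are stored directly in Itemset objects, and the List<Itemset> named level holds these objects. The Itemset class gains a set of transaction IDs (TIDs) that records which transactions contain the itemset.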
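A small sketch of this item-to-TID mapping is given below. It is only an illustration under stated assumptions: the names TidSetSketch, buildTidSets and frequentItems are invented for the example, the position of a transaction in the input list is used as its transaction ID, and minsup is taken as an absolute count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: map each item to the set of transaction IDs containing it.
class TidSetSketch {

    // The index of a transaction in the list serves as its transaction ID (an assumption).
    static Map<Integer, Set<Integer>> buildTidSets(List<List<Integer>> transactions) {
        Map<Integer, Set<Integer>> tidSets = new HashMap<>();
        for (int tid = 0; tid < transactions.size(); tid++) {
            for (Integer item : transactions.get(tid)) {
                tidSets.computeIfAbsent(item, key -> new HashSet<>()).add(tid);
            }
        }
        return tidSets;
    }

    // The support of an item is the size of its TID set, so no second database scan is needed.
    static List<Integer> frequentItems(Map<Integer, Set<Integer>> tidSets, int minsup) {
        List<Integer> frequent = new ArrayList<>();
        for (Map.Entry<Integer, Set<Integer>> entry : tidSets.entrySet()) {
            if (entry.getValue().size() >= minsup) {
                frequent.add(entry.getKey());
            }
        }
        Collections.sort(frequent);
        return frequent;
    }

    public static void main(String[] args) {
        List<List<Integer>> db = List.of(
                List.of(1, 2, 3),
                List.of(1, 2),
                List.of(2, 4));
        Map<Integer, Set<Integer>> tidSets = buildTidSets(db);
        System.out.println(tidSets);
        System.out.println(frequentItems(tidSets, 2));
    }
}
```

Because the TID set already records every transaction that contains the item, its size gives the support directly.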

Algorithm:

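In AprioriTID, for k>=2, candidates are still generated with the same join step as in Apriori: we compare the items of itemset1 and itemset2, and if they agree on every item except the last and the last item of itemset1 is smaller than the last item of itemset2, we combine them to generate a candidate. The algorithm then computes the tids shared by the two joined itemsets (the common tids). If the number of common tids reaches minsup, the candidate is frequent. This is somewhat more efficient than Apriori, which computes each candidate's support by repeatedly comparing the candidates against the transactions. Each frequent itemset and its common tids are saved in an Itemset object, and these objects are collected in the List<Itemset> named candidates. After saveItemset() has saved the frequent itemsets, candidates becomes the input for computing the frequent itemsets of size k+1.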
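The join and the TID intersection can be sketched roughly as below. This is an illustration under assumptions, not the implementation discussed in the post: the nested Itemset class is a simplified stand-in, the names TidJoinSketch, joinable and nextLevel are hypothetical, items inside an itemset are assumed to be sorted in ascending order, and minsup is an absolute count. Every ordered pair is checked so that the sketch does not rely on the level being sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of one AprioriTID level: join itemsets, intersect their TID sets.
class TidJoinSketch {

    // Simplified stand-in for the Itemset class mentioned in the text.
    static final class Itemset {
        final int[] items;        // assumed sorted ascending
        final Set<Integer> tids;  // IDs of the transactions containing all items

        Itemset(int[] items, Set<Integer> tids) {
            this.items = items;
            this.tids = tids;
        }
    }

    // True if a and b agree on every item except the last, and a's last item is smaller.
    static boolean joinable(int[] a, int[] b) {
        int n = a.length;
        if (n == 0 || b.length != n) {
            return false;
        }
        for (int p = 0; p < n - 1; p++) {
            if (a[p] != b[p]) {
                return false;
            }
        }
        return a[n - 1] < b[n - 1];
    }

    // Builds the frequent itemsets of size k+1 from the frequent itemsets of size k.
    static List<Itemset> nextLevel(List<Itemset> level, int minsup) {
        List<Itemset> candidates = new ArrayList<>();
        for (Itemset a : level) {
            for (Itemset b : level) {
                if (!joinable(a.items, b.items)) {
                    continue;
                }
                // Transactions containing the joined itemset are those containing both parts.
                Set<Integer> common = new HashSet<>(a.tids);
                common.retainAll(b.tids);
                if (common.size() >= minsup) {
                    int n = a.items.length;
                    int[] joined = Arrays.copyOf(a.items, n + 1);
                    joined[n] = b.items[n - 1];
                    candidates.add(new Itemset(joined, common));
                }
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<Itemset> level = List.of(
                new Itemset(new int[] {1}, Set.of(0, 1)),
                new Itemset(new int[] {2}, Set.of(0, 1, 2)),
                new Itemset(new int[] {3}, Set.of(0)));
        for (Itemset itemset : nextLevel(level, 2)) {
            System.out.println(Arrays.toString(itemset.items) + " tids=" + itemset.tids);
        }
    }
}
```

The intersection gives the exact support of the joined itemset, since a transaction contains the union of two itemsets exactly when it contains both of them. That is why this sketch never has to go back to the transactions themselves.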

Original post: http://www.cnblogs.com/cnblogs-learn/p/4044381.html
