Disjoint-Set Data Structure: Union-Find (1)

Date: 2015-06-04 22:49:32

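Disjoint Sets:
　　A set is a collection of distinct elements.
　　Now consider disjoint sets:
　　Disjoint sets are a family of sets in which no two sets intersect, that is, no element belongs to more than one set. (In Chinese they are called 「分离集」.)
　　This property is quite special. Computer scientists have exploited it to design an elegant data structure that performs set operations quickly.
　　Because the sets never overlap, operations such as intersection and difference need no special support: the intersection of two different sets is always empty, and the difference leaves the first set unchanged. Only three operations need to be considered: union, find, and split.
　　union merges two sets into one. find determines which set an element belongs to. split removes an element from its set.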

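Partition of a set:
　　A partition splits a set S into subsets S1, S2, …, Sk that satisfy the following conditions:
　　1) S1 ∪ S2 ∪ … ∪ Sk = S
　　2) Si ∩ Sj = ∅ (i ≠ j)
　　Example: S = {1,2,3,4,5,6};
　　Partition 1: {1,2}, {3,4}, {5,6};
　　Partition 2: {1,2,3,4,5}, {6};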
　　……
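
　　The two conditions can be checked mechanically. The following is a minimal sketch, not part of the original article: it assumes S = {1, …, N} and a hypothetical flat-array encoding of the subsets (`elems` holds all subsets back to back, `lens` holds their lengths).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N 6  /* S = {1, ..., N}; assumed size, matching the example above */

/* Returns true when every element of S appears in exactly one subset.
   elems: all subsets concatenated; lens[s]: length of subset s; k: subset count. */
bool is_partition(const int *elems, const int *lens, int k) {
    int seen[N + 1];
    memset(seen, 0, sizeof seen);
    int pos = 0;
    for (int s = 0; s < k; s++) {
        for (int j = 0; j < lens[s]; j++, pos++) {
            int e = elems[pos];
            /* reject elements outside S and elements that appear twice */
            if (e < 1 || e > N || seen[e]) return false;
            seen[e] = 1;
        }
    }
    for (int e = 1; e <= N; e++)
        if (!seen[e]) return false;  /* the union does not cover S */
    return true;
}
```

　　For instance, calling `is_partition` with `elems = {1,2,3,4,5,6}` and `lens = {2,2,2}` corresponds to Partition 1 above.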

Binary relations:
　　S × S is the set of all ordered pairs of elements of S (the Cartesian product).
　　Example: if S = {a,b,c}, then S × S = {(a,a),(a,b),(a,c),(b,a),(b,b),(b,c),(c,a),(c,b),(c,c)}.
　　A binary relation R on a set S is any subset of S × S; R(x,y) means that (x,y) is "in the relation".

Three Properties:
　　1) A relation R over a set S is reflexive if R(a,a) holds for all a in S.
　　Example: S = {1, 2, 3};
　　the relation {<1, 1>, <1, 2>, <1, 3>, <2, 2>, <2, 3>, <3, 3>}
　　is reflexive because <1, 1>, <2, 2>, <3, 3> are all in it.
　　2) A relation R on a set S is symmetric if and only if, for any a and b in S, R(a,b) implies R(b,a).
　　Example: S = {1, 2, 3};
　　the relation {<1, 1>, <2, 2>, <3, 3>} is symmetric.
　　3) A binary relation R over a set S is transitive if
　　R(a,b) and R(b,c) imply R(a,c) for all a, b, c in S.
　　Example: S = {1, 2, 3};
　　the relation {<1, 2>, <2, 3>, <1, 3>} is transitive.
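
　　The three definitions translate directly into loops. The sketch below is only an illustration: it assumes S = {1, 2, 3} and a hypothetical boolean matrix `rel`, where `rel[a][b]` is true when R(a,b) holds.

```c
#include <assert.h>
#include <stdbool.h>

#define N 3  /* S = {1, 2, 3}; row and column 0 of rel are unused */

/* R(a,a) must hold for every a in S. */
bool is_reflexive(bool rel[N + 1][N + 1]) {
    for (int a = 1; a <= N; a++)
        if (!rel[a][a]) return false;
    return true;
}

/* Whenever R(a,b) holds, R(b,a) must hold too. */
bool is_symmetric(bool rel[N + 1][N + 1]) {
    for (int a = 1; a <= N; a++)
        for (int b = 1; b <= N; b++)
            if (rel[a][b] && !rel[b][a]) return false;
    return true;
}

/* Whenever R(a,b) and R(b,c) hold, R(a,c) must hold too. */
bool is_transitive(bool rel[N + 1][N + 1]) {
    for (int a = 1; a <= N; a++)
        for (int b = 1; b <= N; b++)
            for (int c = 1; c <= N; c++)
                if (rel[a][b] && rel[b][c] && !rel[a][c]) return false;
    return true;
}
```

　　Applied to the first example relation above, the checks report that it is reflexive and transitive but not symmetric, since <1, 2> is present while <2, 1> is not.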

Equivalence relations:
　　A binary relation R is an equivalence relation if R is reflexive, symmetric, and transitive.
　　Suppose P = {S1,S2,…,Sn} is a partition.
　　　– Define R(x,y) to mean that x and y are in the same Si.
　　　– Then R is an equivalence relation.
　　Suppose R is an equivalence relation over S.
　　　– Consider a set of sets S1,S2,…,Sn where
　　　(1) x and y are in the same Si if and only if R(x,y)
　　　(2) every x is in some Si
　　　– Then this set of sets is a partition.
　　Example: S = {a,b,c,d,e}
　　　– One partition: {a,b,c}, {d}, {e}
　　　– The corresponding equivalence relation:
　　　(a,a), (b,b), (c,c), (a,b), (b,a), (a,c), (c,a), (b,c), (c,b), (d,d), (e,e)
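
　　The step from a partition to its equivalence relation can be sketched in a few lines. This is an illustration under assumptions: the elements a to e are stored as indices 0 to 4, and `block_of` is a hypothetical array giving each element the label of its block.

```c
#include <assert.h>
#include <stdbool.h>

#define N 5  /* S = {a, b, c, d, e}, stored as indices 0..4 */

/* R(x, y) holds exactly when x and y carry the same block label. */
bool related(const int block_of[N], int x, int y) {
    return block_of[x] == block_of[y];
}

/* Number of ordered pairs (x, y) in the relation. */
int count_pairs(const int block_of[N]) {
    int count = 0;
    for (int x = 0; x < N; x++)
        for (int y = 0; y < N; y++)
            if (related(block_of, x, y)) count++;
    return count;
}
```

　　With `block_of = {0, 0, 0, 1, 2}` for the partition {a,b,c}, {d}, {e}, the count is 3·3 + 1 + 1 = 11, which matches the eleven pairs listed above.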

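Union-Find:
　　Union-find is a tree-based data structure for merging and querying disjoint sets. It is usually represented as a forest, which keeps regrouping the sets fast.
　　In problems over a collection of N elements, we typically start with each element in its own singleton set, then merge, in some order, the sets whose elements belong to the same group, repeatedly querying along the way which set a given element is in. Such problems look simple, but the data volume is huge: conventional data structures often need more memory than the machine has, and even when the memory barely fits, the running time is far too high. Union-find is well suited to these problems.
　　Union-find is widely used in path, network, and graph connectivity problems.
　　Union-find operations:
　　1) Create: build an initial partition of the collection, normally with each element in its own subset, e.g. {a}, {b}, {c}, …, and choose a representative element for each subset.
　　2) Find: look up an element and return the representative of the subset that contains it.
　　3) Union: merge two smaller subsets into one larger subset.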
　　
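Union-find data structure:
Up-tree structure:
　　Start with a forest of single-node trees. Each union links two of the trees, so after a series of unions the forest consists of fewer, larger up-trees.

　　up is an array, index identifies a node, and up[index] holds that node's parent. Every up[index] is initialized to 0 or -1, which marks a root. After a union, up[index] of a non-root node is its parent; following the parent links upward leads to the root, which serves as the representative of the subset. When 0 is the root marker, nodes must be numbered from 1.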

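#define MAX_N 100000   /* assumed capacity; elements are numbered 1..MAX_N */
int up[MAX_N + 1];     /* up[i] = parent of i, or 0 when i is a root */

/* Return the root (representative) of the set containing x. */
int find(int x) {
    while (up[x] != 0) {
        x = up[x];
    }
    return x;
}

/* Merge two sets. x and y must both be roots. The function is named
   union_sets because "union" is a reserved word in C and C++. */
void union_sets(int x, int y) {
    if (x != y)
        up[y] = x;
}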
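
　　As a rough illustration of how this basic version behaves, the sketch below builds the sets, performs unions, and counts how many parent links a find follows. The helper names (`make_sets`, `find_counting`, `link_roots`, `chain_depth`) and the small capacity are assumptions made for the example; `chain_depth` deliberately uses an unlucky union order that produces a chain.

```c
#include <assert.h>

#define MAX_N 16           /* assumed small capacity for the illustration */
static int up[MAX_N + 1];  /* up[i] = parent of i, 0 when i is a root; index 0 unused */

/* Create: put each of 1..n into its own singleton set. */
void make_sets(int n) {
    assert(n <= MAX_N);
    for (int i = 1; i <= n; i++) up[i] = 0;
}

/* Find: return the root of x; *steps receives the number of links followed. */
int find_counting(int x, int *steps) {
    *steps = 0;
    while (up[x] != 0) {
        x = up[x];
        (*steps)++;
    }
    return x;
}

/* Union: attach root y under root x (both arguments must be roots). */
void link_roots(int x, int y) {
    if (x != y) up[y] = x;
}

/* Unlucky order: always hang the current root under a fresh node, which
   yields the chain 1 -> 2 -> ... -> n. Returns the links followed by find(1). */
int chain_depth(int n) {
    int steps;
    make_sets(n);
    for (int i = 1; i < n; i++)
        link_roots(i + 1, i);  /* i is the root of {1..i} at this point */
    find_counting(1, &steps);
    return steps;
}
```

　　With n elements, `chain_depth(n)` follows n-1 links, so a single find on the deepest node is linear in n.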

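　　Clearly, a find takes O(n) in the worst case and a union takes O(1); m finds and n-1 unions take O(m*n) in the worst case. How can this be improved?
　　Optimization strategies:
　　1) Union by size: attach the smaller tree to the larger one. A find then takes O(log n), and m finds with n-1 unions take O(m*log n) in the worst case;
　　2) Path compression: during a find, link every node on the path directly to the root.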

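Union by size:
　　Here up[root] is no longer 0 or -1; it stores the negated size of the subset. A negative up[index] therefore marks index as a root.
　　Worst case for find: O(log n)

　　Proof: with union by size, an up-tree of height h has at least 2^h nodes, which can be shown by induction on h. The height grows to h+1 only when the attached (smaller) tree has height h; that tree has at least 2^h nodes and the larger tree has at least as many, so the result has at least 2^(h+1) nodes. A tree with n nodes therefore has height at most log2(n).
　　Alternatively, unions can be done by height.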
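
　　A minimal sketch of union by size with the negative-size convention might look as follows. The names `make_sets`, `union_by_size`, and `set_size` and the capacity are assumptions for illustration; unlike the basic version, this union accepts arbitrary elements and looks up their roots itself.

```c
#include <assert.h>

#define MAX_N 1000         /* assumed capacity; elements are numbered 1..MAX_N */
static int up[MAX_N + 1];  /* root: -(size of its tree); otherwise: parent index */

/* Create n singleton trees, each of size 1. */
void make_sets(int n) {
    assert(n <= MAX_N);
    for (int i = 1; i <= n; i++) up[i] = -1;
}

/* Follow parent links until a negative entry (a root) is reached. */
int find(int x) {
    while (up[x] > 0) x = up[x];
    return x;
}

/* Hang the root of the smaller tree under the root of the larger one. */
void union_by_size(int x, int y) {
    int rx = find(x), ry = find(y);
    if (rx == ry) return;             /* already in the same set */
    if (up[rx] > up[ry]) {            /* less negative means smaller tree */
        int t = rx; rx = ry; ry = t;  /* make rx the larger root */
    }
    up[rx] += up[ry];                 /* sizes add up (both are negative) */
    up[ry] = rx;
}

/* Size of the set that contains x. */
int set_size(int x) {
    return -up[find(x)];
}
```

　　Union by height would follow the same pattern, storing the negated height in the root instead of the negated size.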

Path compression:
　　The code makes the idea clear:

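int find(int i) {
    /* find the root; a root holds a negative value (the negated set size) */
    int r = i;
    while (up[r] > 0)
        r = up[r];

    /* compress the path: point every node between i and r directly at r */
    if (i == r)
        return r;
    int old_parent = up[i];
    while (old_parent != r) {
        up[i] = r;
        i = old_parent;
        old_parent = up[i];
    }
    return r;
}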

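　　A single find still takes O(log n) in the worst case, but because repeated finds keep compressing paths, m finds and n-1 unions take nearly O(m+n) in the worst case. More precisely, combined with union by size the bound is O((m+n)·α(n)), where α is the very slowly growing inverse Ackermann function.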
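
　　Putting both optimizations together, a compact sketch could look like the one below. It swaps in a recursive form of path compression instead of the two-pass loop, and uses it to count the connected components of an undirected graph, one of the connectivity applications mentioned earlier. All names, the capacity, and the edge-list format are assumptions for illustration.

```c
#include <assert.h>

#define MAX_N 1000         /* assumed capacity; nodes are numbered 1..MAX_N */
static int up[MAX_N + 1];  /* root: -(size of its tree); otherwise: parent index */

void make_sets(int n) {
    assert(n <= MAX_N);
    for (int i = 1; i <= n; i++) up[i] = -1;
}

/* Recursive path compression: every node on the path ends up pointing at the root. */
int find(int x) {
    if (up[x] < 0) return x;
    return up[x] = find(up[x]);
}

/* Union by size; returns 1 when two sets were merged, 0 when x and y
   were already in the same set. */
int union_sets(int x, int y) {
    int rx = find(x), ry = find(y);
    if (rx == ry) return 0;
    if (up[rx] > up[ry]) { int t = rx; rx = ry; ry = t; }  /* rx = larger root */
    up[rx] += up[ry];
    up[ry] = rx;
    return 1;
}

/* Count the connected components of a graph with nodes 1..n and m edges. */
int count_components(int n, int edges[][2], int m) {
    int components = n;
    make_sets(n);
    for (int i = 0; i < m; i++)
        components -= union_sets(edges[i][0], edges[i][1]);
    return components;
}
```

　　Each successful union reduces the number of components by one, so the count starts at n and drops once per edge that joins two different sets.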

Tags: data structures, union-find, disjoint

Original article: http://blog.csdn.net/kzq_qmi/article/details/46367187