Union-find
1 Steps to develop a usable algorithm
- model the problem
- find an algorithm to solve it
- check: is it fast enough? does it fit in memory?
- if not, figure out why
- find a way to address the problem
- iterate until satisfied
2 Model the problem
Given a set of N objects.
→ Union command : connect two objects
→ Find/connected query: is there a path connecting the two objects?
3 Application
- computers in a network
- friends in a network
4 Interface specifications
public class UF
→ void union(int p,int q)
→ boolean connected(int p,int q)
5 Hints
Build clients or tests before you write the implementation.
6 Solutions
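As a rough illustration of the hint above, a client can be written against the two interface methods before any real implementation exists. The sketch below is in Java; the names `UFClientDemo` and `run` are hypothetical, and the nested `UF` is only the simplest stand-in that satisfies the interface, not a recommended implementation.

```java
// Hypothetical demo: UFClientDemo and run() are illustration-only names.
class UFClientDemo {
    // Simplest stand-in for the UF interface: every element stores a component id.
    static class UF {
        private final int[] id;

        UF(int n) {
            id = new int[n];
            for (int i = 0; i < n; i++) id[i] = i;
        }

        void union(int p, int q) {
            int pid = id[p], qid = id[q];
            for (int i = 0; i < id.length; i++) {
                if (id[i] == pid) id[i] = qid;
            }
        }

        boolean connected(int p, int q) {
            return id[p] == id[q];
        }
    }

    // The client itself: it only uses union() and connected().
    static String run() {
        UF uf = new UF(6);
        uf.union(0, 1);
        uf.union(1, 2);
        uf.union(4, 5);
        return uf.connected(0, 2) + " " + uf.connected(0, 3) + " " + uf.connected(5, 4);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "true false true"
    }
}
```

Because the client touches nothing but `union` and `connected`, the stand-in can later be swapped for any of the implementations in the Solutions section without changing the client.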
6.1 Quick Find
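6.1.1 Data structure
data structure → array
1 array id[] of size N
2 p and q are connected iff they have the same id.
Because the data structure is an array, reading an id and comparing two ids both take constant time → QuickFind
- find: check whether the two elements have the same id
- union: scan the array and set every entry equal to id[p] to id[q]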
    public class QuickFindUF {
        private int[] id;

        // Every element starts in its own component.
        public QuickFindUF(int N) {
            id = new int[N];
            for (int i = 0; i < N; i++) {
                id[i] = i;
            }
        }

        public void union(int p, int q) {
            int pid = id[p];
            int qid = id[q];
            // Scan the whole array: with 1-2 already joined, union(2, 3)
            // must change the id of 1 as well as 2 so that 1, 2, 3 form one group.
            for (int i = 0; i < id.length; i++) {
                if (id[i] == pid) id[i] = qid;
            }
        }

        public boolean connected(int p, int q) {
            return id[p] == id[q];
        }
    }
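6.1.2 Efficiency
connected → constant time
union → every call scans the whole array once
The array has size N, and joining all elements into one component takes N-1 unions →
total union time is proportional to N^2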
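The quadratic growth can be observed by counting how many array entries quick-find examines while joining N elements into one component. This is a minimal sketch in Java; `QuickFindCost` and `scansForChain` are hypothetical names, and the union loop is inlined so that the counter is visible.

```java
// Hypothetical helper: counts the id[] entries examined by quick-find unions.
class QuickFindCost {
    // Performs union(0,1), union(1,2), ..., union(n-2,n-1) and returns
    // the total number of array entries examined by the scans.
    static long scansForChain(int n) {
        int[] id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;

        long examined = 0;
        for (int p = 0; p + 1 < n; p++) {
            int pid = id[p], qid = id[p + 1];
            for (int i = 0; i < n; i++) {   // one full scan per union
                examined++;
                if (id[i] == pid) id[i] = qid;
            }
        }
        return examined;                    // (n - 1) scans of n entries each
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 4000; n *= 2) {
            System.out.println(n + " objects: " + scansForChain(n) + " entries examined");
        }
    }
}
```

Doubling the number of objects roughly quadruples the count, which is the N^2 behavior.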
6.2 Quick Union
Figure 1: Quick Union
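As the figure shows, quick-union uses the same data structure as quick-find but interprets it differently: id[i] is the parent of i, and a root is its own parent.
- find: check whether the two elements have the same root
- union: set the parent of p's root to q's root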
    public class QuickUnionUF {
        private int[] id;   // id[i] is the parent of i; a root is its own parent

        public QuickUnionUF(int N) {
            id = new int[N];
            for (int i = 0; i < N; i++) {
                id[i] = i;
            }
        }

        public void union(int p, int q) {
            int proot = findRoot(p);
            int qroot = findRoot(q);
            id[proot] = qroot;
        }

        public boolean connected(int p, int q) {
            return findRoot(p) == findRoot(q);
        }

        private int findRoot(int x) {
            if (x == id[x]) return x;
            else return findRoot(id[x]);
            // or, iteratively:
            // while (x != id[x]) x = id[x];
            // return x;
        }
    }
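6.2.1 Efficiency
When all N elements form one tall, chain-like tree, checking whether the last element is connected to the first takes about N array accesses.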
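The worst case is easy to reproduce: with plain quick-union, the unions (0,1), (1,2), (2,3), ... build a single chain. The Java sketch below uses hypothetical names (`QuickUnionWorstCase`, `chain`, `hopsToRoot`) and counts the parent links followed from element 0 to the root.

```java
// Hypothetical demo of the degenerate (chain-shaped) quick-union tree.
class QuickUnionWorstCase {
    static int findRoot(int[] id, int x) {
        while (x != id[x]) x = id[x];
        return x;
    }

    // Applies quick-union to (0,1), (1,2), ..., (n-2,n-1) and returns the parent array.
    static int[] chain(int n) {
        int[] id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;
        for (int p = 0; p + 1 < n; p++) {
            int proot = findRoot(id, p);
            int qroot = findRoot(id, p + 1);
            id[proot] = qroot;          // p's root goes under q's root
        }
        return id;
    }

    // Number of parent links followed from x up to its root.
    static int hopsToRoot(int[] id, int x) {
        int hops = 0;
        while (x != id[x]) {
            x = id[x];
            hops++;
        }
        return hops;
    }

    public static void main(String[] args) {
        int n = 8;
        System.out.println(hopsToRoot(chain(n), 0)); // prints 7, i.e. n - 1
    }
}
```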
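6.2.2 Improvement: weighting
The first thing to improve is the height of the trees: when merging, always attach the smaller tree under the root of the larger tree.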
Figure 2: Quickunion and weighted Quickunion
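The main code change is:
    // size[r] is the number of elements in the tree rooted at r (initially 1)
    public void union(int p, int q) {
        int proot = findRoot(p);
        int qroot = findRoot(q);
        if (proot == qroot) return;   // already connected, nothing to merge
        // Compare the weights of the two roots, not of p and q themselves.
        if (size[proot] > size[qroot]) {
            id[qroot] = proot;
            size[proot] += size[qroot];
        } else {
            id[proot] = qroot;
            size[qroot] += size[proot];
        }
    }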
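The depth of any node is at most lg N.
Analysis: the depth of a node changes only during a union, and each union increases it by at most 1.
Suppose x is in tree T1. The depth of x increases by 1 only when T1 is attached under another tree T2, which requires the weight (number of elements) of T2 to be >= that of T1.
After the merge, the tree containing x is therefore at least twice as large as before. By the same argument, every further +1 in depth doubles the size again.
The tree containing x starts at size 1 and can never exceed N, so after y doublings 2^y <= N → y_max = lg N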
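The bound can also be checked empirically. The Java sketch below is an assumption-laden illustration, not a proof: `WeightedDepthCheck`, `maxDepthAfterRandomUnions`, and `floorLg` are hypothetical names, and the random union sequence and seed are arbitrary choices.

```java
// Hypothetical check that weighted quick-union keeps every node at depth <= lg N.
class WeightedDepthCheck {
    final int[] id;     // id[i] is the parent of i
    final int[] size;   // size[r] is the number of elements in the tree rooted at r

    WeightedDepthCheck(int n) {
        id = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            size[i] = 1;
        }
    }

    int root(int x) {
        while (x != id[x]) x = id[x];
        return x;
    }

    void union(int p, int q) {
        int pr = root(p), qr = root(q);
        if (pr == qr) return;
        // Attach the smaller tree under the root of the larger one.
        if (size[pr] < size[qr]) { id[pr] = qr; size[qr] += size[pr]; }
        else                     { id[qr] = pr; size[pr] += size[qr]; }
    }

    int depth(int x) {
        int d = 0;
        while (x != id[x]) { x = id[x]; d++; }
        return d;
    }

    static int maxDepthAfterRandomUnions(int n, long seed) {
        java.util.Random rnd = new java.util.Random(seed);
        WeightedDepthCheck uf = new WeightedDepthCheck(n);
        for (int k = 0; k < 4 * n; k++) uf.union(rnd.nextInt(n), rnd.nextInt(n));
        int max = 0;
        for (int i = 0; i < n; i++) max = Math.max(max, uf.depth(i));
        return max;
    }

    // floor(lg n) for n >= 1
    static int floorLg(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("max depth " + maxDepthAfterRandomUnions(n, 42L)
                + " <= lg N = " + floorLg(n));
    }
}
```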
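6.2.3 Efficiency summary
| alg                  | construct | union | connected |
| quickfind            | N         | N     | constant  |
| quickunion           | N         | N     | N         |
| quickunionWithWeight | N         | lg N  | lg N      |
Worst-case array accesses. For the two quick-union variants, union includes the cost of finding both roots; linking the roots afterwards takes constant time.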
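6.2.4 Improvement: path compression
Every time we look up a root, we walk through all ancestors of the node. While walking, we can re-point the visited nodes closer to the root, so the tree becomes flatter over time. Only one extra line of code is needed; it points each visited node at its grandparent: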
    private int findRoot(int x) {
        // if (x == id[x]) return x;
        // else return findRoot(id[x]);
        // or
        while (x != id[x]) {
            id[x] = id[id[x]];   // !!! flatten: point x at its grandparent
            x = id[x];
        }
        return x;
    }
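The one-line change in the loop above moves each visited node up to its grandparent. A variant that points every visited node directly at the root needs a second pass over the path. The Java sketch below shows that two-pass form; `PathCompressionSketch` is a hypothetical name, and the parent array is passed in explicitly so the example stands on its own.

```java
// Hypothetical two-pass variant: every node on the path ends up pointing at the root.
class PathCompressionSketch {
    static int findRoot(int[] id, int x) {
        // Pass 1: locate the root.
        int root = x;
        while (root != id[root]) root = id[root];

        // Pass 2: re-point every node on the path directly at the root.
        while (x != root) {
            int next = id[x];
            id[x] = root;
            x = next;
        }
        return root;
    }

    public static void main(String[] args) {
        int[] id = {1, 2, 3, 4, 4};          // chain 0 -> 1 -> 2 -> 3 -> 4
        System.out.println(findRoot(id, 0)); // prints 4
        System.out.println(java.util.Arrays.toString(id)); // prints [4, 4, 4, 4, 4]
    }
}
```

The grandparent version does less work per lookup and needs no second loop, while the two-pass version flattens the path completely in a single call.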