Union-find

1. steps to develop a usable algorithm
2. Model the problem
3. Application
4. interface specifications
5. hints:
6. solutions
7. ref

1 steps to develop a usable algorithm

Model the problem.
Find an algorithm to solve it.
Is it fast enough? Does it fit in memory?
If not, figure out why.
Find a way to address the problem.
Iterate until satisfied.

2 Model the problem

Given a set of N objects.
→ Union command : connect two objects
→ Find/connected query: is there a path connecting the two objects?

3 Application

1 computers in a network
2 friends in a network

4 interface specifications

public class UF
→ void union(int p,int q)
→ boolean connected(int p,int q)

5 hints:

build clients or tests before you code
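
A minimal sketch of what such a client could look like, written in Java to match the `boolean` in the interface above. The names `NaiveUF`, `UFClient` and `countEffectiveUnions` are hypothetical and not part of the original notes; `NaiveUF` is only a placeholder so the client compiles, and any class offering the same two methods could be swapped in.

```java
// Placeholder implementation (an assumption, not the final algorithm):
// it exists only so the client below has something to call.
class NaiveUF {
    private final int[] id;

    NaiveUF(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;
    }

    void union(int p, int q) {
        int pid = id[p], qid = id[q];
        for (int i = 0; i < id.length; i++) {
            if (id[i] == pid) id[i] = qid;
        }
    }

    boolean connected(int p, int q) {
        return id[p] == id[q];
    }
}

// Hypothetical test client: feeds pairs to the structure and counts how
// many of them joined two previously separate groups.
class UFClient {
    static int countEffectiveUnions(int n, int[][] pairs) {
        NaiveUF uf = new NaiveUF(n);
        int merged = 0;
        for (int[] pair : pairs) {
            if (!uf.connected(pair[0], pair[1])) {
                uf.union(pair[0], pair[1]);
                merged++;
            }
        }
        return merged;
    }
}
```

Because the client only touches `union` and `connected`, the same pairs can be replayed against each solution in section 6 to compare results.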

6 solutions

6.1 Quick Find

6.1.1 data structure

data structure → array
1 array id[] of size N
2 p and q are connected iff they have the same id.
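Because the data structure is an array, reading an id and comparing two ids takes constant time, hence the name Quick Find.

find: check whether the two elements have the same id.

union: scan the whole array and change every entry whose id equals the id of p to the id of q.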

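// Quick-find implementation of the UF API from section 4.
public class QuickFindUF
{
    private int[] id;

    public void union(int p, int q)
    {
        int pid = id[p];
        int qid = id[q];
        /* The full scan handles chained unions. For example, after
           union(1, 2), a call to union(2, 3) must change the id of 1
           as well as the id of 2, so that 1, 2 and 3 form one group. */
        for (int i = 0; i < id.Length; i++)
        {
            if (id[i] == pid) id[i] = qid;
        }
    }

    public bool connected(int p, int q)
    {
        return id[p] == id[q];
    }

    // Constructor: every element starts in its own group.
    public QuickFindUF(int N)
    {
        id = new int[N];
        for (int i = 0; i < N; i++)
        {
            id[i] = i;
        }
    }
}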

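6.1.2 Efficiency

connected → constant time

union → every call scans the whole array.
The array has size N, and joining all the elements takes N-1 unions →
the total union time is on the order of N^2.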
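
The quadratic growth can be checked by counting how many array entries the union loop examines. The sketch below is in Java, and the class and method names (`QuickFindCost`, `scanCost`) are made up for illustration. It performs the N-1 unions that join all N objects and returns the number of entries examined, which works out to N(N-1).

```java
class QuickFindCost {
    // Counts the id[] entries examined by the quick-find union loop
    // while performing the n-1 unions (0,1), (1,2), ..., (n-2,n-1).
    static long scanCost(int n) {
        int[] id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;

        long examined = 0;
        for (int p = 0; p + 1 < n; p++) {
            int pid = id[p];
            int qid = id[p + 1];
            for (int i = 0; i < n; i++) {
                examined++;                 // one entry looked at per iteration
                if (id[i] == pid) id[i] = qid;
            }
        }
        return examined;
    }
}
```

Doubling N roughly quadruples the count, which is the practical meaning of "quadratic" here.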

6.2 Quick Union

Figure 1: Quick Union

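Quick-union uses the same data structure as quick-find, but interprets it differently: id[i] is now the parent of i, and an element whose parent is itself is a root.

find: check whether the two elements have the same root.
union: set the parent of p's root to q's root.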

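// Quick-union implementation of the UF API from section 4.
public class QuickUnionUF
{
    private int[] id;   // id[i] is the parent of i

    public void union(int p, int q)
    {
        int proot = findRoot(p);
        int qroot = findRoot(q);
        id[proot] = qroot;
    }

    public bool connected(int p, int q)
    {
        return findRoot(p) == findRoot(q);
    }

    // Follow parent links until reaching an element that is its own parent.
    private int findRoot(int x)
    {
        if (x == id[x]) return x;
        else return findRoot(id[x]);
        // or, iteratively:
        // while (x != id[x]) x = id[x];
        // return x;
    }

    // Constructor: every element starts as the root of its own tree.
    public QuickUnionUF(int N)
    {
        id = new int[N];
        for (int i = 0; i < N; i++)
        {
            id[i] = i;
        }
    }
}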

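6.2.1 Efficiency

When all the elements end up in a single tall, chain-like tree, checking whether the last item and the first item are connected can take N array accesses.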
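
A small Java sketch of that worst case, with hypothetical names (`QuickUnionChain`, `worstCaseHops`): the unions (0,1), (1,2), ... build the chain 0 → 1 → ... → N-1, and the method then counts the parent links followed from element 0 up to the root.

```java
class QuickUnionChain {
    // Follows parent links from x to its root.
    static int findRoot(int[] id, int x) {
        while (x != id[x]) x = id[x];
        return x;
    }

    // Number of parent links between x and its root.
    static int hopsToRoot(int[] id, int x) {
        int hops = 0;
        while (x != id[x]) {
            x = id[x];
            hops++;
        }
        return hops;
    }

    // Builds the chain with plain quick-union and measures element 0.
    static int worstCaseHops(int n) {
        int[] id = new int[n];
        for (int i = 0; i < n; i++) id[i] = i;
        for (int p = 0; p + 1 < n; p++) {
            int proot = findRoot(id, p);
            int qroot = findRoot(id, p + 1);
            id[proot] = qroot;          // root of p goes under root of p+1
        }
        return hopsToRoot(id, 0);
    }
}
```

The hop count grows linearly with N, so a single find on this input is as slow as a quick-find union.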

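6.2.2 Improvement: weighting

The first thing to improve is the height of the trees: when merging, always link the root of the smaller tree under the root of the larger tree, never the other way round.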

Figure 2: Quickunion and weighted Quickunion

The main code change is as follows:

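// size[r] is the number of elements in the tree rooted at r
// (an extra int[] field, initialized to 1 for every element).
public void union(int p, int q)
{
    int proot = findRoot(p);
    int qroot = findRoot(q);
    if (proot == qroot) return;       // already connected, nothing to merge
    if (size[proot] > size[qroot])    // compare the roots, not p and q
    {
        id[qroot] = proot;
        size[proot] += size[qroot];
    }
    else
    {
        id[proot] = qroot;
        size[qroot] += size[proot];
    }
}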

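The depth of any node is at most lg N.

Analysis: trees only grow during union. Suppose x is in tree T1. The depth of x increases by 1 only when T1 is linked under another tree T2, and that happens only when the weight (number of elements) of T2 is >= the weight of T1.

After such a merge, the tree containing x has at least doubled in size. So every time the depth of x increases by 1, the size of its tree at least doubles.

The tree containing x starts with 1 element and can never hold more than N, so after y doublings 1 * 2^y <= N → y_max = lg N.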
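
To see the bound in practice, here is a self-contained Java sketch of weighted quick-union with a helper that reports the deepest node. The class name `WeightedQuickUnionSketch` and the helpers `depth` and `maxDepth` are assumptions added for illustration; they are not part of the original notes.

```java
class WeightedQuickUnionSketch {
    private final int[] id;    // id[i] is the parent of i
    private final int[] size;  // size[r] = elements in the tree rooted at r

    WeightedQuickUnionSketch(int n) {
        id = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            size[i] = 1;
        }
    }

    int findRoot(int x) {
        while (x != id[x]) x = id[x];
        return x;
    }

    boolean connected(int p, int q) {
        return findRoot(p) == findRoot(q);
    }

    void union(int p, int q) {
        int proot = findRoot(p);
        int qroot = findRoot(q);
        if (proot == qroot) return;
        if (size[proot] > size[qroot]) {   // smaller tree goes under larger
            id[qroot] = proot;
            size[proot] += size[qroot];
        } else {
            id[proot] = qroot;
            size[qroot] += size[proot];
        }
    }

    // Illustration-only helper: parent links between x and its root.
    int depth(int x) {
        int d = 0;
        while (x != id[x]) {
            x = id[x];
            d++;
        }
        return d;
    }

    // Illustration-only helper: depth of the deepest node.
    int maxDepth() {
        int max = 0;
        for (int i = 0; i < id.length; i++) max = Math.max(max, depth(i));
        return max;
    }
}
```

Merging equal-sized trees pairwise (sizes 1, 2, 4, ...) is the input that pushes the depth up to lg N; the chain-style input that was the worst case for plain quick-union stays almost flat.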

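6.2.3 Efficiency summary

alg	construct	union	connected
quick-find	N	N	constant
quick-union	N	N *	N
weighted quick-union	N	lg N *	lg N

* worst case, including the cost of finding the two roots; the link itself takes constant time.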

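6.2.4 Improvement: path compression

Every time we look for a root, we walk through all the ancestors of the node. While walking, we can re-point the visited nodes closer to the root (here, each one is linked to its grandparent), so the whole tree gradually flattens out. It takes only one extra line of code: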

private int findRoot(int x)
 {
     //if (x == id[x]) return x;
     //else return findRoot(id[x]);
     //or
     while (x != id[x])
     {
         id[x] = id[id[x]]; // flatten: link x to its grandparent
         x = id[x];
     }
     return x;
 }
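
The effect of that one line can be traced on a plain chain. The Java sketch below uses hypothetical names (`PathHalvingDemo`, `hopsToRoot`) and works on a bare parent array rather than a full UF class, so the array can be inspected before and after a single find.

```java
class PathHalvingDemo {
    // Same loop as above: each visited node is re-linked to its grandparent.
    static int findRoot(int[] id, int x) {
        while (x != id[x]) {
            id[x] = id[id[x]];
            x = id[x];
        }
        return x;
    }

    // Number of parent links between x and its root (does not modify id).
    static int hopsToRoot(int[] id, int x) {
        int hops = 0;
        while (x != id[x]) {
            x = id[x];
            hops++;
        }
        return hops;
    }

    // Builds the chain 0 -> 1 -> ... -> n-1, where n-1 is the root.
    static int[] chain(int n) {
        int[] id = new int[n];
        for (int i = 0; i < n; i++) id[i] = Math.min(i + 1, n - 1);
        return id;
    }
}
```

On a chain of 11 elements, element 0 starts 10 links away from the root. One call to `findRoot(id, 0)` leaves it 5 links away, because the visited nodes now skip every other ancestor. Repeated finds keep shortening the path.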

6.3 Applications

7 ref

Princeton University
