title | toc | date | tags | top | |
---|---|---|---|---|---|
Topic 1 - Union Find |
false |
2017-10-30 |
|
1 |
The graph connectivity problem is the following: Given an undirected graph G, preprocess the graph so that queries of the form “are nodes
Connected components: Maximal set of objects that are mutually connected.
Union-Find(并查集) API:
- the union() operation merges two components if the two sites are in different components.
- the find() operation returns an integer component identifier for a given site.
- the connected() operation determines whether two sites are in the same component.
- the count() method returns the number of components.
Quick-Find Idea: all sites in a component must have the same value in id[].
This method is called quick-find because find(p) just returns id[$p$], which immediately implies that connected(p, q) reduces to just the test id[$p$] == id[$q$] and returns true, if and only if
Union(): To combine the two components into one, we have to make all of the id[] entries corresponding to both sets of sites the same value. To do so, we go through the array, changing all the entries with values equal to id[$p$] to the value id[$q$].
public class QuickFind {
private int[] id;
public QuickFind(int N) {
id = new int[N];
// set id of each object to itself
for (int i = 0; i < N; i++) id[i] = i;
}
public boolean connected(int p, int q) {
return id[p] == id[q];
}
public int find(int p) {
return id[p];
}
public void union(int p, int q) {
int pid = id[p];
int qid = id[q];
// change all entries with id[p] to id[q]
for (int i = 0; i < id.length; i++)
if (id[i] == pid) id[i] = qid;
}
}
The Quick-Find algorithm uses one array access for each call to find() and between
WORST CASE: suppose we wind up with a single component. This requires at least
Data structure:
- Interpretation: id[$i$] is parent of
$i$ . - Root of
$i$ is id[id[id[...id[$i$]...]]] (keep going until it doesn’t change)
Specifically, the id[] entry, for each site is the name of another site in the same component (possibly itself) — we refer to this connection as a link. To implement find(), we start at the given site, follow its link to another site, follow that site’s link to yet another site, and so forth, following links until reaching a root, a site that has a link to itself. Two sites are in the same component if and only if this process leads them to the same root.
union(p, q): we follow links to finds the roots associated with
public class QuickUnion {
private int[] id;
private int count;
public QuickUnion(int N) {
count = N;
id = new int[N];
// set id of each object to itself (N array accesses)
for (int i = 0; i < N; i++) id[i] = i;
}
private int find(int p) {
// chase parent pointers until reach root
while (p != id[p]) p = id[p];
return p;
}
public boolean connected(int p, int q) {
//check if p and q have same root
return find(p) == find(q);
}
public void union(int p, int q) {
// change root of p to point to root of q
int i = find(p);
int j = find(q);
id[i] = j;
count--;
}
}
The number of array accesses used by find() in quick-union is 1 plus the twice the depth of the node corresponding to the given site. The number of array accesses used by union() and connected() is the cost of the two find() operations (plus 1 for union() if the given sites are in different trees).
WORST CASE: suppose we wind up with a single component, the running time is quadratic.
Weighted quick-union: Rather than arbitrarily connecting the second tree to the first for union(), we keep track of the size of each tree and always connect the smaller tree to the larger. This change needs another array to hold the node counts.
public class WeightedQuickUnion {
private int[] id; // parent link (site indexed)
private int[] sz; // size of component for roots (site indexed)
private int count; // number of components
public WeightedQuickUnion(int N) {
count = N;
id = new int[N];
for (int i = 0; i < N; i++) id[i] = i;
sz = new int[N];
for (int i = 0; i < N; i++) sz[i] = 1;
}
public int count() {
return count;
}
public boolean connected(int p, int q) {
return find(p) == find(q);
}
private int find(int p) { // Follow links to find a root.
while (p != id[p]) p = id[p];
return p;
}
public void union(int p, int q) {
int i = find(p);
int j = find(q);
if (i == j) return;
// Make smaller root point to larger one.
if (sz[i] < sz[j]) {
id[i] = j; sz[j] += sz[i];
} else {
id[j] = i; sz[i] += sz[j];
}
count--;
}
}
The depth of any node in a forest built by weighted quick-union for
The worst case for weighted quick union is when the sizes of the trees to be merged by union() are always equal (and a power of 2). And they have the simple property that the height of a tree of
For weighted quick-union with
Specifically, the weighted quick-union algorithm uses at most
IDEA: Just after computing the root of
Two-pass implementation: add second loop to find() to set the id[] of each examined node to the root.
Simpler one-pass variant: Make every other node in path point to its grandparent (thereby halving path length).
private int find(int p) { // Follow links to find a root.
while (p != id[p]) {
// only one extra line of code !
id[p] = id[id[p]];
p = id[p];
}
return p;
}
Starting from an empty data structure, any sequence of