Previous: KruskalMST, Up: Graphs
The union-find system maintains a collection of disjoint sets that cover a "universal" set of interest. For example the universe might be the set of vertices of a graph. At any time the sets in the collection are disjoint and cover the universe. This means that each element of the universe is in exactly one set of the collection.
The key functions are initialize(), union(x, y), y = find(x). We will use a metaphore of companies. The elements in the universe are "workers", find(x) returns the CEO of the company x works for. union(x, y) performs a merger of the companies that x and y work for. initialize() establishes a situation in which everyone is self employed, each worker in the universe is a company of one.
The system works by maintaining a vector of "bosses". For each worker x, boss[x] records the current boss of x. If x is a CEO, boss[x] == x. Union is by rank. Always rank[x] < rank[boss[x]], unless x is a CEO.
initialize(): for all x, boss[x] = x, rank[x] = 0.
union(x, y) a = find(x); b = find(y); if (rank(a) < rank(b)) boss[a] = b; if (rank(a) > rank(b)) boss[b] = a; if (rank(a) == rank(b)) {boss[b] = a; rank[a] += 1;}
find(x) if (boss[x] == x] return x; boss[x] = find(boss[x]); // path compression return boss[x];
The union by rank ensures depth of company heirarchy is at most lg(n), for a universe of size n.
The path compression ensures that a sequence of k union and find operations costs O(k log*(n)).