The following article is an example of a .NET implementation of Kruskal’s algorithm for finding the minimum spanning tree of a connected, undirected graph.
The minimum spanning tree for a graph is the set of edges that connects all nodes and has the lowest cost.
In order to be able to run this solution, you will need .NET 4.0. The example was constructed using Visual Studio 10, and WPF for the graphical representation.
In this article, I will be using a large portion of the code that I used to exemplify Dijkstra’s algorithm for finding the minimum distance path between two nodes
in a connected undirected graph. You can find this example here.
Similar to Dijsktra, Kruskal’s algorithm is also characterized as “greedy.” Even though these algorithms are quite similar, their approaches are quite different since
their goals are simply not the same. In the case of minimum spanning tree, there are some additional structures that we will be using, making the understanding of the algorithm
a little bit more difficult.
In addition, the minimum spanning tree is a more abstract concept and a little more difficult to compare to a practical situation.
In the case of minimum distance algorithm, we thought of the graph as a map of cities where the nodes represent the cities and the edges – the roads that connect them.
We could imagine a similar scenario for a minimum spanning tree. For example, we could imagine that we are an internet provider that is currently building their network
in a given country and that we use optic fiber to connect all the cities in it. Since optic fiber costs us money, we would like to use as little as possible. The minimum spanning
tree algorithm can help us do that.
As already mentioned, Kruskal’s minimum spanning tree is similar to Dijkstra’s shortest path in the way that both are “greedy” algorithms.
When we were looking for the shortest path, we were trying to select the best possible edge from the node with the smallest total cost incurred.
The reason why this is an efficient strategy is because it reduces the number of nodes that we need to visit in order to get to the target.
In order to find the minimum spanning tree, we need a slightly different approach. To construct the tree in an efficient manner, we need to visit all nodes by visiting
as few edges as possible. Therefore, it makes sense to collect all the edges of the graph and have them ordered so that we consider the least costly ones first.
However, that is not enough since it does not guarantee us that we will select only the edges that we need to construct the minimum spanning tree. Let’s consider
the following simple case. We have a graph with three nodes that are interconnected. We start collecting the edges to construct the minimum spanning tree.
We start with the edge between nodes 1 and 2. The next logical one would be the edge between nodes 2 and 3. At this point, there is a direct connection between
nodes 1 and 2, 2 and 3, and there is an indirect connection between 1 and 3 by going through node 2. Therefore, it would be a mistake to add the last edge to the minimum spanning tree.
Therefore, the minimum spanning tree consists only of the edges between nodes 1 and 2 and 2 and 3. We need a way to keep track of the edges that we have already
selected for the minimum spanning tree and the nodes that we have visited. In order to do that, we will use an additional structure that we will call cluster.
A cluster is simply a collection of nodes that we have already visited. As we keep adding edges to the minimum spanning tree, we need to check if each node the edge
connects is already in the cluster or not. If both nodes have been already added, the edge should not be included in the minimum spanning tree, even if it is the shortest edge in the list.
Having defined the role of the cluster, it becomes clear that at the start of the algorithm, we will need to have a cluster for each node.
As we keep adding edges to the minimum spanning tree, we will keep merging the clusters until we are left with what encompasses all the nodes in the graph.
The best way to exemplify the algorithm is to view an example that is a bit more complete than the example before.
Starting with the edge with the minimum cost, we will first add the edge between 2 and 5 to the minimum spanning tree and we will first merge the clusters of nodes 2 and 5.
We will apply the same logic for nodes 1 and 6 and the edge between them. The next edge we should consider is the one between nodes 3 and 4 or between 6 and 4
since they have the same cost of 15. Let’s imagine that we take the edge between nodes 3 and 4. We find ourselves in the same situation as before.
The nodes are part of different clusters; therefore, we proceed in the same fashion. The next edge in line is the one between nodes 4 and 6.
The nodes are from different clusters: node 6 is in the same cluster as node 1, and node 4 is in the same cluster as node 3. We add the edge to the minimum
spanning tree and merge the two clusters into a single that will contain nodes 1, 6, 4, and 3. At this point, there are only two clusters left, the one we just
constructed and the one containing nodes 2 and 5. Therefore, the next edge that we select must allow us to merge these two clusters and it will be the last edge
in the minimum spanning tree. The edge with lowest cost is the one between nodes 4 and 5. This yields a minimum spanning tree with a total cost of 63.
Using the Application
In order to use the application, first the user needs to create the nodes and the edges between them. To create a node, the user needs to click on the canvas.
In order to connect two nodes, the user needs to consecutively select two nodes created on the canvas. To find the minimum spanning tree, the user needs to click
on “Find Minimum Span Tree”. The “Clear” button allows the user to clear the canvas and build a new graph. “Restart” allows the user to go back to the initial condition
of the graph before finding the minimum distance.