Critical Analysis of Kruskal’s Algorithm

Rishabh Khandelwal
7 min readApr 11, 2022

--

Let’s start with introducing Kruskal’s algorithm in easy terms.

Introduction —

Kruskal’s algorithm is one of the most popular algorithms for finding the Minimum Spanning Tree(MST) of an undirected edge-weighted graph.
It is a greedy algorithm . This algorithm was written by Joseph Kruskal and it first appeared in Proceedings of the American Mathematical Society in the year 1956.

In Kruskal’s algorithm where we add the lowest-weighted edges in a new graph keeping in mind that a cycle must never be formed. We do this until we have all the nodes or vertices in the graph. This method of finding the minimum spanning tree is called Kruskal’s algorithm.

Deep Discussion on Kruskal’s Algorithm

Basic idea of Kruskal Algorithm —

Assume a connected graph G1, with n number of vertices — G1 = {V, E}
Also assume a non-connected graph G2, without edges — G2 = {V, φ}

We first increase the cost of all the edges in graph G1, and then eliminate the edge with the least cost. If the selected edge falls on two different vertices, it can be directly selected into the graph G2. Otherwise, give up the edge, and then again choose the minimum cost edge from the remaining edges and add it to the graph G2. This step is to be repeated again and again until all the vertices are merged on the newly made graph G2. In total, n-1 number of edges will be there in G2, and the minimum cost of the spanning tree would be equal to sum of weights of all the edges.

The procedure of Kruskal Algorithm —

Implementation:

  1. Sort all the edges present by increasing order of their weights.
  2. Select the edge with the lowest weight, and use it to connect to the vertices of the graph. Here, we keep in mind that a cycle or loop must never be formed. If a cycle or loop is formed by selecting any edge, then we don’t include that edge in the graph.
  3. Step 2 is repeated over and over until all the vertices are connected, and a minimum spanning tree is obtained.

Pen and paper implementation:

Let’s solve this by taking graph in Figure 1 as an example.

Figure 1

First step is to define an empty new graph. Then, we need to choose an edge with the least possible weight, After this, we’ll be choosing edge connecting the vertices D and E, as it has the least weight being 1 units.

Figure 2 shows how our graph will look after completing these steps.

Figure 1

Then, we’ll again choose the edge with least weight. That edge is DC with 2 units weight.

Figure 3 shows how our graph will look after completing these steps.

Figure 3

Now, we’ll again look for the edge with least weight. That edge is CA, connecting vertices C and A with weight 3 units.

Figure 4 shows how our graph will look after completing these steps.

Figure 4

After this, we’ll again look for the edge with least weight. That edge is edge CE with 4 units weight. But, we’ll disregard that edge because the vertex C and E are already present, making a loop.

So, we’ll look for the next least weighted edge, which is BD with 6 units weight.

The obtained result is shown in Figure 5. It’s the minimum spanning tree possible from the graph.

Figure 5

Let’s calculate the weight of this minimum spanning tree.

Weight of DE = 1 units,
Weight of DC = 2 units,
Weight of CA = 3 units,
Weight of DB= 6 units

Weight of minimum spanning tree = Sum of all the weights
So, Weight of MST = 1 + 2 + 3+ 6 = 12 units

Algorithm:

KRUSKAL(G):
A = ∅
For each vertex v ∈ G.V:
MAKE-SET(v)
sort the edges of G.E into increasing order by weight
For each edge (u, v) ∈ G.E ordered by increasing order by weight(u, v):
if FIND-SET(u) ≠ FIND-SET(v):
A = A ∪ {(u, v)}
UNION(u, v)
return A

Detailed time complexity analysis of Kruskal’s algorithm —

Calculating time complexity

Time complexity analysis

T(n) = O(1) + O(V) + O(E log E) + O(V log V)
T(n) =O(E log E) + O(V log V)
[As |E| ≥ |V| — 1]
T(n) = E log E + E log E
T(n) = E log E

Hence, time complexity of Kruskal’s algorithm is O(E log E), or we can also say O(E log V) because;

E <= V * V
So, by taking log on both sides, log(E)<=2 log(V)

Hence, time complexity of Kruskal is O(E log E) or O(E log V).

Detailed analysis of time complexity

  • All the edges are maintained as min heap.
  • The next edge can be obtained in O(log E) runtime, where E is number of edges.
  • Reconstruction of heap takes O(E) time.
  • Hence, this algorithm takes O(E log E) time.
  • The value of E can be at most O(V²).
  • So, O(log V) and O(log E) are same.
  • Therefore, time complexity of Kruskal’s Algorithm is O(E log E) or O(E log V).

Worst case time complexity of Kruskal’s Algorithm

Worst case will happen when;

  • the graph contains maximum number of edges possible which equals to n(n-1)/2,
  • all the edges are not sorted, and
  • our MST includes the edge with maximum weight.

In this situation, we need to check all the edges and make sure all of the edges with least weight are in the minimum spanning tree. Then,

Sorting all the edges by using merge sort would take O(E log E) time. We sort it by merge sort because that is the most optimal choice.

The MAKE-SET function would take a runtime of O(V), as it creates a subset for all of the vertices.

Union and find function makes sure that the final tree structure stays at minimum height, resulting in an overall execution time of O(E log E) for both the functions.

Therefore, in the worst case, the total complexity comes out to be O(E log E), as shown below:

T(n) = E log E + V + E log E
T(n) = E log E

Best case time complexity of Kruskal’s Algorithm

Best case will be when;

  • the graph has lowest number of possible edges connecting all the nodes or vertices
  • all the edges have already been sorted.

Then,

All the edges are already sorted, therefore it’ll have O(1) runtime.

The MAKE-SET will have the same runtime as the worst case, which is O(V).

Theoretically, the union and find functions would have runtime of Log(E) for each edge. When we do this for all valid edges and vertices, the total time complexity will be O(E log E).

As E = V-1, the total time complexity comes out to be O(E log E).

Average case time complexity of Kruskal’s Algorithm

In average case;

The sorting will take place. We will use merge sort for this as used in worst case situation. Therefore, it will take O(E log E) runtime.

Again, the MAKE-SET function would take a runtime of O(V), as it creates a subset of all of the vertices.

In union and find function, we consider two subtrees or two vertices at a time. So, the execution time here will be same as worst case, i.e., O(E log E).

Hence, the time complexity in average case equals to O(E log E).

Space complexity of Kruskal’s Algorithm

A space requirement of O(V) to keep track of all vertices in the beginning and the respective subsets.

To keep track of all the vertices and the subsets, O(V) space is required.

A space requirement of O(E) to keep track of all the valid sorted edges to be included in the final MST.

Therefore, the total space complexity turns out to be O(E+V).

Summary —

Worst case time complexity: O(E log E)
Best case time complexity: O(E log E)
Average case time complexity: O(E log E)
Space complexity: O(E+V)

Difference between Prim’s and Kruskal’s Algorithm

Before jumping into difference, let’s briefly discuss Prim’s algorithm.

Prim’s Algorithm

Just like Kruskal’s algorithm, Prim’s algorithm is also used for finding the Minimum Spanning Tree(MST) of an undirected edge-weighted graph.
Prim’s algorithm is also a greedy algorithm. According to Prim’s algorithm, we start with a single source node and then we continuously keep on exploring the adjacent nodes of the source node to find the shortest weight possible.

You can read more about Prim’s algorithm in my blog here.

Difference between Prim’s and Kruskal Algorithm

  • We travel through the node only once in Kruskal’s algorithm, whereas in Prim’s we travel through all the nodes more than once to get the minimum difference.
  • The time complexity of Prim’s algorithm is O(E²) whereas the complexity of Kruskal’s algorithm is O(E log V), where,
    Number of edges = E,
    Number of vertices = V
  • Kruskal algorithm is easier to implement when compared to Prim’s.
  • Kruskal is faster when we are finding minimum spanning tree for sparse graphs, whereas Prim’s is faster when we are finding MST for dense graphs(Figure 6 shows sparse and dense graphs).
Figure 6

Advantage of Kruskal’s algorithm over Prim’s algorithm

Modeling problems in real world are big, sparse and potentially disconnected, and Kruskal’s algorithm performs better for sparse graphs, it seems to be a better fit for the real world.

--

--