A Minimum Spanning Tree (MST) is a subset of the edges of a connected, undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. It is a way to connect all the vertices in a graph in a way that minimizes the total weight of the edges in the tree.
MST is a fundamental problem with diverse applications.
Network design
- telephone, electrical, hydraulic, TV cable, computer, road
The standard application is to a problem like phone network design. You have a business with several offices; you want to lease phone lines to connect them up with each other, and the phone company charges different amounts of money to connect different pairs of cities. You want a set of lines that connects all your offices with a minimum total cost. It should be a spanning tree, since if a network isn’t a tree you can always remove some edges and save money.
Approximation algorithms for NP-hard problems
A less obvious application is that the minimum spanning tree can be used to approximately solve the traveling salesman problem. A convenient formal way of defining this problem is to find the shortest path that visits each point at least once.
Note that if you have a path visiting all points exactly once, it’s a special kind of tree. For instance in the example above, twelve of sixteen spanning trees are actually paths. If you have a path visiting some vertices more than once, you can always drop some edges to get a tree. So in general the MST weight is less than the TSP weight, because it’s a minimization over a strictly larger set.
On the other hand, if you draw a path tracing around the minimum spanning tree, you trace each edge twice and visit all points, so the TSP weight is less than twice the MST weight. Therefore this tour is within a factor of two of optimal.
Indirect applications
- Max bottleneck paths
- LDPC codes for error correction
- Image registration with Renyi entropy
- Learning salient features for real-time face verification
- Reducing data storage in sequencing amino acids in a protein
- Model locality of particle interactions in turbulent fluid flows
- Autoconfig protocol for Ethernet bridging to avoid cycles in a network
Cluster analysis
k clustering problem can be viewed as finding an MST and deleting the k-1 most expensive edges.
Image segmentation: MSTs can be used in image segmentation to segment an image into different regions.
Bioinformatics: MSTs are used to construct phylogenetic trees in bioinformatics which represent the evolutionary relationship between different species.
Facility location: MSTs can be used to determine the optimal location of facilities, such as warehouses or power plants, in a network.
Geographic Information Systems(GIS): MSTs can be used in Geographic Information Systems (GIS) to create a map of a region with the minimum possible total distance between the locations.
Ready to dive in? Explore our Free Demo Content and join our DSA course, trusted by over 100,000 neveropen!