Shortest Path: Unweighted Graph

  • Formally describe the shortest path problem.
  • Recognize that (unweighted) shortest path algorithm is a modified BFS.

Let us formally define the shortest path problem:

Shortest Path Problem

Input: Graph G=(V,E)G=(V, E), and a starting vertex sVs \in V.
Goal: Identify a shortest path from ss to every vertex in GG.

Note we are considering the length of a path here. In other words, "shortest path" means "shortest length." (We will soon contrast this with the case where shortest path means shortest cost.)

Lemma: Let GG be a directed or undirected graph. At the conclusion of the modified BFS, for every vertex vVv \in V, the value distance[v] equals the length dist(s,v)\text{dist}(s, v) of a shortest path from ss to vv in GG (or ++ \infty if no such path exists).

This is not the case for DFS!

Why does BFS find the shortest path?

Okay! Let's think about this problem for a moment. We have a source vertex ss and a target vertex tt. We want a shortest path from ss to tt (if exists). What can we do? Well, we can look at all the vertices that are one edge away from ss; if we find tt, we are done. If not, we can look at all vertices that are two edges away from ss, then those that are three edges away from ss, and so on until we find tt. This strategy is what BFS is doing! It explores all (reachable) vertices that are kk edges away from the source vertex ss before visiting any vertex k+1k+1 edges away. So, if there are multiple paths to tt, and one returned by BFS has dd edges, that path is the shortest. If there was a shorter path with, e.g., length of d1d-1, then tt would have been explored before exploring vertices that are dd edges away from the source.

For a formal proof, refer to CLRS, Third Edition, Lemma 22.2 on page 598.

Resources