Shortest Path: Unweighted Graph

Formally describe the shortest path problem.
Recognize that (unweighted) shortest path algorithm is a modified BFS.

Let us formally define the shortest path problem:

Shortest Path Problem

Input: Graph $G=(V, E)$ , and a starting vertex $s \in V$ .
Goal: Identify a shortest path from $s$ to every vertex in $G$ .

Note we are considering the length of a path here. In other words, "shortest path" means "shortest length." (We will soon contrast this with the case where shortest path means shortest cost.)

Lemma: Let $G$ be a directed or undirected graph. At the conclusion of the modified BFS, for every vertex $v \in V$ , the value distance[v] equals the length $\text{dist}(s, v)$ of a shortest path from $s$ to $v$ in $G$ (or $+ \infty$ if no such path exists).

This is not the case for DFS!

Why does BFS find the shortest path?

Okay! Let's think about this problem for a moment. We have a source vertex $s$ and a target vertex $t$ . We want a shortest path from $s$ to $t$ (if exists). What can we do? Well, we can look at all the vertices that are one edge away from $s$ ; if we find $t$ , we are done. If not, we can look at all vertices that are two edges away from $s$ , then those that are three edges away from $s$ , and so on until we find $t$ . This strategy is what BFS is doing! It explores all (reachable) vertices that are $k$ edges away from the source vertex $s$ before visiting any vertex $k+1$ edges away. So, if there are multiple paths to $t$ , and one returned by BFS has $d$ edges, that path is the shortest. If there was a shorter path with, e.g., length of $d-1$ , then $t$ would have been explored before exploring vertices that are $d$ edges away from the source.

For a formal proof, refer to CLRS, Third Edition, Lemma 22.2 on page 598.

Resources

Wikipedia's entry on Shortest Path Problem.

Data Structures

Shortest Path: Unweighted Graph