Directed Graph
A directed graph G = (V, E) is a graph whose edges are ordered pairs of vertices, so E is not necessarily symmetric: (u, v) ∈ E does not imply (v, u) ∈ E.
Topological Sorting
We want to find an ordering < of the vertices such that:
If A can reach B by a path, then A < B (so B cannot also reach A by a path).
If such an ordering exists, we say the directed graph G can be topologically sorted. [If 1 <-> 2, i.e., vertices 1 and 2 have edges in both directions, then the graph can't be topologically sorted (in this version of the definition).]
- Can we tell if a graph can be topologically sorted?
- If yes, how can we find such an ordering?
Acyclic Graph
A directed graph is acyclic if it contains no directed cycle, i.e., no path that starts and ends at the same vertex.
Property: a directed graph G can be topologically sorted if and only if G is acyclic.
“if”: we will design an algorithm that produces the ordering.
“only if”: by the definition, for any pair (a, b) of vertices we cannot have both that a reaches b by a path and that b reaches a by a path. A cycle would contain such a pair, so the graph must be acyclic. [This direction follows directly from the definition.]
We can then define:
- sinks (maxima) as out-degree 0 vertices
- sources (minima) as in-degree 0 vertices
- Sinks and sources are not necessarily unique
Algorithm (remove sources one at a time):
- Let S = V
- Pick a source vertex v in S, i.e., a vertex of the remaining graph with only edges going out
- Remove v from S, remove the edges incident to v from E, and repeat from the previous step until S is empty. The order in which the vertices are removed is a valid topological sort of V
Claim: If $\forall v \in V,\ \text{in-deg}(v) > 0$, then there is a cycle
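Pf: start at any vertex v and walk backward along an edge that comes into v. Since every vertex has at least one incoming edge, the walk never gets stuck; since V is finite, it must eventually return to a vertex we have already seen, which closes a cycle.
Consequently, every nonempty acyclic graph has a source, and removing a vertex keeps the graph acyclic, so the source-removal algorithm above never gets stuck on an acyclic graph.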
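The backward walk in the proof can be sketched in code. This is only an illustration, assuming the graph is given as a hypothetical dict `in_neighbors` that maps every vertex to a non-empty list of the vertices with an edge into it; the function name is made up for this example.

```python
def find_cycle_by_walking_back(in_neighbors, start):
    """Walk backward along incoming edges until a vertex repeats.

    Assumes in_neighbors[v] is a non-empty list of vertices u such that
    (u, v) is an edge, for every vertex v (the hypothesis of the claim).
    Returns the vertices of a cycle, listed in forward edge direction.
    """
    position = {}  # vertex -> index at which it appears in `walk`
    walk = []
    v = start
    while v not in position:
        position[v] = len(walk)
        walk.append(v)
        v = in_neighbors[v][0]  # any incoming edge works
    # walk[position[v]:] was traversed against the edge direction,
    # so reverse it to list the cycle along the edges.
    cycle = walk[position[v]:]
    cycle.reverse()
    return cycle


# Edges: 3->1, 1->2, 2->3, 3->4 (every vertex has in-degree >= 1).
in_nb = {1: [3], 2: [1], 3: [2], 4: [3]}
print(find_cycle_by_walking_back(in_nb, 4))
```

The loop runs at most |V| + 1 times because each iteration either records a new vertex or stops.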
Implementation
If the graph is given by adjacency lists (linked lists), the running time can be O(|V| + |E|), which is O(|E|) when every vertex has at least one edge.
Algorithm:
Start from a vertex v of in-degree 0.
The question is how to efficiently find the next vertex of in-degree 0 once v is removed. If you scan all the vertices again after each removal, the result is an O(|V|^2) algorithm, not a linear-time one.
How about using a stack to store all the vertices whose in-degree is 0?
The idea: maintain two structures, D and B, where D stores the current in-degree of each vertex in the remaining graph and B stores the unexplored vertices of in-degree 0.
(*) B can be a stack. Each time, pop an element b from B to explore (remove), and decrement in D the in-degree of every vertex that b has an outgoing edge to; this takes O(out-degree of b) time. If the in-degree of a vertex u recorded in D becomes zero after the update, push u onto B.
Keep doing (*) until B is empty.
Each vertex is pushed onto and popped from B at most once, and each edge is visited once. Computing the initial in-degrees takes one pass over the vertices and edges, so the total running time is O(|V| + |E|).
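A minimal sketch of the procedure (*) follows, assuming the graph is a dict that maps every vertex to the list of its out-neighbors (every vertex must appear as a key). The names `topological_sort`, `in_degree` (playing the role of D) and `stack` (playing the role of B) are chosen for this example only.

```python
def topological_sort(graph):
    """Return a topological order of `graph`, or None if it has a cycle.

    graph: dict mapping each vertex to a list of its out-neighbors.
    """
    # D: current in-degree of every vertex in the remaining graph.
    in_degree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            in_degree[w] += 1

    # B: unexplored vertices whose in-degree is 0.
    stack = [v for v in graph if in_degree[v] == 0]

    order = []
    while stack:
        b = stack.pop()
        order.append(b)
        # "Remove" b: each out-going edge lowers one in-degree.
        for w in graph[b]:
            in_degree[w] -= 1
            if in_degree[w] == 0:
                stack.append(w)

    # If some vertex was never removed, the remaining graph has no source,
    # which means it contains a cycle.
    if len(order) < len(graph):
        return None
    return order


jobs = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(topological_sort(jobs))
```

Using a queue instead of a stack for B would also work; only the particular valid order that comes out may differ.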
This algorithm is widely applicable. For example, you may have a list of jobs to finish, some of which must be done before others. Or there are 2 workers whose tasks have to be done in some order: how can we reduce the total working time?
DFS on a directed graph, and edge classifications
Edges:
- Tree edges: edges traversed by the DFS to reach a previously unvisited vertex
- Backward edges: edges pointing from a visited vertex to one of its ancestors in the DFS tree
- Forward edges: non-tree edges pointing from a vertex to one of its descendants (exist only in directed graphs)
- Cross edges: everything else (exist only in directed graphs)
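One common way to tell the four kinds apart is to record discovery and finish times during the DFS. The sketch below is an illustration under assumptions: the graph is a dict mapping every vertex to a list of out-neighbors, there are no parallel edges, and the graph is small enough for recursion. The name `classify_edges` is made up for this example.

```python
def classify_edges(graph):
    """Run DFS over the whole graph and label every edge.

    graph: dict mapping each vertex to a list of its out-neighbors.
    Returns a dict mapping each edge (u, v) to one of
    "tree", "backward", "forward", "cross".
    """
    discovered = {}  # vertex -> time at which the DFS first reaches it
    finished = {}    # vertex -> time at which the DFS leaves it for good
    kinds = {}
    clock = 0

    def visit(u):
        nonlocal clock
        discovered[u] = clock
        clock += 1
        for v in graph[u]:
            if v not in discovered:
                kinds[(u, v)] = "tree"
                visit(v)
            elif v not in finished:
                # v is still on the recursion stack, so it is an ancestor of u.
                kinds[(u, v)] = "backward"
            elif discovered[u] < discovered[v]:
                # v was discovered and finished inside u's visit: a descendant.
                kinds[(u, v)] = "forward"
            else:
                kinds[(u, v)] = "cross"
        finished[u] = clock
        clock += 1

    for s in graph:
        if s not in discovered:
            visit(s)
    return kinds


g = {1: [2, 3], 2: [3], 3: [1], 4: [3]}
for edge, kind in classify_edges(g).items():
    print(edge, kind)
```

The labels depend on the order in which vertices and neighbors are visited; a different visiting order can classify the same edge differently.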
Lemmas of DFS on directed graph
Lemma 1: in a directed graph G, if there is a path from u to v, then a DFS starting at u produces a DFS tree that covers every vertex along the path, including v; hence v is a descendant of u in the tree.
Pf: consider the first edge (u, x) on the path. If x has not already been explored via another route from u, the DFS will explore x through this edge after backtracking to u. Either way, x is a descendant of u.
Then, starting from x and the next edge (x, x1) on the path, we can apply the same argument, and so on until we reach v.
Thm
Theorem: a directed graph is acyclic iff a DFS produces no backward edge
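$\Rightarrow$: easy. A backward edge (u, v), together with the tree path from v down to u, forms a directed cycle, so an acyclic graph has no backward edge.
$\Leftarrow$: (by contradiction) suppose there is a directed cycle C; we need to prove that there will be a backward edge. Let v be the vertex on C that the DFS visits first, and let (u, v) be the edge on C pointing to v. Assume every edge on C is a tree edge, a forward edge or a cross edge.
- If (u, v) is a tree edge, then u is an ancestor of v and was visited before v, which is impossible since v is visited first.
- If (u, v) is a forward edge, then again u is an ancestor of v and was visited before v, which is impossible since v is visited first.
- If (u, v) is a cross edge, then u is not a descendant of v, so u is visited only later, after backtracking to an ancestor of v or in a new DFS search. But v can reach u along C and no vertex of C is visited before v, so the argument of Lemma 1 makes u a descendant of v, a contradiction.
Hence (u, v) must be a backward edge.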
Strongly connected components in directed graphs
Recall: the connected components of an undirected graph form a partition of the vertices such that, within each subset, the vertices are mutually reachable.
Define a relation R by: (a, b) is in R if a can reach b by a path in the directed graph G and b can reach a by a path in G. Then R is an equivalence relation. The equivalence classes of R are called the strongly connected components (SCCs) of G.
SCC meta graph: suppose G has k SCCs. We can then define a new graph C = (U, E_meta) from G, where U = {SC1, SC2, SC3, …, SCk} and E_meta = {(SCi, SCj) : i ≠ j and there is an edge in G going from a vertex of SCi to a vertex of SCj}.
SCC meta graph
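Property: the graph C defined from G is acyclic. Otherwise, all the vertices in the SCCs along a cycle of C would be mutually reachable and so would belong to a single SCC, which violates the definition of SCCs.
Therefore, we can topologically sort the SCC meta graph.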
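To make the definitions concrete, here is a deliberately naive sketch that computes the SCCs straight from the relation R (mutual reachability) and then builds the meta graph. It runs one search per vertex, so it takes roughly O(|V| · (|V| + |E|)) time and is not a linear-time method. The function names and the dict-of-lists graph representation are assumptions made for this example.

```python
from collections import deque


def reachable_from(graph, s):
    """Return the set of vertices reachable from s (including s) by BFS."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def scc_meta_graph(graph):
    """Return (sccs, meta_edges) computed directly from the definition.

    graph: dict mapping each vertex to a list of its out-neighbors.
    sccs: list of components, each a list of vertices.
    meta_edges: set of pairs (i, j) of component indices with i != j.
    """
    reach = {v: reachable_from(graph, v) for v in graph}

    component = {}  # vertex -> index of its SCC in `sccs`
    sccs = []
    for v in graph:
        if v in component:
            continue
        # Equivalence class of v under R: v reaches w and w reaches v.
        members = [w for w in reach[v] if v in reach[w]]
        for w in members:
            component[w] = len(sccs)
        sccs.append(members)

    meta_edges = {
        (component[u], component[v])
        for u in graph
        for v in graph[u]
        if component[u] != component[v]
    }
    return sccs, meta_edges


g = {1: [2], 2: [1, 3], 3: [4], 4: [3], 5: []}
print(scc_meta_graph(g))
```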
Question: how can we find the SCC meta graph efficiently, say, in linear time?