A great many problems are naturally formulated in terms of objects and connections between them. For example, given an airline route map of the eastern U.S., we might be interested in questions like: “What’s the fastest way to get from Providence to Princeton?” Or we might be more interested in money than in time, and look for the cheapest way to get from Providence to Princeton. To answer such questions we need only information about interconnections (airline routes) between objects (towns).
Electric circuits are another obvious example where interconnections between objects play a central role. Circuit elements like transistors, resistors, and capacitors are intricately wired together. Such circuits can be represented and processed within a computer in order to answer simple questions like “Is everything connected together?” as well as complicated questions like “If this circuit is built, will it work?” In this case, the answer to the first question depends only on the properties of the interconnections (wires), while the answer to the second question requires detailed information about both the wires and the objects that they connect.
A third example is “job scheduling,” where the objects are tasks to be performed, say in a manufacturing process, and interconnections indicate which jobs should be done before others. Here we might be interested in answering questions like “When should each task be performed?”
A graph is a mathematical object which accurately models such situations.
In this chapter, we’ll examine some basic properties of graphs, and in the next several chapters we’ll study a variety of algorithms for answering questions of the type posed above.
Actually, we’ve already encountered graphs in several instances in previous chapters. Linked data structures are actually representations of graphs, and some of the algorithms that we’ll see for processing graphs are similar to algorithms that we’ve already seen for processing trees and other structures. For example, the finite-state machines of Chapters 19 and 20 are represented with graph structures.
Graph theory is a major branch of combinatorial mathematics and has been intensively studied for hundreds of years. Many important and useful properties of graphs have been proved, but many difficult problems have yet to be resolved. We’ll be able here only to scratch the surface of what is known about graphs, covering enough to be able to understand the fundamental algorithms.
As with so many of the problem domains that we’ve studied, graphs have only recently begun to be examined from an algorithmic point of view. Although some of the fundamental algorithms are quite old, many of the interesting ones have been discovered within the last ten years. Even trivial graph algorithms lead to interesting computer programs, and the nontrivial algorithms that we’ll examine are among the most elegant and interesting (though difficult to understand) algorithms known.
Glossary
A good deal of nomenclature is associated with graphs. Most of the terms have straightforward definitions, and it is convenient to put them in one place even though we won’t be using some of them until later.
A graph is a collection of vertices and edges. Vertices are simple objects which can have names and other properties; an edge is a connection between two vertices. One can draw a graph by marking points for the vertices and drawing lines connecting them for the edges, but it must be borne in mind that the graph is defined independently of the representation. For example, the following two drawings represent the same graph:
We define this graph by saying that it consists of the set of vertices A B C D E F G H I J K L M and the set of edges between these vertices AG AB AC LM JM JL JK ED FD HI FE AF GE.
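The definition above can be written down almost literally. The following is a minimal sketch in Python (not the Pascal used for the programs in this chapter); the names VERTICES, EDGES and same_graph are illustrative assumptions, not taken from the text. It shows that listing the edges in a different order, or with their endpoints swapped, still defines the same graph.

```python
# The sample graph as a set of vertices and a set of undirected edges.
VERTICES = set("ABCDEFGHIJKLM")
EDGE_LIST = "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()

# An undirected edge has no direction, so each one is stored as a frozenset.
EDGES = {frozenset(pair) for pair in EDGE_LIST}


def same_graph(vertices, edge_list):
    """True if the given vertices and edges define the sample graph."""
    return (set(vertices) == VERTICES
            and {frozenset(e) for e in edge_list} == EDGES)


# Same edges, endpoints swapped and order reversed: still the same graph.
shuffled = "GA BA CA ML MJ LJ KJ DE DF IH EF FA EG".split()[::-1]
print(len(VERTICES), len(EDGES), same_graph("MLKJIHGFEDCBA", shuffled))
```

The drawing of a graph plays no part in this description, which is the point made above: only the two sets matter.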
For some applications, such as the airline route example above, it might not make sense to rearrange the placement of the vertices as in the diagrams above. But for some other applications, such as the electric circuit application above, it is best to concentrate only on the edges and vertices, independent of any particular geometric placement. And for still other applications, such as the finite-state machines in Chapters 19 and 20, no particular geometric placement of nodes is ever implied. The relationship between graph algorithms and geometric problems is discussed in further detail in Chapter 31. For now, we’ll concentrate on “pure” graph algorithms which process simple collections of edges and nodes.
A path from vertex x to y in a graph is a list of vertices in which successive vertices are connected by edges in the graph. For example, BAFEG is a path from B to G in the graph above. A graph is connected if there is a path from every node to every other node in the graph. Intuitively, if the vertices were physical objects and the edges were strings connecting them, a connected graph would stay in one piece if picked up by any vertex. A graph which is not connected is made up of connected components; for example, the graph drawn above has three connected components. A simple path is a path in which no vertex is repeated. (For example, BAFEGAC is not a simple path.) A cycle is a path which is simple except that the first and last vertex are the same (a path from a point back to itself): the path AFEGA is a cycle.
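These definitions can be checked mechanically against the sample graph. The sketch below is in Python, and the function names is_path, is_simple_path and is_cycle are assumptions made for illustration. Paths are written as strings of one-letter vertex names, as in the examples above.

```python
# Edges of the sample graph, each stored as an unordered pair.
EDGES = {frozenset(e)
         for e in "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()}


def is_path(p):
    """Successive vertices of p are connected by edges of the graph."""
    return all(frozenset((u, v)) in EDGES for u, v in zip(p, p[1:]))


def is_simple_path(p):
    """A path in which no vertex is repeated."""
    return is_path(p) and len(set(p)) == len(p)


def is_cycle(p):
    """A path that is simple except that its first and last vertex coincide."""
    return (len(p) > 3 and p[0] == p[-1]
            and is_path(p) and is_simple_path(p[:-1]))


print(is_path("BAFEG"))           # True
print(is_simple_path("BAFEGAC"))  # False: A appears twice
print(is_cycle("AFEGA"))          # True
```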
A connected graph with no cycles is called a tree. There is only one path between any two nodes in a tree. (Note that binary trees and other types of trees that we’ve built with algorithms are special cases included in this general definition of trees.) A group of disconnected trees is called a forest. A spanning tree of a graph is a subgraph that contains all the vertices but only enough of the edges to form a tree. For example, below is a spanning tree for the large component of our sample graph.
Note that if we add any edge to a tree, it must form a cycle (because there is already a path between the two vertices that it connects). Also, it is easy to prove by induction that a tree on V vertices has exactly V - 1 edges. If a graph with V vertices has fewer than V - 1 edges, it can’t be connected. If it has more than V - 1 edges, it must have a cycle. (But if it has exactly V - 1 edges, it need not be a tree.)
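The last remark is easy to confirm on a small example. The following Python sketch counts connected components and detects a cycle using a union-find structure; that technique is swapped in here for brevity and is not the method of this chapter, and the function names are illustrative assumptions.

```python
def components_and_cycle(vertices, edges):
    """Return (number of connected components, whether some edge closes a cycle)."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's component.
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    count, has_cycle = len(parent), False
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            has_cycle = True   # u and v were already connected by a path
        else:
            parent[ru] = rv    # merge the two components
            count -= 1
    return count, has_cycle


def is_tree(vertices, edges):
    count, has_cycle = components_and_cycle(vertices, edges)
    return count == 1 and not has_cycle


# Four vertices, three edges (V - 1) in both cases.
print(is_tree("ABCD", ["AB", "BC", "CD"]))  # True
print(is_tree("ABCD", ["AB", "BC", "CA"]))  # False: a cycle, and D is isolated
```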
We’ll denote the number of vertices in a given graph by V, the number of edges by E. Note that E can range anywhere from 0 to V(V - 1)/2. Graphs with all edges present are called complete graphs; graphs with relatively few edges (say fewer than V log V) are called sparse; graphs with relatively few of the possible edges missing are called dense.
This fundamental dependence on two parameters makes the comparative study of graph algorithms somewhat more complicated than many algorithms that we’ve studied, because more possibilities arise. For example, one algorithm might take about V^2 steps, while another algorithm for the same problem might take (E + V) log E steps. The second algorithm would be better for sparse graphs, but the first would be preferred for dense graphs.
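To make the comparison concrete, here is a small Python calculation. It is only a sketch: the base-2 logarithm, the choice V = 1000, and "2V edges" as a stand-in for a sparse graph are all assumptions made for illustration, and constant factors are ignored.

```python
import math


def max_edges(v):
    """Largest possible number of edges in an undirected graph on v vertices."""
    return v * (v - 1) // 2


def steps_quadratic(v):
    """Step count of the hypothetical V^2 algorithm."""
    return v * v


def steps_edge_based(v, e):
    """Step count of the hypothetical (E + V) log E algorithm (log base 2 assumed)."""
    return (e + v) * math.log2(e)


V = 1000
sparse_e = 2 * V        # assumed sparse graph: a couple of edges per vertex
dense_e = max_edges(V)  # complete graph: every possible edge present

for label, e in (("sparse", sparse_e), ("dense", dense_e)):
    print(label, e, steps_quadratic(V), round(steps_edge_based(V, e)))
```

For the sparse graph the edge-based count is far below V^2, while for the complete graph it is several times larger, which matches the remark above.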
Graphs as defined to this point are called undirected graphs, the simplest type of graph. We’ll also be considering more complicated types of graphs, in which more information is associated with the nodes and edges. In weighted graphs integers (weights) are assigned to each edge to represent, say, distances or costs. In directed graphs, edges are “one-way”: an edge may go from x to y but not the other way. Directed weighted graphs are sometimes called networks. As we’ll discover, the extra information weighted and directed graphs contain makes them somewhat more difficult to manipulate than simple undirected graphs.
Representation
In order to process graphs with a computer program, we first need to decide how to represent them within the computer. We’ll look at two commonly used representations; the choice between them depends on whether the graph is dense or sparse.
The first step in representing a graph is to map the vertex names to integers between 1 and V. The main reason for doing this is to make it possible to quickly access information corresponding to each vertex, using array indexing. Any standard searching scheme can be used for this purpose: for instance, we can translate vertex names to integers between 1 and V by maintaining a hash table or a binary tree which can be searched to find the integer corresponding to any given vertex name. Since we have already studied these techniques, we’ll assume that we have available a function index to convert from vertex names to integers between 1 and V and a function name to convert from integers to vertex names. In order to make the algorithms easy to follow, our examples will use one-letter vertex names, with the ith letter of the alphabet corresponding to the integer i. Thus, though name and index are trivial to implement for our examples, their use makes it easy to extend the algorithms to handle graphs with real vertex names using techniques from Chapters 14-17.
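As a rough illustration, here are the two conversion functions for one-letter names, followed by a table-based version for arbitrary names. This is a Python sketch, not the book's code: the class NameTable is a hypothetical helper, and it relies on Python's built-in dictionary as the hash table.

```python
def index(name):
    """Map a one-letter vertex name to an integer between 1 and V ('A' -> 1)."""
    return ord(name) - ord("A") + 1


def name(i):
    """Inverse of index: map an integer between 1 and V back to a letter."""
    return chr(ord("A") + i - 1)


class NameTable:
    """Assigns the integer i to the ith distinct vertex name seen."""

    def __init__(self):
        self._index = {}       # vertex name -> integer (a hash table)
        self._names = [None]   # slot 0 unused so that numbering starts at 1

    def index(self, vertex_name):
        if vertex_name not in self._index:
            self._names.append(vertex_name)
            self._index[vertex_name] = len(self._names) - 1
        return self._index[vertex_name]

    def name(self, i):
        return self._names[i]


towns = NameTable()
print(index("G"), name(13))                                   # 7 M
print(towns.index("Providence"), towns.index("Princeton"))    # 1 2
```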
The most straightforward representation for graphs is the so-called adjacency matrix representation. A V-by-V array of boolean values is maintained, with a[x, y] set to true if there is an edge from vertex x to vertex y and false otherwise. The adjacency matrix for our example graph is given below.
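  A B C D E F G H I J K L M
A 1 1 1 0 0 1 1 0 0 0 0 0 0
B 1 1 0 0 0 0 0 0 0 0 0 0 0
C 1 0 1 0 0 0 0 0 0 0 0 0 0
D 0 0 0 1 1 1 0 0 0 0 0 0 0
E 0 0 0 1 1 1 1 0 0 0 0 0 0
F 1 0 0 1 1 1 0 0 0 0 0 0 0
G 1 0 0 0 1 0 1 0 0 0 0 0 0
H 0 0 0 0 0 0 0 1 1 0 0 0 0
I 0 0 0 0 0 0 0 1 1 0 0 0 0
J 0 0 0 0 0 0 0 0 0 1 1 1 1
K 0 0 0 0 0 0 0 0 0 1 1 0 0
L 0 0 0 0 0 0 0 0 0 1 0 1 1
M 0 0 0 0 0 0 0 0 0 1 0 1 1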
Notice that each edge is really represented by two bits: an edge connecting x and y is represented by true values in both a[x, y] and a[y, x]. While it is possible to save space by storing only half of this symmetric matrix, it is inconvenient to do so in Pascal and the algorithms are somewhat simpler with the full matrix. Also, it’s sometimes convenient to assume that there’s an “edge” from each vertex to itself, so a[x, x] is set to true for x from 1 to V.
A graph is defined by a set of nodes and a set of edges connecting them. To read in a graph, we need to settle on a format for reading in these sets. The obvious format to use is first to read in the vertex names and then read in pairs of vertex names (which define edges). As mentioned above, one easy way to proceed is to read the vertex names into a hash table or binary search tree and to assign to each vertex name an integer for use in accessing vertex-indexed arrays like the adjacency matrix. The ith vertex read can be assigned the integer i. (Also, as mentioned above, we’ll assume for simplicity in our examples that the vertices are the first V letters of the alphabet, so that we can read in graphs by reading V and E, then E pairs of letters from the first V letters of the alphabet.) Of course, the order in which the edges appear is not relevant. All orderings of the edges represent the same graph and result in the same adjacency matrix, as computed by the following program:
program adjmatrix(input, output);
const maxV=50;
var j, x, y, V, E: integer;
    a: array[1..maxV, 1..maxV] of boolean;
begin
readln(V, E);
{ start with no edges, then add the "edge" from each vertex to itself }
for x:=1 to V do
  for y:=1 to V do a[x, y]:=false;
for x:=1 to V do a[x, x]:=true;
{ each input edge sets two symmetric entries }
for j:=1 to E do
  begin
  readln(v1, v2);
  x:=index(v1); y:=index(v2);
  a[x, y]:=true; a[y, x]:=true
  end;
end.
The types of v1 and v2 are omitted from this program, as well as the code for index. These can be added in a straightforward manner, depending on the graph input representation desired. (For our examples, v1 and v2 could be of type char and index a simple function which uses the Pascal ord function.)
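For comparison, here is a sketch of the same construction in Python, with the omitted pieces filled in under the one-letter naming assumption. It takes the edges as a list rather than reading them from input; the function names adjacency_matrix and row_bits are illustrative, not from the text.

```python
def index(name):
    """One-letter vertex name to an integer between 1 and V ('A' -> 1)."""
    return ord(name) - ord("A") + 1


def adjacency_matrix(v_count, edges):
    # Row and column 0 are unused so that indices run from 1 to V as in the text.
    a = [[False] * (v_count + 1) for _ in range(v_count + 1)]
    for x in range(1, v_count + 1):
        a[x][x] = True            # the conventional "edge" from x to itself
    for v1, v2 in edges:
        x, y = index(v1), index(v2)
        a[x][y] = True            # each edge sets two symmetric entries
        a[y][x] = True
    return a


def row_bits(a, vertex):
    """Row of the matrix for one vertex, written as a string of 0s and 1s."""
    x = index(vertex)
    return "".join("1" if bit else "0" for bit in a[x][1:])


edges = "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()
a = adjacency_matrix(13, edges)
print(row_bits(a, "A"))  # 1110011000000
print(row_bits(a, "E"))  # 0001111000000

# Any ordering of the edges gives the same matrix.
print(adjacency_matrix(13, edges[::-1]) == a)  # True
```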
The adjacency matrix representation is satisfactory only if the graphs to be processed are dense: the matrix requires V^2 bits of storage and V^2 steps just to initialize it. If the number of edges (the number of one bits in the matrix) is proportional to V^2, then this may be no problem because about V^2 steps are required to read in the edges in any case, but if the graph is sparse, just initializing this matrix could be the dominant factor in the running time of an algorithm. Also this might be the best representation for some algorithms which require more than V^2 steps for execution. Next we’ll look at a representation which is more suitable for graphs which are not dense.
In the adjacency structure representation all the vertices connected to each vertex are listed on an adjacency list for that vertex. This can be easily accomplished with linked lists, as shown in the program below which builds the adjacency structure for our sample graph.
program adjlist(input, output);
const maxV=1000;
type link=^node;
     node=record v: integer; next: link end;
var j, x, y, V, E: integer;
    t, z: link;
    adj: array[1..maxV] of link;
begin
readln(V, E);
{ z is the artificial node that ends every list }
new(z); z^.next:=z;
for j:=1 to V do adj[j]:=z;
for j:=1 to E do
  begin
  readln(v1, v2);
  x:=index(v1); y:=index(v2);
  { insert x at the front of y's list and y at the front of x's list }
  new(t); t^.v:=x; t^.next:=adj[y]; adj[y]:=t;
  new(t); t^.v:=y; t^.next:=adj[x]; adj[x]:=t;
  end;
end.
(As usual, each linked list ends with a link to an artificial node z, which links to itself.) For this representation, the order in which the edges appear in the input is quite relevant: it (along with the list insertion method used) determines the order in which the vertices appear on the adjacency lists. Thus, the same graph can be represented in many different ways in an adjacency list structure. Indeed, it is difficult to predict what the adjacency lists will look like by examining just the sequence of edges, because each edge involves insertions into two adjacency lists.
The order in which edges appear on the adjacency list affects, in turn, the order in which edges are processed by algorithms. That is, the adjacency list structure determines the way that various algorithms that we’ll be examining “see” the graph. While an algorithm should produce a correct answer no matter what the order of the edges on the adjacency lists, it might get to that answer by quite different sequences of computations for different orders. And if there is more than one “correct answer,” different input orders might lead to different output results.
If the edges appear in the order listed after the first drawing of our sample graph at the beginning of the chapter, the program above builds the following adjacency list structure:
A: F C B G
B: A
C: A
D: F E
E: G F D
F: A E D
G: E A
H: I
I: H
J: K L M
K: J
L: J M
M: J L
Note that again each edge is represented twice: an edge connecting x and y is represented as a node containing x on y’s adjacency list and as a node containing y on x’s adjacency list. It is important to include both, since otherwise simple questions like “Which nodes are connected directly to node x?” could not be answered efficiently.
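The dependence on input order can be reproduced with a short Python sketch. It uses Python lists with insertion at the front in place of the linked lists and the artificial node z, so it mirrors only the insertion order of the program, not its storage layout; the function name adjacency_lists is an assumption.

```python
def adjacency_lists(vertex_names, edges):
    adj = {v: [] for v in vertex_names}
    for x, y in edges:
        adj[y].insert(0, x)   # new node goes at the front of y's list
        adj[x].insert(0, y)   # and at the front of x's list
    return adj


names = "ABCDEFGHIJKLM"
edges = "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()

adj = adjacency_lists(names, edges)
for v in "AEJ":
    print(v + ":", " ".join(adj[v]))
# A: F C B G
# E: G F D
# J: K L M

# The same edges in reverse order give the same graph but different lists.
rev = adjacency_lists(names, edges[::-1])
print("A:", " ".join(rev["A"]))  # A: G B C F
```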
Some simple operations are not supported by this representation. For example, one might want to delete a vertex, x, and all the edges connected to it. It’s not sufficient to delete nodes from the adjacency list: each node on the adjacency list specifies another vertex whose adjacency list must be searched for a node corresponding to x to be deleted. This problem can be corrected by linking together the two list nodes which correspond to a particular edge and making the adjacency lists doubly linked. Then if an edge is to be removed, both list nodes corresponding to that edge can be deleted quickly. Of course, all these extra links are quite cumbersome to process, and they certainly shouldn’t be included unless operations like deletion are needed.
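The cost of the straightforward approach shows up clearly in code. The Python sketch below deletes a vertex from the simple (singly linked) structure by searching each neighbor's list; it does not implement the doubly linked variant described above, and the function names are illustrative assumptions.

```python
def adjacency_lists(vertex_names, edges):
    adj = {v: [] for v in vertex_names}
    for x, y in edges:
        adj[y].insert(0, x)
        adj[x].insert(0, y)
    return adj


def delete_vertex(adj, x):
    """Remove x and every edge connected to it."""
    for y in adj[x]:
        # Each neighbor's list has to be searched for its entry for x.
        adj[y].remove(x)
    del adj[x]


adj = adjacency_lists("ABCDEFGHIJKLM",
                      "AG AB AC LM JM JL JK ED FD HI FE AF GE".split())
delete_vertex(adj, "A")
print(adj["F"], adj["G"], adj["B"])  # ['E', 'D'] ['E'] []
```

Each call to remove scans a whole list, which is the searching that the extra links are meant to avoid.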
Such considerations also make it plain why we don’t use a “direct” representation for graphs: a data structure which exactly models the graph, with vertices represented as allocated records and edge lists containing links to vertices instead of vertex names. How would one add an edge to a graph represented in this way?
Directed and weighted graphs are represented with similar structures. For directed graphs, everything is the same, except that each edge is represented just once: an edge from x to y is represented by a true value in a[x, y] in the adjacency matrix or by the appearance of y on x’s adjacency list in the adjacency structure. Thus an undirected graph might be thought of as a directed graph with directed edges going both ways between each pair of vertices connected by an edge. For weighted graphs, everything again is the same except that we fill the adjacency matrix with weights instead of boolean values (using some non-existent weight to represent false), or we include a field for the edge weight in adjacency list records in the adjacency structure.
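A small Python sketch of both variants for a directed weighted graph follows. The three edges and their weights are made up for illustration, and using None as the "non-existent weight" is an assumption of this sketch.

```python
NO_EDGE = None  # assumed stand-in for the non-existent weight meaning "false"


def weighted_directed_matrix(v_count, edges):
    # Indices run from 1 to V; row and column 0 are unused.
    a = [[NO_EDGE] * (v_count + 1) for _ in range(v_count + 1)]
    for x, y, w in edges:
        a[x][y] = w            # one entry only: the edge goes from x to y
    return a


def weighted_directed_lists(v_count, edges):
    adj = [[] for _ in range(v_count + 1)]
    for x, y, w in edges:
        adj[x].insert(0, (y, w))   # y appears on x's list, with the weight
    return adj


# Hypothetical edges (from, to, weight) on vertices 1..3.
edges = [(1, 2, 5), (2, 3, 1), (1, 3, 7)]
a = weighted_directed_matrix(3, edges)
adj = weighted_directed_lists(3, edges)
print(a[1][2], a[2][1])  # 5 None
print(adj[1])            # [(3, 7), (2, 5)]
```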
It is often necessary to associate other information with the vertices or nodes of a graph to allow it to model more complicated objects or to save bookkeeping information in complicated algorithms. Extra information associated with each vertex can be accommodated by using auxiliary arrays indexed by vertex number (or by making adj an array of records in the adjacency structure representation). Extra information associated with each edge can be put in the adjacency list nodes (or in an array a of records in the adjacency matrix representation), or in auxiliary arrays indexed by edge number (this requires numbering the edges).
Depth-First Search
At the beginning of this chapter, we saw several natural questions that arise immediately when processing a graph. Is the graph connected? If not, what are its connected components? Does the graph have a cycle? These and many other problems can be easily solved with a technique called depth-first search, which is a natural way to “visit” every node and check every edge in the graph systematically. We’ll see in the chapters that follow that simple variations on a generalization of this method can be used to solve a variety of graph problems.
For now, we’ll concentrate on the mechanics of examining every piece of the graph in an organized way. Below is an implementation of depth-first search which fills in an array val[1..V] as it visits every vertex of the graph. The array is initially set to all zeros, so val[k]=0 indicates that vertex k has not yet been visited. The goal is to systematically visit all the vertices of the graph, setting the val entry for the nowth vertex visited to now, for now = 1, 2, ..., V. The program uses a recursive procedure visit which visits all the vertices in the same connected component as the vertex given in the argument. To visit a vertex, we check all its edges to see if they lead to vertices which haven’t yet been visited (as indicated by 0 val entries); if so, we visit them:
procedure dfs;
var now, k: integer;
    val: array[1..maxV] of integer;
procedure visit(k: integer);
  var t: link;
  begin
  { number k in order of visit, then follow each edge out of k }
  now:=now+1; val[k]:=now;
  t:=adj[k];
  while t<>z do
    begin
    if val[t^.v]=0 then visit(t^.v);
    t:=t^.next
    end
  end;
begin
now:=0;
for k:=1 to V do val[k]:=0;
{ each unvisited vertex found here starts a new connected component }
for k:=1 to V do
  if val[k]=0 then visit(k)
end;
First visit is called for the first vertex, which results in nonzero val values being set for all the vertices connected to that vertex. Then dfs scans through the val array to find a zero entry (corresponding to a vertex that hasn’t been seen yet) and calls visit for that vertex, continuing in this way until all vertices have been visited.
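The procedure can be traced on the sample graph with a Python sketch of the same recursion. It assumes the adjacency lists are built by front insertion from the edge order given at the beginning of the chapter. Counting the calls made from the outer loop, which gives the number of connected components, is an addition for illustration and not part of the procedure above.

```python
def adjacency_lists(vertex_names, edges):
    adj = {v: [] for v in vertex_names}
    for x, y in edges:
        adj[y].insert(0, x)
        adj[x].insert(0, y)
    return adj


def dfs(vertex_names, adj):
    val = {k: 0 for k in vertex_names}   # 0 means "not yet visited"
    now = 0

    def visit(k):
        nonlocal now
        now += 1
        val[k] = now
        for t in adj[k]:
            if val[t] == 0:
                visit(t)

    components = 0
    for k in vertex_names:
        if val[k] == 0:
            components += 1   # a fresh start means a new connected component
            visit(k)
    return val, components


names = "ABCDEFGHIJKLM"
adj = adjacency_lists(names, "AG AB AC LM JM JL JK ED FD HI FE AF GE".split())
val, components = dfs(names, adj)
print("".join(sorted(names, key=val.get)))  # AFEGDCBHIJKLM
print(components)                           # 3
```

The first line printed lists the vertices in the order visited, so H and I receive the val values 8 and 9.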
The best way to follow the operation of depth-first search is to redraw the graph as indicated by the recursive calls during the visit procedure. This gives the following structure.
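[Figure: the sample graph redrawn according to the recursive calls of visit, with each vertex labeled by its val value]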
Vertices in this structure are numbered with their val values: the vertices are