Algorithms (Part 39)


A great many problems are naturally formulated in terms of objects and connections between them. For example, given an airline route map of the eastern U.S., we might be interested in questions like: “What’s the fastest way to get from Providence to Princeton?” Or we might be more interested in money than in time, and look for the cheapest way to get from Providence to Princeton. To answer such questions we need only information about interconnections (airline routes) between objects (towns).

Electric circuits are another obvious example where interconnections between objects play a central role. Circuit elements like transistors, resistors, and capacitors are intricately wired together. Such circuits can be represented and processed within a computer in order to answer simple questions like “Is everything connected together?” as well as complicated questions like “If this circuit is built, will it work?” In this case, the answer to the first question depends only on the properties of the interconnections (wires), while the answer to the second question requires detailed information about both the wires and the objects that they connect.

A third example is “job scheduling,” where the objects are tasks to be performed, say in a manufacturing process, and interconnections indicate which jobs should be done before others. Here we might be interested in answering questions like “When should each task be performed?”

A graph is a mathematical object which accurately models such situations.

In this chapter, we’ll examine some basic properties of graphs, and in the next several chapters we’ll study a variety of algorithms for answering questions of the type posed above.

Actually, we’ve already encountered graphs in several instances in previous chapters. Linked data structures are actually representations of graphs, and some of the algorithms that we’ll see for processing graphs are similar to algorithms that we’ve already seen for processing trees and other structures. For example, the finite-state machines of Chapters 19 and 20 are represented with graph structures.

Graph theory is a major branch of combinatorial mathematics and has been intensively studied for hundreds of years. Many important and useful properties of graphs have been proved, but many difficult problems have yet to be resolved. We’ll be able here only to scratch the surface of what is known about graphs, covering enough to be able to understand the fundamental algorithms.

As with so many of the problem domains that we’ve studied, graphs have only recently begun to be examined from an algorithmic point of view. Although some of the fundamental algorithms are quite old, many of the interesting ones have been discovered within the last ten years. Even trivial graph algorithms lead to interesting computer programs, and the nontrivial algorithms that we’ll examine are among the most elegant and interesting (though difficult to understand) algorithms known.

Glossary

A good deal of nomenclature is associated with graphs. Most of the terms have straightforward definitions, and it is convenient to put them in one place even though we won’t be using some of them until later.

A graph is a collection of vertices and edges. Vertices are simple objects which can have names and other properties; an edge is a connection between two vertices. One can draw a graph by marking points for the vertices and drawing lines connecting them for the edges, but it must be borne in mind that the graph is defined independently of the representation. For example, the following two drawings represent the same graph:

We define this graph by saying that it consists of the set of vertices A B C D E F G H I J K L M and the set of edges between these vertices AG AB AC LM JM JL JK ED FD HI FE AF GE.


For some applications, such as the airline route example above, it might not make sense to rearrange the placement of the vertices as in the diagrams above. But for some other applications, such as the electric circuit application above, it is best to concentrate only on the edges and vertices, independent of any particular geometric placement. And for still other applications, such as the finite-state machines in Chapters 19 and 20, no particular geometric placement of nodes is ever implied. The relationship between graph algorithms and geometric problems is discussed in further detail in Chapter 31. For now, we’ll concentrate on “pure” graph algorithms which process simple collections of edges and nodes.

A path from vertex x to y in a graph is a list of vertices in which successive vertices are connected by edges in the graph. For example, BAFEG is a path from B to G in the graph above. A graph is connected if there is a path from every node to every other node in the graph. Intuitively, if the vertices were physical objects and the edges were strings connecting them, a connected graph would stay in one piece if picked up by any vertex. A graph which is not connected is made up of connected components; for example, the graph drawn above has three connected components. A simple path is a path in which no vertex is repeated. (For example, BAFEGAC is not a simple path.) A cycle is a path which is simple except that the first and last vertex are the same (a path from a point back to itself): the path AFEGA is a cycle.
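The definitions of path, simple path, and cycle can be checked mechanically against the sample edge list. The following is a minimal sketch in Python rather than the Pascal used in this book; the function names is_path, is_simple_path, and is_cycle are illustrative assumptions, not part of the text.

```python
# Edge list of the sample graph, as listed in the text.
EDGES = "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()

# Each undirected edge is stored as an unordered pair of vertex names.
EDGE_SET = {frozenset(e) for e in EDGES}


def is_path(vertices):
    """True if every pair of successive vertices is joined by an edge."""
    return all(frozenset((a, b)) in EDGE_SET
               for a, b in zip(vertices, vertices[1:]))


def is_simple_path(vertices):
    """True if the list is a path in which no vertex is repeated."""
    return is_path(vertices) and len(set(vertices)) == len(vertices)


def is_cycle(vertices):
    """True if the list is a path that is simple except that the first
    and last vertex are the same. At least three distinct vertices are
    required, so going out and back along one edge does not count."""
    return (len(vertices) >= 4
            and vertices[0] == vertices[-1]
            and is_path(vertices)
            and is_simple_path(vertices[:-1]))
```

With these definitions, BAFEG is a simple path, BAFEGAC is a path but not a simple one (A is repeated), and AFEGA is a cycle.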

A connected graph with no cycles is called a tree. There is only one path between any two nodes in a tree. (Note that binary trees and other types of trees that we’ve built with algorithms are special cases included in this general definition of trees.) A group of disconnected trees is called a forest. A spanning tree of a graph is a subgraph that contains all the vertices but only enough of the edges to form a tree. For example, below is a spanning tree for the large component of our sample graph.

Note that if we add any edge to a tree, it must form a cycle (because there is already a path between the two vertices that it connects). Also, it is easy to prove by induction that a tree on V vertices has exactly V - 1 edges. If a graph with V vertices has fewer than V - 1 edges, it can’t be connected. If it has more than V - 1 edges, it must have a cycle. (But if it has exactly V - 1 edges, it need not be a tree.)

We’ll denote the number of vertices in a given graph by V, the number of edges by E. Note that E can range anywhere from 0 to V(V - 1)/2. Graphs with all edges present are called complete graphs; graphs with relatively few edges (say less than V log V) are called sparse; graphs with relatively few of the possible edges missing are called dense.

This fundamental dependence on two parameters makes the comparative study of graph algorithms somewhat more complicated than many algorithms that we’ve studied, because more possibilities arise. For example, one algorithm might take about V^2 steps, while another algorithm for the same problem might take (E + V) log E steps. The second algorithm would be better for sparse graphs, but the first would be preferred for dense graphs.
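To make the comparison concrete, the two step counts can be evaluated for a sparse and a complete graph on the same number of vertices. This is only a rough sketch in Python (not the book's Pascal): the function names are hypothetical, the logarithm is assumed to be base 2, and constant factors are ignored.

```python
import math


def quadratic_steps(V, E):
    """Step count of the hypothetical V^2 algorithm (ignores E)."""
    return V * V


def edge_driven_steps(V, E):
    """Step count of the hypothetical (E + V) log E algorithm.
    The base-2 logarithm is an assumption; the text leaves the base open."""
    return (E + V) * math.log2(E)


V = 1000
cases = (
    ("sparse, E = 2V", 2 * V),
    ("complete, E = V(V-1)/2", V * (V - 1) // 2),
)
for label, E in cases:
    print(label, quadratic_steps(V, E), round(edge_driven_steps(V, E)))
```

For the sparse case the second count is far below V^2; for the complete graph it is well above, which is the trade-off described in the paragraph above.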

Graphs as defined to this point are called undirected graphs, the simplest type of graph. We’ll also be considering more complicated types of graphs, in which more information is associated with the nodes and edges. In weighted graphs integers (weights) are assigned to each edge to represent, say, distances or costs. In directed graphs, edges are “one-way”: an edge may go from x to y but not the other way. Directed weighted graphs are sometimes called networks. As we’ll discover, the extra information weighted and directed graphs contain makes them somewhat more difficult to manipulate than simple undirected graphs.

Representation

In order to process graphs with a computer program, we first need to decide how to represent them within the computer. We’ll look at two commonly used representations; the choice between them depends on whether the graph is dense or sparse.

The first step in representing a graph is to map the vertex names to integers between 1 and V. The main reason for doing this is to make it possible to quickly access information corresponding to each vertex, using array indexing. Any standard searching scheme can be used for this purpose: for instance, we can translate vertex names to integers between 1 and V by maintaining a hash table or a binary tree which can be searched to find the integer corresponding to any given vertex name. Since we have already studied these techniques, we’ll assume that we have available a function index to convert from vertex names to integers between 1 and V and a function name to convert from integers to vertex names. In order to make the algorithms easy to follow, our examples will use one-letter vertex names, with the ith letter of the alphabet corresponding to the integer i. Thus, though name and index are trivial to implement for our examples, their use makes it easy to extend the algorithms to handle graphs with real vertex names using techniques from Chapters 14-17.
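As an illustration of the two conversion functions, here is a minimal sketch in Python rather than the book's Pascal. The one-letter versions follow the convention described above; the NameTable class is a hypothetical stand-in for the hash-table approach, using a Python dictionary and numbering names in the order they are first seen.

```python
def index(v):
    """Map a one-letter vertex name to an integer: 'A' -> 1, 'B' -> 2, ..."""
    return ord(v) - ord("A") + 1


def name(i):
    """Inverse of index: 1 -> 'A', 2 -> 'B', ..."""
    return chr(ord("A") + i - 1)


class NameTable:
    """Hypothetical symbol table for arbitrary vertex names.
    The ith distinct name seen is assigned the integer i."""

    def __init__(self):
        self._index = {}       # vertex name -> integer
        self._names = [None]   # integer -> vertex name; slot 0 is unused

    def index(self, v):
        if v not in self._index:
            self._names.append(v)
            self._index[v] = len(self._names) - 1
        return self._index[v]

    def name(self, i):
        return self._names[i]
```

Either version gives the algorithms the same interface, so code written against index and name does not need to know how vertex names are stored.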

The most straightforward representation for graphs is the so-called adjacency matrix representation. A V-by-V array of boolean values is maintained, with a[x, y] set to true if there is an edge from vertex x to vertex y and false otherwise. The adjacency matrix for our example graph is given below.

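  A B C D E F G H I J K L M
A 1 1 1 0 0 1 1 0 0 0 0 0 0
B 1 1 0 0 0 0 0 0 0 0 0 0 0
C 1 0 1 0 0 0 0 0 0 0 0 0 0
D 0 0 0 1 1 1 0 0 0 0 0 0 0
E 0 0 0 1 1 1 1 0 0 0 0 0 0
F 1 0 0 1 1 1 0 0 0 0 0 0 0
G 1 0 0 0 1 0 1 0 0 0 0 0 0
H 0 0 0 0 0 0 0 1 1 0 0 0 0
I 0 0 0 0 0 0 0 1 1 0 0 0 0
J 0 0 0 0 0 0 0 0 0 1 1 1 1
K 0 0 0 0 0 0 0 0 0 1 1 0 0
L 0 0 0 0 0 0 0 0 0 1 0 1 1
M 0 0 0 0 0 0 0 0 0 1 0 1 1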

Notice that each edge is really represented by two bits: an edge connecting x and y is represented by true values in both a[x, y] and a[y, x]. While it is possible to save space by storing only half of this symmetric matrix, it is inconvenient to do so in Pascal and the algorithms are somewhat simpler with the full matrix. Also, it’s sometimes convenient to assume that there’s an “edge” from each vertex to itself, so a[x, x] is set to 1 for x from 1 to V.

A graph is defined by a set of nodes and a set of edges connecting them. To read in a graph, we need to settle on a format for reading in these sets. The obvious format to use is first to read in the vertex names and then read in pairs of vertex names (which define edges). As mentioned above, one easy way to proceed is to read the vertex names into a hash table or binary search tree and to assign to each vertex name an integer for use in accessing vertex-indexed arrays like the adjacency matrix. The ith vertex read can be assigned the integer i. (Also, as mentioned above, we’ll assume for simplicity in our examples that the vertices are the first V letters of the alphabet, so that we can read in graphs by reading V and E, then E pairs of letters from the first V letters of the alphabet.) Of course, the order in which the edges appear is not relevant. All orderings of the edges represent the same graph and result in the same adjacency matrix, as computed by the following program:

program adjmatrix(input, output);
const maxV=50;
var j, x, y, V, E: integer;
    a: array[1..maxV, 1..maxV] of boolean;
begin
readln(V, E);
for x:=1 to V do
  for y:=1 to V do a[x, y]:=false;
for x:=1 to V do a[x, x]:=true;
for j:=1 to E do
  begin
  readln(v1, v2);
  x:=index(v1); y:=index(v2);
  a[x, y]:=true; a[y, x]:=true
  end;
end.

The types of v1 and v2 are omitted from this program, as well as the code for index. These can be added in a straightforward manner, depending on the graph input representation desired. (For our examples, v1 and v2 could be of type char and index a simple function which uses the Pascal ord function.)
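For readers who want to experiment, the same construction can be sketched in Python rather than Pascal. This is an illustrative version under stated assumptions: the edges are passed in as a list of two-letter strings instead of being read from input, index is the one-letter mapping described earlier, and the helper row is a hypothetical convenience for printing.

```python
def index(v):
    """One-letter vertex name to integer: 'A' -> 1, 'B' -> 2, ..."""
    return ord(v) - ord("A") + 1


def adjmatrix(V, edges):
    """Build a (V+1)-by-(V+1) boolean matrix; row and column 0 are unused
    so that vertices can be numbered from 1 as in the text."""
    a = [[False] * (V + 1) for _ in range(V + 1)]
    for x in range(1, V + 1):
        a[x][x] = True            # the "edge" from each vertex to itself
    for v1, v2 in edges:
        x, y = index(v1), index(v2)
        a[x][y] = True            # each undirected edge sets two entries
        a[y][x] = True
    return a


def row(a, v):
    """Row of the matrix for vertex v as a string of 0s and 1s."""
    return "".join("1" if bit else "0" for bit in a[index(v)][1:])


EDGES = "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()
a = adjmatrix(13, EDGES)
```

Building the matrix from the edges in any other order, for example reversed(EDGES), produces an identical matrix, as the text claims.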

The adjacency matrix representation is satisfactory only if the graphs to be processed are dense: the matrix requires V^2 bits of storage and V^2 steps just to initialize it. If the number of edges (the number of one bits in the matrix) is proportional to V^2, then this may be no problem because about V^2 steps are required to read in the edges in any case, but if the graph is sparse, just initializing this matrix could be the dominant factor in the running time of an algorithm. Also this might be the best representation for some algorithms which require more than V^2 steps for execution. Next we’ll look at a representation which is more suitable for graphs which are not dense.

In the adjacency structure representation all the vertices connected to each vertex are listed on an adjacency list for that vertex. This can be easily accomplished with linked lists, as shown in the program below which builds the adjacency structure for our sample graph.


program adjlist(input, output);
const maxV=1000;
type link=^node;
     node=record v: integer; next: link end;
var j, x, y, V, E: integer;
    t, z: link;
    adj: array[1..maxV] of link;
begin
readln(V, E);
new(z); z^.next:=z;
for j:=1 to V do adj[j]:=z;
for j:=1 to E do
  begin
  readln(v1, v2);
  x:=index(v1); y:=index(v2);
  new(t); t^.v:=x; t^.next:=adj[y]; adj[y]:=t;
  new(t); t^.v:=y; t^.next:=adj[x]; adj[x]:=t;
  end;
end.

(As usual, each linked list ends with a link to an artificial node z, which links to itself.) For this representation, the order in which the edges appear in the input is quite relevant: it (along with the list insertion method used) determines the order in which the vertices appear on the adjacency lists. Thus, the same graph can be represented in many different ways in an adjacency list structure. Indeed, it is difficult to predict what the adjacency lists will look like by examining just the sequence of edges, because each edge involves insertions into two adjacency lists.

The order in which edges appear on the adjacency list affects, in turn, the order in which edges are processed by algorithms. That is, the adjacency list structure determines the way that various algorithms that we’ll be examining “see” the graph. While an algorithm should produce a correct answer no matter what the order of the edges on the adjacency lists, it might get to that answer by quite different sequences of computations for different orders. And if there is more than one “correct answer,” different input orders might lead to different output results.

If the edges appear in the order listed after the first drawing of our sample graph at the beginning of the chapter, the program above builds the following adjacency list structure:


A: F C B G
B: A
C: A
D: F E
E: G F D
F: A E D
G: E A
H: I
I: H
J: K L M
K: J
L: J M
M: J L

Note that again each edge is represented twice: an edge connecting x and y is represented as a node containing x on y’s adjacency list and as a node containing y on x’s adjacency list. It is important to include both, since otherwise simple questions like “Which nodes are connected directly to node x?” could not be answered efficiently.
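The effect of insertion order on the adjacency lists can be reproduced with a short sketch, written here in Python rather than Pascal. As an assumption for brevity, Python lists stand in for the linked lists, with insertion at position 0 playing the role of linking a new node at the head of the list; index and name are the one-letter mappings described earlier.

```python
def index(v):
    """One-letter vertex name to integer: 'A' -> 1, 'B' -> 2, ..."""
    return ord(v) - ord("A") + 1


def name(i):
    """Integer back to one-letter vertex name."""
    return chr(ord("A") + i - 1)


def adjlist(V, edges):
    """Build adjacency lists; adj[0] is unused so vertices start at 1."""
    adj = [[] for _ in range(V + 1)]
    for v1, v2 in edges:
        x, y = index(v1), index(v2)
        adj[y].insert(0, x)   # x goes at the front of y's list
        adj[x].insert(0, y)   # y goes at the front of x's list
    return adj


def neighbors(adj, v):
    """Adjacency list of vertex v as a string of vertex names."""
    return "".join(name(w) for w in adj[index(v)])


EDGES = "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()
adj = adjlist(13, EDGES)
```

With the edges in the order listed, neighbors(adj, "A") gives FCBG, matching the structure above. Feeding the same edges in reverse order yields the same sets of neighbors but in a different order on the lists.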

Some simple operations are not supported by this representation. For example, one might want to delete a vertex, x, and all the edges connected to it. It’s not sufficient to delete nodes from the adjacency list: each node on the adjacency list specifies another vertex whose adjacency list must be searched for a node corresponding to x to be deleted. This problem can be corrected by linking together the two list nodes which correspond to a particular edge and making the adjacency lists doubly linked. Then if an edge is to be removed, both list nodes corresponding to that edge can be deleted quickly. Of course, all these extra links are quite cumbersome to process, and they certainly shouldn’t be included unless operations like deletion are needed.
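The cost of deletion in the plain adjacency structure can be seen in a small sketch. This Python version (not the book's Pascal) shows only the simple approach, in which each neighbor's list is searched for x; it does not implement the doubly linked, cross-linked variant mentioned above. The dictionary layout and the name delete_vertex are illustrative assumptions.

```python
# Adjacency lists for the large component of the sample graph,
# keyed by vertex name for readability.
adj = {
    "A": ["F", "C", "B", "G"],
    "B": ["A"],
    "C": ["A"],
    "D": ["F", "E"],
    "E": ["G", "F", "D"],
    "F": ["A", "E", "D"],
    "G": ["E", "A"],
}


def delete_vertex(adj, x):
    """Remove vertex x and every edge connected to it.
    Emptying x's own list is not enough: every neighbor's list
    must also be searched for the entry that refers back to x."""
    for y in adj[x]:
        adj[y] = [w for w in adj[y] if w != x]
    del adj[x]


delete_vertex(adj, "A")
```

After the call, B and C are left with empty lists and F's list is reduced to E and D. The work done is proportional to the total length of the neighbors' lists, which is what the extra links described above are meant to avoid.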

Such considerations also make it plain why we don’t use a “direct” representation for graphs: a data structure which exactly models the graph, with vertices represented as allocated records and edge lists containing links to vertices instead of vertex names. How would one add an edge to a graph represented in this way?

Directed and weighted graphs are represented with similar structures. For directed graphs, everything is the same, except that each edge is represented just once: an edge from x to y is represented by a true value in a[x, y] in the adjacency matrix or by the appearance of y on x’s adjacency list in the adjacency structure. Thus an undirected graph might be thought of as a directed graph with directed edges going both ways between each pair of vertices connected by an edge. For weighted graphs, everything again is the same except that we fill the adjacency matrix with weights instead of boolean values (using some non-existent weight to represent false), or we include a field for the edge weight in adjacency list records in the adjacency structure.
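A minimal sketch of these variations, in Python rather than Pascal, may help. It makes several assumptions that are not in the text: vertices are already numbered 1 to V, edges arrive as (x, y, weight) triples, floating-point infinity serves as the “non-existent weight,” and the function names and the directed flag are hypothetical.

```python
NO_EDGE = float("inf")   # assumed sentinel for "no edge"


def weighted_matrix(V, edges, directed=False):
    """Adjacency matrix holding weights; row and column 0 are unused."""
    a = [[NO_EDGE] * (V + 1) for _ in range(V + 1)]
    for x, y, w in edges:
        a[x][y] = w
        if not directed:
            a[y][x] = w       # undirected: the edge goes both ways
    return a


def weighted_adjlist(V, edges, directed=False):
    """Adjacency lists whose entries are (vertex, weight) pairs."""
    adj = [[] for _ in range(V + 1)]
    for x, y, w in edges:
        adj[x].insert(0, (y, w))        # y appears on x's list
        if not directed:
            adj[y].insert(0, (x, w))    # and x on y's list
    return adj


# A small made-up example: edges 1->2 with weight 5 and 2->3 with weight 7.
sample = [(1, 2, 5), (2, 3, 7)]
directed_matrix = weighted_matrix(3, sample, directed=True)
undirected_matrix = weighted_matrix(3, sample)
directed_lists = weighted_adjlist(3, sample, directed=True)
```

In the directed version each edge is recorded once, so directed_matrix[2][1] stays at the sentinel while undirected_matrix[2][1] holds the weight 5.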

It is often necessary to associate other information with the vertices or nodes of a graph to allow it to model more complicated objects or to save bookkeeping information in complicated algorithms. Extra information associated with each vertex can be accommodated by using auxiliary arrays indexed by vertex number (or by making adj an array of records in the adjacency structure representation). Extra information associated with each edge can be put in the adjacency list nodes (or in an array a of records in the adjacency matrix representation), or in auxiliary arrays indexed by edge number (this requires numbering the edges).

Depth-First Search

At the beginning of this chapter, we saw several natural questions that arise immediately when processing a graph. Is the graph connected? If not, what are its connected components? Does the graph have a cycle? These and many other problems can be easily solved with a technique called depth-first search, which is a natural way to “visit” every node and check every edge in the graph systematically. We’ll see in the chapters that follow that simple variations on a generalization of this method can be used to solve a variety of graph problems.

For now, we’ll concentrate on the mechanics of examining every piece of the graph in an organized way. Below is an implementation of depth-first search which fills in an array val[1..V] as it visits every vertex of the graph. The array is initially set to all zeros, so val[k]=0 indicates that vertex k has not yet been visited. The goal is to systematically visit all the vertices of the graph, setting the val entry for the nowth vertex visited to now, for now = 1, 2, ..., V. The program uses a recursive procedure visit which visits all the vertices in the same connected component as the vertex given in the argument. To visit a vertex, we check all its edges to see if they lead to vertices which haven’t yet been visited (as indicated by 0 val entries); if so, we visit them:


procedure dfs;
var now, k: integer;
    val: array[1..maxV] of integer;
  procedure visit(k: integer);
  var t: link;
  begin
  now:=now+1; val[k]:=now;
  t:=adj[k];
  while t<>z do
    begin
    if val[t^.v]=0 then visit(t^.v);
    t:=t^.next
    end
  end;
begin
now:=0;
for k:=1 to V do val[k]:=0;
for k:=1 to V do
  if val[k]=0 then visit(k)
end;

First visit is called for the first vertex, which results in nonzero val values being set for all the vertices connected to that vertex. Then dfs scans through the val array to find a zero entry (corresponding to a vertex that hasn’t been seen yet) and calls visit for that vertex, continuing in this way until all vertices have been visited.
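The procedure can be tried out on the sample graph with a short sketch in Python rather than Pascal. As assumptions for illustration, Python lists with front insertion stand in for the linked adjacency lists, the edges are supplied in the order listed at the beginning of the chapter, and dfs returns the val array instead of keeping it as a local variable.

```python
def index(v):
    """One-letter vertex name to integer: 'A' -> 1, 'B' -> 2, ..."""
    return ord(v) - ord("A") + 1


def name(i):
    """Integer back to one-letter vertex name."""
    return chr(ord("A") + i - 1)


def adjlist(V, edges):
    """Adjacency lists with each new entry inserted at the front."""
    adj = [[] for _ in range(V + 1)]
    for v1, v2 in edges:
        x, y = index(v1), index(v2)
        adj[y].insert(0, x)
        adj[x].insert(0, y)
    return adj


def dfs(V, adj):
    """Return val, where val[k] is the order in which vertex k was visited."""
    val = [0] * (V + 1)       # 0 means "not yet visited"; val[0] is unused
    now = 0

    def visit(k):
        nonlocal now
        now += 1
        val[k] = now
        for t in adj[k]:      # check every edge leaving k
            if val[t] == 0:
                visit(t)

    for k in range(1, V + 1):
        if val[k] == 0:       # start a new connected component
            visit(k)
    return val


EDGES = "AG AB AC LM JM JL JK ED FD HI FE AF GE".split()
val = dfs(13, adjlist(13, EDGES))
visit_order = "".join(sorted((name(k) for k in range(1, 14)),
                             key=lambda v: val[index(v)]))
```

For this input the vertices are visited in the order A F E G D C B H I J K L M, so the three connected components are explored one after another. A different edge order would give different adjacency lists and possibly a different visiting order.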

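The best way to follow the operation of depth-first search is to redraw the graph as indicated by the recursive calls during the visit procedure. This gives the following structure.

[Figure: the sample graph redrawn according to the recursive calls of visit, with each vertex labeled by its val value]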

Vertices in this structure are numbered with their val values: the vertices are

Title: Elementary Graph Algorithms