Parallel maximum independent set in convex bipartite graphs



Information Processing Letters 59 (1996) 289-294

Artur Czumaj a,*, Krzysztof Diks b,1, Teresa M. Przytycka c,2

a Heinz Nixdorf Institute and Department of Mathematics & Computer Science, University of Paderborn, D-33095 Paderborn, Germany

b Instytut Informatyki, Uniwersytet Warszawski, PL-02-097 Warszawa, Poland

c Department of Computer Science, University of Maryland, A.V. Williams Bldg., College Park, MD 20742, USA

Received 20 January 1995; revised 19 August 1996. Communicated by M.J. Atallah

Abstract

A bipartite graph $G = (V, W, E)$ is called convex if the vertices in $W$ can be ordered in such a way that the elements of $W$ adjacent to any vertex $v \in V$ form an interval (i.e., a sequence of consecutively numbered vertices). Such a graph can be represented in a compact form that requires $O(n)$ space, where $n = \max\{|V|, |W|\}$. Given a convex bipartite graph $G$ in the compact form, Dekel and Sahni designed an $O(\log^2 n)$-time, $n$-processor EREW PRAM algorithm to compute a maximum matching in $G$. We show that the matching produced by their algorithm can be used to construct optimally in parallel a maximum set of independent vertices. Our algorithm runs in $O(\log n)$ time with $n/\log n$ processors on an Arbitrary CRCW PRAM.

Keywords: Bipartite graphs; Convex graphs; Independent set; PRAM algorithms

1. Introduction

An independent set of a graph is a subset of its vertices such that no two vertices in the subset are adjacent. The problem of finding a maximum cardinality independent set (the MIS problem, for short) is one of the most fundamental problems in graph theory. If there are no restrictions on the input graph, the MIS problem is known to be NP-complete. However, in the case of bipartite graphs the MIS problem is closely related to the maximum matching problem and hence it can be solved in polynomial time [6].

A subset $M$ of edges of a graph $G = (V, E)$ is a matching if no two edges in $M$ are incident to the same vertex; $M$ is of maximum cardinality (or simply, a maximum matching) if it contains the maximum number of edges. The problem of finding a maximum cardinality matching is called the maximum matching problem.

* Corresponding author. Email: artur@uni-paderborn.de. Supported in part by DFG-Graduiertenkolleg "Parallele Rechnernetzwerke in der Produktionstechnik", ME 872/4-1.

1 Email: diks@mimuw.edu.pl. Partly supported by EC Cooperative Action K-1000 (project ALTEC: Algorithms for Future Technologies).

2 Email: przytyck@cs.umbs.edu.

In this paper we address the problem of finding in parallel a maximum independent set in a special class of graphs: convex bipartite graphs.

Let $G = (V, W, E)$ be an undirected bipartite graph, where $V$, $W$ are sets of vertices and $E$ is a set of edges of the form $(v, w)$, with $v \in V$ and $w \in W$. The graph $G$ is convex if there is an ordering "<" of the elements of $W$ such that the vertices of $W$ connected to any $v \in V$ form an interval in this ordering, i.e., for any $v \in V$ and $w_1, w, w_2 \in W$ such that $w_1 < w < w_2$,

$$(v, w_1) \in E \text{ and } (v, w_2) \in E \;\Rightarrow\; (v, w) \in E.$$

[Fig. 1. A convex bipartite graph and its compact representation.]

Let $n = \max\{|V|, |W|\}$. The number $n$ is called the size of $G$. Without loss of generality we will consider only graphs without isolated vertices.

Convex bipartite graphs were originally discussed by Glover [7] and next by Lipski and Preparata [9]. It is typical for applications involving convex bipartite graphs that the graph $G = (V, W, E)$ is given by specifying the ordering "<" and by specifying the endpoints $\mathit{beg}(v)$ and $\mathit{end}(v)$ of the interval of the elements of $W$ connected to $v$, for every $v \in V$. Observe that the size of such a representation does not depend on the number of edges in the graph. We call this representation compact. If additionally the vertices of $V$ are ordered with respect to their $\mathit{end}$ values, then such a representation is called the sorted compact representation. Note that the sorted compact representation can be obtained from the compact representation by integer sorting.

An example of a convex graph and its sorted compact representation is given in Fig. 1.

Given the sorted compact representation of a convex graph $G = (V, W, E)$, Dekel and Sahni [4] designed an $O(\log^2 n)$-time, $n$-processor EREW PRAM algorithm to compute a maximum matching in $G$. The algorithm of Dekel and Sahni is an example of a parallel greedy algorithm. It produces the (greedy) matching $M$, which has the following properties (recall that both sets $V$, $W$ are ordered):

• the smallest vertex in $V$ is matched with its smallest neighbor in $W$;

• if $v \in V$ is not the smallest vertex and it is matched in $M$ with a vertex $w \in W$ (i.e., $(v, w) \in M$), then $w$ is the smallest vertex among the neighbors of $v$ in $W$ not matched with any vertex $u \in V$ smaller than $v$.

In the case of convex bipartite graphs such a matching is a maximum one.
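To make the two greedy properties concrete, the following sequential Python sketch applies them directly to a sorted compact representation. It is not the parallel algorithm of Dekel and Sahni; the function name `greedy_matching`, the 0-based numbering of $V$ and the 1-based numbering of $W$ are assumptions made only for this illustration.

```python
def greedy_matching(beg, end, num_w):
    """Sequential sketch of the greedy matching rule (illustrative only).

    beg[v], end[v] are the inclusive endpoints (1-based) of the interval of W
    adjacent to vertex v of V; V is numbered 0..len(beg)-1 and is assumed to be
    sorted by nondecreasing end value (sorted compact representation).
    Returns (match_v, match_w): the partner of each vertex, or None.
    """
    assert all(end[k] <= end[k + 1] for k in range(len(end) - 1))
    match_v = [None] * len(beg)
    match_w = [None] * (num_w + 1)  # index 0 is unused
    for v in range(len(beg)):
        # Take the smallest neighbor not used by any smaller vertex of V.
        for w in range(beg[v], end[v] + 1):
            if match_w[w] is None:
                match_w[w] = v
                match_v[v] = w
                break
    return match_v, match_w


# Hypothetical example: four vertices in V, four in W.
beg = [1, 1, 2, 2]
end = [1, 2, 2, 4]
match_v, match_w = greedy_matching(beg, end, 4)
print(match_v)  # [1, 2, None, 3]
```

The inner loop scans the whole interval, so this sketch may take time proportional to the number of edges; it is meant to show the rule, not the running time discussed in the text.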

In the sequential setting, both a maximum matching and a maximum independent set in a convex bipartite graph can be computed in linear time [5,9].

In this paper, we show that given the greedy matching in a convex bipartite graph $G$ one can compute a maximum independent set in $G$ in $O(\log n)$ time with $n/\log n$ CRCW processors, where $n$ is the size of the input graph.

To this end we give a parallel implementation of the following well-known algorithm for computing a maximum independent set in a bipartite graph $G = (V, W, E)$ with a maximum matching $M$.

Algorithm MIS

1. Direct every edge $e \in M$ from $W$ to $V$ and every edge $e \in E \setminus M$ from $V$ to $W$.

2. Let $V_0$ be the set of unmatched vertices in $V$; find the sets $V_1 \subseteq V$ and $W_1 \subseteq W$ of vertices reachable from $V_0$ ($V_1$ includes $V_0$).

3. Construct the maximum independent set as $I = V_1 \cup (W \setminus W_1)$.
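The three steps of Algorithm MIS can be sketched sequentially as a breadth-first search over the directed graph. This is a plain illustration of the well-known algorithm, not the parallel method developed in this paper; the function name `mis_from_matching` and the array layout (0-based $V$, 1-based $W$, intervals given by `beg` and `end`) are assumptions.

```python
from collections import deque


def mis_from_matching(beg, end, num_w, match_v, match_w):
    """Sequential sketch of Algorithm MIS on a compactly represented graph.

    match_v[v] is the W-partner of v (or None); match_w[w] is the V-partner
    of w (or None, index 0 unused). The matching is assumed to be maximum.
    Returns (V1, W minus W1) as two sorted lists.
    """
    reached_v = [m is None for m in match_v]  # start from V0 (unmatched)
    reached_w = [False] * (num_w + 1)
    queue = deque(v for v, r in enumerate(reached_v) if r)
    while queue:
        v = queue.popleft()
        for w in range(beg[v], end[v] + 1):
            # The matched edge of v points from W to V, so v cannot leave by it.
            if w == match_v[v] or reached_w[w]:
                continue
            reached_w[w] = True
            u = match_w[w]  # the only outgoing edge of w is its matched edge
            if u is not None and not reached_v[u]:
                reached_v[u] = True
                queue.append(u)
    v1 = [v for v, r in enumerate(reached_v) if r]
    w_rest = [w for w in range(1, num_w + 1) if not reached_w[w]]
    return v1, w_rest


# Hypothetical example with a maximum matching of size 3.
beg, end = [1, 1, 2, 2], [1, 2, 2, 4]
match_v, match_w = [1, 2, None, 3], [None, 0, 1, 3, None]
print(mis_from_matching(beg, end, 4, match_v, match_w))  # ([0, 1, 2], [3, 4])
```

In the example the independent set has $|V| + |W| - |M| = 4 + 4 - 3 = 5$ vertices, as expected for a bipartite graph.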

Thus the problem is reduced to finding all the vertices in $G$ which are reachable from the unmatched vertices in $V$. In the parallel setting, this problem (for general bipartite graphs) falls into the group of problems with the so-called "transitive closure bottleneck" [8]. However, if the input graph is convex and represented in the sorted compact form, and if the matching $M$ is the greedy one, then, as we show, our reachability problem has an interesting structure that allows us to design an optimal parallel algorithm for the maximum independent set problem.

2. The algorithm

Consider a convex bipartite graph $G = (V, W, E)$ given in the sorted compact form. For simplicity of further considerations we assume that both $V$ and $W$ are given as the sequences of integers $1$ to $|V|$ and $1$ to $|W|$, respectively. Let $M$ be the greedy matching in $G$.

Direct every edge $e \in M$ from $W$ to $V$ and every edge $e \in E \setminus M$ from $V$ to $W$. (We do not assign the orientation to every single edge, as this would require $O(|E|)$ work, but simply add the corresponding information to the compact representation of the graph.) A directed edge going from a vertex $u$ to a vertex $w$ will be denoted by $u \to w$.

For a set $X \subseteq V \cup W$ let $N(X)$ denote the set of "outgoing" neighbors of the vertices in $X$, i.e.,

$$N(X) = \{\, w \mid u \in X \text{ and } u \to w \text{ is a directed edge in } G \,\}.$$

For every integer $k \ge 1$ let $R^k(X)$ be the set of vertices defined as follows:

$$R^k(X) = \begin{cases} N(X), & k = 1, \\ N(N(R^{k-1}(X))) \cup R^{k-1}(X), & k > 1. \end{cases}$$

Let $R(X)$ denote the set of all vertices reachable from $X$ in an odd number of steps. Observe that if $u$ is reachable from $X$ then it is reachable in at most $n - 1$ steps. Thus $R(X) = \bigcup_{k=1}^{\lceil n/2 \rceil} R^k(X)$.

For simplicity we will write $N(i)$, $R^k(i)$, $R(i)$ instead of $N(\{i\})$, $R^k(\{i\})$, $R(\{i\})$, respectively.

Let $V_0$ be the set of all unmatched vertices in $V$. Our goal, as pointed out in the Introduction, is to compute the sets $W_1 = R(V_0)$ and $V_1 = V_0 \cup N(W_1)$. It is sufficient to show how to compute $R(V_0)$, the set of all vertices in $W$ reachable from the unmatched vertices in $V$. (Given $W_1$, one can easily compute $V_1$ from $W_1$ and $M$ in constant time with $n$ processors.)

The basic idea of our approach is as follows. First, we show that for each $i \in V_0$, $R(i)$ is an interval whose second endpoint is $\mathit{end}(i)$. The first endpoints for all $i$ can be computed in $O(1)$ time with $n$ processors. Then, given a sequence of intervals sorted with respect to their $\mathit{end}$ values, we show how to compute the representation of the union of these intervals as a union of disjoint intervals. With this representation, we can decide whether $i \in R(V_0)$, for every vertex $i \in W$, in constant time with a linear number of processors.

In order to present our algorithm precisely we need the following lemmas. (Recall that the graph is given in the sorted compact form.)

Lemma 1. For every $i \in V_0$ and every integer $k \ge 1$ the elements of $R^k(i)$ form an interval $[b^k(i), e^k(i)]$ in $W$.

Proof. The proof is by induction on $k$.

If $k = 1$ then $R^1(i) = [b^1(i), e^1(i)]$, where $b^1(i) = \mathit{beg}(i)$ and $e^1(i) = \mathit{end}(i)$.

Assume that $k > 1$ and that the lemma holds for $k - 1$. Let $R^{k-1}(i) = [b^{k-1}(i), e^{k-1}(i)]$. For $j \in N(R^{k-1}(i))$ let $m(j)$ be the vertex for which $(j, m(j)) \in M$. Then $N(j) = [\mathit{beg}(j), \mathit{end}(j)] \setminus \{m(j)\}$. Thus, by the inductive hypothesis, it follows that $R^k(i) = [b^k(i), e^k(i)]$, where

$$b^k(i) = \min\bigl(\{b^{k-1}(i)\} \cup \{\mathit{beg}(j) \mid j \in N(R^{k-1}(i))\}\bigr)$$

and

$$e^k(i) = \max\bigl(\{e^{k-1}(i)\} \cup \{\mathit{end}(j) \mid j \in N(R^{k-1}(i))\}\bigr). \qquad \Box$$

Lemma 2. Let $i \in V_0$ and $k \ge 1$. If $j \in N(R^k(i))$ then $j < i$.

Proof. Induction on $k$.

$k = 1$: If $j \in N(R^1(i))$ then there is a unique vertex $w \in R^1(i)$ such that $(j, w) \in M$. Since $i$ is unmatched and $M$ is the greedy matching, $j < i$.

$k > 1$: Assume that the lemma holds for $k - 1$. Consider the set $N(R^k(i) \setminus R^{k-1}(i))$. If this set is empty then the lemma obviously holds for $k$. Suppose that it is non-empty and let $j \in N(R^k(i) \setminus R^{k-1}(i))$. Then there are $p \in R^{k-1}(i)$, $q \in N(R^{k-1}(i))$ and $r \in R^k(i) \setminus R^{k-1}(i)$ such that $(q, p) \in M$, $(q, r) \in E \setminus M$ and $(j, r) \in M$. By the induction hypothesis, $q < i$ and therefore $\mathit{end}(q) \le \mathit{end}(i)$. Since $R^{k-1}(i)$ is an interval in $W$, it follows that $r < p$. This observation and the fact that $M$ is the greedy matching imply $j < q$; otherwise the edge $(q, r)$ would belong to $M$ instead of $(q, p)$ and $(j, r)$. Hence $j < q < i$. $\Box$

Lemma 3. If $i \in V_0$ then $R(i) = [b(i), e(i)]$, where $b(i) = b^{\lceil n/2 \rceil}(i)$ and $e(i) = \mathit{end}(i)$.

Proof. The lemma follows immediately from Lemmas 1 and 2 and the fact that $\mathit{end}(j) \le \mathit{end}(i)$ for every $j < i$. $\Box$

Our next step is to show how to compute $b(i)$ for every $i \in V_0$ (recall that $e(i) = \mathit{end}(i)$).


For every $i \in V$ let $Q(i)$ be the set of all its neighbors (not only the outgoing ones) matched in $M$ and let $q(i)$ be a vertex in $N(Q(i))$ such that $\mathit{beg}(q(i)) = \min(\{\mathit{beg}(j) \mid j \in N(Q(i))\})$. Define a function $\mathit{next} : V \to V$ as follows:

$$\mathit{next}(i) = \begin{cases} i, & \mathit{beg}(i) \le \mathit{beg}(q(i)), \\ q(i), & \text{otherwise.} \end{cases}$$

Informally, $\mathit{next}(i)$ is the vertex of $V$ with the smallest $\mathit{beg}$ value that can be reached from the vertex $i$ in at most 2 steps via its matched neighbors in $W$.

Let $\mathit{next}^0(i) = i$, $\mathit{next}^k(i) = \mathit{next}(\mathit{next}^{k-1}(i))$ for every $k \ge 1$, and let $\mathit{next}^*(i) = j$ be such that $\mathit{next}(j) = j$ and $\mathit{next}^k(i) = j$ for some $k \ge 0$. Observe that if $i \ne \mathit{next}(i)$ then $\mathit{beg}(\mathit{next}(i)) < \mathit{beg}(i)$. Thus the pointers $\mathit{next}$ are the parent pointers of a rooted forest. Furthermore, the function $\mathit{next}$ has the following property.

Lemma 4. For every $i \in V_0$ and every integer $k \ge 1$,

$$R^k(i) = [\mathit{beg}(\mathit{next}^{k-1}(i)), \mathit{end}(i)].$$

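Proof. The proof is by induction on $k$.

Since $\mathit{next}^0(i) = i$ and $R^1(i) = [\mathit{beg}(i), \mathit{end}(i)]$, the lemma holds for $k = 1$.

Assume that $k > 1$ and the lemma holds for every positive integer $l < k$. By the induction hypothesis,

$$R^{k-1}(i) = [\mathit{beg}(\mathit{next}^{k-2}(i)), \mathit{end}(i)].$$

It follows from the definition of the function $\mathit{next}$ that $R^k(i) \supseteq [\mathit{beg}(\mathit{next}^{k-1}(i)), \mathit{end}(i)]$. Suppose that there is $j \in R^k(i) \setminus [\mathit{beg}(\mathit{next}^{k-1}(i)), \mathit{end}(i)]$. Then there are $p \in R^{k-1}(i)$, $t \in N(R^{k-1}(i))$ such that $(t, p) \in M$ and $(t, j) \in E \setminus M$. Notice that $\mathit{beg}(t) \le j < \mathit{beg}(\mathit{next}^{k-1}(i))$. Let $k_0 \le k - 1$ be the smallest positive integer such that $p \in R^{k_0}(i)$. By the induction hypothesis,

$$p \in [\mathit{beg}(\mathit{next}^{k_0-1}(i)), \mathit{end}(i)]$$

and

$$p \notin [\mathit{beg}(\mathit{next}^{k_0-2}(i)), \mathit{end}(i)].$$

Then

$$\mathit{beg}(\mathit{next}^{k_0}(i)) \le \mathit{beg}(t) \le j < \mathit{beg}(\mathit{next}^{k-1}(i)).$$

Since

$$\mathit{beg}(\mathit{next}^{k+1}(i)) \le \mathit{beg}(\mathit{next}^{k}(i))$$

for every $k \ge 0$, we get a contradiction. $\Box$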

Corollary 5. For every $i \in V_0$,

$$R(i) = [\mathit{beg}(\mathit{next}^*(i)), \mathit{end}(i)].$$

If all $\mathit{next}(j)$ are known then one can compute $\mathit{next}^*(j)$ in $O(\log n)$ time with $n/\log n$ processors using the tree contraction technique. We must be careful here, as tree contraction algorithms are usually presented in the context of a tree where each internal node has an associated list of its children. Unfortunately, in our application this is not the case. For completeness of the presentation we show, in the Appendix, a technique that allows us to avoid this restriction.

We conclude that computing the intervals $R(i)$ reduces to computing the function $\mathit{next}$. The function $\mathit{next}$ can be computed in $O(\log n)$ time with $n/\log n$ processors as follows.

Let $\widehat{W} = (\widehat{w}_1, \ldots, \widehat{w}_{|M|})$ be the increasing sequence of the vertices of $W$ matched in $M$. The sequence $\widehat{W}$ is easy to compute using the prefix computation. Moreover, using prefix computations one can easily compute two tables $A[1 \ldots |W|]$ and $C[1 \ldots |W|]$ such that for every $w \in W$, $A[w]$ is the largest index $j$ such that $\widehat{w}_j \le w$ and $C[w]$ is the smallest index $j$ such that $\widehat{w}_j \ge w$. For every $i \in V$ let $a(i)$ and $c(i)$ be the smallest and largest indices such that $\widehat{w}_{a(i)} \ge \mathit{beg}(i)$ and $\widehat{w}_{c(i)} \le \mathit{end}(i)$. Indices $a(i)$ and $c(i)$ can be computed with linear work using tables $A$ and $C$. Observe now that $q(i)$ is a vertex with the minimum $\mathit{beg}$ value among all vertices in $V$ matched with the vertices $\widehat{w}_{a(i)}, \widehat{w}_{a(i)+1}, \ldots, \widehat{w}_{c(i)}$. In order to compute $q(i)$ (and hence $\mathit{next}(i)$) for all $i \in V$, one can apply the algorithm for the range minimum searching problem [2]. It takes $O(\log n)$ time on an $n/\log n$-processor CREW PRAM.
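The computation of the function next described above can be sketched sequentially as follows. The prefix-count table plays the role of the tables $A$ and $C$, and a plain `min` over the candidate range stands in for the parallel range-minimum algorithm of [2]; the names `compute_next` and `next_star`, the 0-based numbering of $V$, and the convention that a vertex with no matched neighbor points to itself are assumptions of this sketch.

```python
def compute_next(beg, end, num_w, match_w):
    """Sequential sketch of computing next(i) for every i in V.

    beg/end: inclusive 1-based intervals of W for each vertex of V (0-based).
    match_w[w]: V-partner of w, or None (index 0 unused).
    """
    # Increasing sequence of matched W-vertices (1-based positions j = 1..|M|).
    matched = [w for w in range(1, num_w + 1) if match_w[w] is not None]
    # count[w] = number of matched vertices <= w, i.e. the table A[w];
    # the table C satisfies C[w] = count[w - 1] + 1.
    count = [0] * (num_w + 1)
    for w in range(1, num_w + 1):
        count[w] = count[w - 1] + (match_w[w] is not None)
    nxt = []
    for i in range(len(beg)):
        a = count[beg[i] - 1] + 1  # a(i) = C[beg(i)]
        c = count[end[i]]          # c(i) = A[end(i)]
        if a > c:                  # no matched neighbor (assumed: next(i) = i)
            nxt.append(i)
            continue
        partners = [match_w[matched[j - 1]] for j in range(a, c + 1)]
        q = min(partners, key=lambda v: beg[v])  # stand-in for range minimum
        nxt.append(i if beg[i] <= beg[q] else q)
    return nxt


def next_star(nxt):
    """Follow next pointers to the fixed point (root of each tree)."""
    def root(i):
        while nxt[i] != i:
            i = nxt[i]
        return i
    return [root(i) for i in range(len(nxt))]


beg, end = [1, 1, 2, 2], [1, 2, 2, 4]
match_w = [None, 0, 1, 3, None]
nxt = compute_next(beg, end, 4, match_w)
print(nxt)  # [0, 1, 1, 1]
star = next_star(nxt)
# For the unmatched vertex i = 2, Corollary 5 gives R(i) = [beg(next*(i)), end(i)].
print((beg[star[2]], end[2]))  # (1, 2)
```

Following the pointers one at a time in `next_star` is a sequential shortcut; the text instead relies on the tree contraction technique of the Appendix to obtain the stated parallel bounds.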

Thus, we know how to compute a representation of $R(V_0)$ as a union of $n$ intervals sorted with respect to the second endpoint. Our final step is to simplify this representation to a union of non-intersecting intervals.

Lemma 6. Let $I_1, \ldots, I_p$, where $I_i = [b_i, e_i]$, be a sequence of intervals such that for $i < j$, $e_i \le e_j$. Then the set of intervals $I'_1, \ldots, I'_s$ such that for any $i \ne j$, $I'_i \cap I'_j = \emptyset$ and $I'_1 \cup I'_2 \cup \cdots \cup I'_s = I_1 \cup I_2 \cup \cdots \cup I_p$ can be computed in $O(\log n)$ time with $n/\log n$ processors.

Proof. First, we eliminate every interval $I_j$ such that $I_j$ is contained in an interval $I_{j'}$ with $j' > j$. To find these intervals we consider the sequence of the first endpoints of the intervals $I_1, \ldots, I_p$. For each element in this sequence we find the closest dominating (i.e., not larger than the given element) successor. If such a successor exists, then the interval is eliminated. This can be done in $O(\log n)$ time with $n/\log n$ processors [2]. In this way we obtain a sequence of intervals sorted with respect to both endpoints. Now, the intervals $I'_1, \ldots, I'_s$ can be computed using the list ranking technique. $\Box$
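The two stages of the proof of Lemma 6 can be illustrated by a sequential sketch. A right-to-left scan with a running minimum replaces the parallel search for a dominating successor, and a left-to-right merge replaces list ranking; the function name `union_of_intervals` and the tuple representation are assumptions of this sketch.

```python
def union_of_intervals(intervals):
    """Union of integer intervals (b, e), given sorted by nondecreasing e.

    Returns pairwise disjoint intervals whose union equals the input union.
    """
    n = len(intervals)
    keep = [True] * n
    smallest_later_b = float("inf")
    # Stage 1: drop I_j if some later interval starts at or before b_j;
    # because e is nondecreasing, that later interval contains I_j.
    for j in range(n - 1, -1, -1):
        b, _ = intervals[j]
        if smallest_later_b <= b:
            keep[j] = False
        smallest_later_b = min(smallest_later_b, b)
    survivors = [iv for iv, k in zip(intervals, keep) if k]
    # Stage 2: survivors are sorted by both endpoints, so overlapping
    # intervals are consecutive and can be merged in one pass.
    merged = []
    for b, e in survivors:
        if merged and b <= merged[-1][1]:
            merged[-1][1] = e
        else:
            merged.append([b, e])
    return [tuple(iv) for iv in merged]


print(union_of_intervals([(1, 2), (2, 3), (3, 3), (6, 7), (5, 8)]))
# [(1, 3), (5, 8)]
```

In the example, the interval (6, 7) is eliminated in the first stage because the later interval (5, 8) contains it.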

Thus we can conclude the paper with the following theorem.

Theorem 7. Given (a sorted compact representation of) a convex bipartite graph $G = (V, W, E)$ of size $n$ and the greedy matching $M$, one can compute a maximum independent set of vertices in $G$ in $O(\log n)$ time using $n/\log n$ processors of a CRCW PRAM.

Appendix A. Tree contraction in the absence of an Euler tour

In this Appendix, we show how to solve optimally the rooting problem in a forest. We are given a forest $F$ with nodes $\{1, \ldots, n\}$ defined by the parent relation (i.e., each node $v$ has a pointer $p(v)$ to its parent). A node $v$ is a root if $p(v) = v$. The rooting problem is to find for each node $v$ the root $r(v)$ of the tree it belongs to.

Given an Euler tour of each tree (or given for every node the list of its children), the rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors using standard tree-contraction algorithms [10,1]. However, this technique cannot deal with unbounded degree trees when no Euler tour is given. We show an approach that circumvents this assumption and design an $O(\log n)$-time algorithm for the rooting problem that employs $O(n)$ operations.

Our algorithm consists of three phases. First, we reduce the problem of size $n$ to a problem of size $n/\log n$. Then we solve the smaller problem using the non-optimal algorithm of Miller and Reif [10], and finally we propagate the information from the smaller problem to all other nodes in the forest.

A.1. The first phase

The first phase reduces the number of vertices in the forest to at most $n/\log n$. Let $P = n/\log n$, let $A$ be the array containing all the nodes of $F$, let $B$ be an empty array of length $n$, and for each node $v$ let $\mathit{succ}(v) = p(v)$.

We repeat the following process until the size of $A$ is smaller than $n/\log n$.

Finding all leaves and chains. Split the nodes in $A$ into three groups: (i) the leaves $L$, that is, all nodes $v \in A$ such that for no $u \in A$, $\mathit{succ}(u) = v$; (ii) the nodes on chains $C$, that is, the nodes $v \in A - L$ such that for exactly one $u \in A$, $\mathit{succ}(u) = v$; and (iii) the other nodes. One can easily verify which node belongs to which of these sets in $O(|A|/P)$ time with $P$ processors, and then, using the prefix sums algorithm of Cole and Vishkin [3], rearrange them to store them in consecutive parts of $A$ in time $O(|A|/P + \log|A|/\log\log|A|)$ with $P$ processors.

Removing all leaves. If $v \in L$ then we set the pointer to the node which will find the root of $v$, $PT(v) = \mathit{succ}(v)$, and remove $v$ from the array $A$. All leaves are stored at the first free entries of the array $B$.

Halving the chains. $C$ is a collection of lists. Using the algorithm of Cole and Vishkin [3], find a maximal independent set $MIS$ in $C$ in $O(|C|/P + \log|C|/\log\log|C|)$ time with $P$ processors. Additionally we require that the last element from each list belongs to $MIS$.

• For each node $v \in C - MIS$, if $\mathit{succ}(v) \in MIS$ then $PT(v) = \mathit{succ}(v)$, and otherwise $PT(v) = \mathit{succ}(\mathit{succ}(v))$. Remove $v$ from $A$ and store all nodes from $C - MIS$ at the first free entries of $B$.

• For each node $v$ from $MIS$ that is not the last vertex on a chain, if $\mathit{succ}(\mathit{succ}(v)) \in MIS$ then set $\mathit{succ}(v) = \mathit{succ}(\mathit{succ}(v))$, and otherwise set $\mathit{succ}(v) = \mathit{succ}(\mathit{succ}(\mathit{succ}(v)))$.

Fact 8. Phase 1 can be performed in $O(\log n)$ time with $P = n/\log n$ processors.

Proof. Let $N_t$ denote the size of $A$ before iteration $t$. Standard arguments (see e.g. [10,1]) can be applied to show that $N_{t+1} \le c N_t$ for a constant $c < 1$. Hence $N_{t+1} \le n c^t$ and there are $O(\log\log n)$ iterations of the loop.

The running time of iteration $t$ is $O(N_t/P + \log N_t/\log\log N_t)$ with $P$ processors. Summing this together we get the running time bounded by

$$\sum_{t=1}^{O(\log\log n)} O\!\left(\frac{N_t}{P} + \frac{\log N_t}{\log\log N_t}\right) \le \frac{1}{P} \sum_{t=1}^{O(\log\log n)} O(N_t) + O(\log\log n) \cdot O\!\left(\frac{\log n}{\log\log n}\right) = O(\log n). \qquad \Box$$

A.2. The second phase

Now we perform a standard tree-contraction algorithm (e.g. [10]) for the forest defined by the relation $\mathit{succ}$ in the array $A$. It runs in $O(\log N)$ time and uses $N$ processors, where $N$ is the number of vertices in the forest. Since in our case $N = n/\log n$, this yields an $O(\log n)$-time, $n/\log n$-processor algorithm.

A.3. The third phase

In this step we have to combine the information computed for the nodes that were left after Phase 1 to obtain the pointers to the root for all the nodes in $F$. Observe that the nodes stored in $B$ are ordered with respect to the time when they were removed from $A$. This gives us a partition of $B$ into blocks of nodes that were removed at the same iteration. Since there are $O(\log\log n)$ blocks, we can analyze them successively, one by one, in the reverse order of the time when the nodes from the given block were removed. Then, using the information about the root of all the nodes in the already analyzed part of $B$, we can compute $r(v)$ using the pointer $PT(v)$. If $B_i$ denotes the size of the $i$th block, then each block can be analyzed in constant time using $B_i$ processors, or in $O(B_i/P)$ time using $P$ processors. Summing this over all blocks we get the $O(\log n)$ running time of the third phase with $P = n/\log n$ processors.
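The bookkeeping of Phases 1 and 3 can be illustrated with a deliberately simplified sequential sketch. It peels off leaves round by round, records $PT(v)$ for every removed node, keeps the removed nodes in blocks (the array $B$), and then resolves roots block by block in reverse order. Chain halving and the tree contraction of Phase 2 are omitted, so the sketch keeps peeling until only roots remain and does not achieve the round bound claimed above; the name `find_roots` and the 0-based node numbering are assumptions.

```python
def find_roots(parent):
    """Simplified sequential sketch of the rooting problem.

    parent[v] is p(v); a node v is a root if parent[v] == v.
    Returns a list r with r[v] = root of the tree containing v.
    """
    n = len(parent)
    alive = set(range(n))
    pt = {}      # PT(v) for every removed node v
    blocks = []  # blocks of B: nodes removed in the same round
    while True:
        # Nodes that still have a child among the remaining nodes.
        has_child = {parent[v] for v in alive if parent[v] != v}
        leaves = [v for v in alive if parent[v] != v and v not in has_child]
        if not leaves:
            break  # only roots remain
        for v in leaves:
            pt[v] = parent[v]
        alive.difference_update(leaves)
        blocks.append(leaves)
    root = {v: v for v in alive}
    # Phase 3: process blocks in reverse order of removal, so that PT(v)
    # is always resolved before v.
    for block in reversed(blocks):
        for v in block:
            root[v] = root[pt[v]]
    return [root[v] for v in range(n)]


# Two trees: {0, 1, 2, 3} rooted at 0 and {4, 5} rooted at 4.
print(find_roots([0, 0, 1, 1, 4, 4]))  # [0, 0, 0, 0, 4, 4]
```

Without chain halving a long path is peeled one node per round, which is exactly the case the halving step of Phase 1 is designed to avoid.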

This finally leads to the following theorem.

Theorem 9. The rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors on a CRCW PRAM.

References

[1] K. Abrahamson, N. Dadoun, D.G. Kirkpatrick and T. Przytycka, A simple parallel tree contraction algorithm, J. Algorithms 10 (1989) 287-302.

[2] O. Berkman, B. Schieber and U. Vishkin, Optimal doubly logarithmic parallel algorithms based on finding all nearest smaller values, J. Algorithms 14 (1993).

[3] R. Cole and U. Vishkin, Faster optimal parallel prefix sums and list ranking, Inform. and Comput. 81 (3) (1989) 334-352.

[4] E. Dekel and S. Sahni, A parallel matching algorithm for convex bipartite graphs and applications to scheduling, J. Parallel Distributed Comput. 1 (1984) 185-205.

[5] H.N. Gabow and R.E. Tarjan, A linear-time algorithm for a special case of disjoint set union, J. Comput. System Sci. 30 (1985) 209-221.

[6] F. Gavril, Testing for equality between maximum matching and minimum node covering, Inform. Process. Lett. 6 (1977) 199-202.

[7] F. Glover, Maximum matching in a convex bipartite graph, Naval Res. Logist. Quart. 14 (1967) 313-316.

[8] R.M. Karp and V. Ramachandran, A survey of parallel algorithms for shared-memory machines, in: J. van Leeuwen, ed., Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity (Elsevier, Amsterdam, 1990) Chapter 17, pp. 869-941.

[9] W. Lipski and F.P. Preparata, Efficient algorithms for finding maximum matchings in convex bipartite graphs and related problems, Acta Inform. 15 (1981) 329-346.

[10] G.L. Miller and J.H. Reif, Parallel tree contraction, in: Proc. 26th IEEE Symp. on Foundations of Computer Science (1985) 478-489.
