Information Processing Letters 59 (1996) 289-294
Parallel maximum independent set in convex bipartite graphs
Artur Czumaj a,*, Krzysztof Diks b,1, Teresa M. Przytycka c,2
a Heinz Nixdorf Institute and Department of Mathematics & Computer Science, University of Paderborn, D-33095 Paderborn, Germany
b Instytut Informatyki, Uniwersytet Warszawski, PL-02-097 Warszawa, Poland
c Department of Computer Science, University of Maryland, A.V. Williams Bldg., College Park, MD 20742, USA
Received 20 January 1995; revised 19 August 1996. Communicated by M.J. Atallah
Abstract
A bipartite graph $G = (V, W, E)$ is called convex if the vertices in $W$ can be ordered in such a way that the elements of $W$ adjacent to any vertex $v \in V$ form an interval (i.e., a sequence of consecutively numbered vertices). Such a graph can be represented in a compact form that requires $O(n)$ space, where $n = \max\{|V|, |W|\}$. Given a convex bipartite graph $G$ in the compact form, Dekel and Sahni designed an $O(\log^2 n)$-time, $n$-processor EREW PRAM algorithm to compute a maximum matching in $G$. We show that the matching produced by their algorithm can be used to construct optimally in parallel a maximum set of independent vertices. Our algorithm runs in $O(\log n)$ time with $n/\log n$ processors on an Arbitrary CRCW PRAM.
Keywords: Bipartite graphs; Convex graphs; Independent set; PRAM algorithms
1. Introduction
An independent set of a graph is a subset of its vertices such that no two vertices in the subset are adjacent. The problem of finding a maximum cardinality independent set (the MIS problem, for short) is one of the most fundamental problems in graph theory. If there are no restrictions on the input graph, the MIS problem is known to be NP-complete. However, in the case of bipartite graphs the MIS problem is closely related to the maximum matching problem and hence it can be solved in polynomial time [6].
* Corresponding author. Email: artur@uni-paderborn.de. Supported in part by DFG-Graduiertenkolleg "Parallele Rechnernetzwerke in der Produktionstechnik", ME 872/4-1.
1 Email: diks@mimuw.edu.pl. Partly supported by EC Cooperative Action IC-1000 (project ALTEC: Algorithms for Future Technologies).
2 Email: przytyck@cs.umbs.edu.
A subset $M$ of edges of a graph $G = (V, E)$ is a matching if no two edges in $M$ are incident to the same vertex; $M$ is of maximum cardinality (or simply, a maximum matching) if it contains the maximum number of edges. The problem of finding a maximum cardinality matching is called the maximum matching problem.
In this paper we address the problem of finding in parallel a maximum independent set in a special class of graphs: convex bipartite graphs.
Let $G = (V, W, E)$ be an undirected bipartite graph, where $V$, $W$ are sets of vertices and $E$ is a set of edges of the form $(v, w)$, with $v \in V$ and $w \in W$. The graph $G$ is convex if there is an ordering "<" of the elements of $W$ such that the vertices of $W$ connected to any $v \in V$ form an interval in this ordering, i.e., for any $v \in V$ and $w_1, w, w_2 \in W$ such that $w_1 < w < w_2$,
$(v, w_1) \in E$ and $(v, w_2) \in E \Rightarrow (v, w) \in E$.
Fig. 1. A convex bipartite graph and its compact representation.
0020-0190/96/$12.00 Copyright © 1996 Published by Elsevier Science B.V. All rights reserved.
Let $n = \max\{|V|, |W|\}$. The number $n$ is called the size of $G$. Without loss of generality we will consider only graphs without isolated vertices.
Convex bipartite graphs were originally discussed by Glover [7] and subsequently by Lipski and Preparata [9]. It is typical for applications involving convex bipartite graphs that the graph $G = (V, W, E)$ is given by specifying the ordering "<" and by specifying the endpoints $\mathrm{beg}(v)$ and $\mathrm{end}(v)$ of the interval of the elements of $W$ connected to $v$, for every $v \in V$. Observe that the size of such a representation does not depend on the number of edges in the graph. We call this representation compact. If additionally the vertices of $V$ are ordered with respect to the $\mathrm{end}$ values, then such a representation is called the sorted compact representation. Note that the sorted compact representation can be obtained from the compact representation by integer sorting.
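As an illustration of the compact representation, the following minimal Python sketch stores a graph as two arrays of interval endpoints and sorts $V$ by end value with a counting sort. The function names and the indexing convention (vertices of $V$ are 0-based list indices, vertices of $W$ are the integers 1 to $|W|$) are assumptions of this sketch, not notation from the paper.

```python
def sort_by_end(beg, end, w_size):
    """Stable counting sort of the vertices of V by their end value.

    beg[v] and end[v] are the 1-based endpoints of the interval of W
    adjacent to vertex v; v itself is a 0-based index (sketch convention).
    """
    buckets = [[] for _ in range(w_size + 1)]
    for v, e in enumerate(end):
        buckets[e].append(v)
    order = [v for bucket in buckets for v in bucket]
    return [beg[v] for v in order], [end[v] for v in order], order


def expand_edges(beg, end):
    """List every edge (v, w) encoded by the compact representation."""
    return [(v, w) for v in range(len(beg)) for w in range(beg[v], end[v] + 1)]


# Three vertices in V, four in W; storage is O(n) regardless of |E|.
beg, end = [2, 1, 1], [4, 2, 3]
sorted_beg, sorted_end, order = sort_by_end(beg, end, w_size=4)
```

Expanding the edges is only done here to make the encoding visible; the algorithms discussed in the paper never materialize the edge list.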
An example of a convex graph and its sorted compact representation is given in Fig. 1.
Given the sorted compact representation of a convex graph $G = (V, W, E)$, Dekel and Sahni [4] designed an $O(\log^2 n)$-time, $n$-processor EREW PRAM algorithm to compute a maximum matching in $G$. The algorithm of Dekel and Sahni is an example of a parallel greedy algorithm. It produces the (greedy) matching $M$, which has the following properties (recall that both sets $V$, $W$ are ordered):
• the smallest vertex in $V$ is matched with its smallest neighbor in $W$;
• if $v \in V$ is not the smallest vertex and it is matched in $M$ with a vertex $w \in W$ (i.e., $(v, w) \in M$), then $w$ is the smallest vertex among the neighbors of $v$ in $W$ not matched with any vertex $v' \in V$ smaller than $v$.
In the case of convex bipartite graphs such a matching is a maximum one.
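The two properties of the greedy matching can be read as a simple sequential procedure. The sketch below is a direct, quadratic-time transcription of those properties and is not the parallel algorithm of Dekel and Sahni; the function name and the use of `None` for unmatched vertices are assumptions made for illustration.

```python
def greedy_matching(beg, end):
    """Sequentially build the greedy matching for a sorted compact representation.

    beg[v], end[v] are 1-based endpoints in W; the vertices of V are assumed
    to be listed in non-decreasing order of end. Returns mate[v], the vertex
    of W matched with v, or None if v stays unmatched.
    """
    used = set()   # vertices of W already matched with a smaller vertex of V
    mate = []
    for b, e in zip(beg, end):
        w = b
        while w <= e and w in used:   # skip neighbors taken by smaller vertices
            w += 1
        if w <= e:
            used.add(w)
            mate.append(w)
        else:
            mate.append(None)
    return mate


mate = greedy_matching([1, 1, 2, 3], [2, 2, 3, 3])
```

In this small instance the first three vertices of $V$ take the vertices 1, 2, 3 of $W$ and the last vertex of $V$ remains unmatched.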
In the sequential setting, both a maximum matching and a maximum independent set in a convex bipartite graph can be computed in linear time [5,9].
In this paper, we show that given the greedy matching in a convex bipartite graph $G$ one can compute a maximum independent set in $G$ in $O(\log n)$ time with $n/\log n$ CRCW processors, where $n$ is the size of the input graph.
To this end we give a parallel implementation of the following well-known algorithm for computing a maximum independent set in a bipartite graph $G = (V, W, E)$ with a maximum matching $M$.
Algorithm MIS
1. Direct every edge $e \in M$ from $W$ to $V$ and every edge $e \in E \setminus M$ from $V$ to $W$.
2. Let $V_0$ be the set of unmatched vertices in $V$; find the sets $V_1 \subseteq V$ and $W_1 \subseteq W$ of vertices reachable from $V_0$ ($V_1$ includes $V_0$).
3. Construct the maximum independent set as $I = V_1 \cup (W \setminus W_1)$.
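The steps of Algorithm MIS can be sketched sequentially for a general bipartite graph as a breadth-first search over the directed edges. The adjacency-list input, the 0-based vertex numbering on both sides, and the function name are assumptions of this sketch; a maximum matching is taken as given.

```python
from collections import deque


def mis_from_matching(adj, w_size, mate_v):
    """Sequential sketch of Algorithm MIS.

    adj[v]    : neighbors in W (0-based) of vertex v in V
    mate_v[v] : vertex of W matched with v in a maximum matching, or None
    Returns (V_1, W minus W_1); their union is the independent set I.
    """
    mate_w = {w: v for v, w in enumerate(mate_v) if w is not None}
    v1 = {v for v, w in enumerate(mate_v) if w is None}   # V_0, the unmatched vertices
    w1 = set()
    queue = deque(v1)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w == mate_v[v] or w in w1:   # only edges of E \ M lead from V to W
                continue
            w1.add(w)
            u = mate_w.get(w)               # the matched edge leads from W back to V
            if u is not None and u not in v1:
                v1.add(u)
                queue.append(u)
    return v1, set(range(w_size)) - w1


adj = [[0], [0, 1], [1]]
v_part, w_part = mis_from_matching(adj, w_size=2, mate_v=[0, 1, None])
```

For this path-like instance every vertex of $V$ is reachable, so the set returned is all of $V$, whose size equals $|V| + |W| - |M| = 3$.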
Thus the problem is reduced to finding all the vertices in $G$ which are reachable from the unmatched vertices in $V$. In the parallel setting, this problem (for general bipartite graphs) falls into the group of problems with the so-called "transitive closure bottleneck" [8]. However, if the input graph is convex and represented in the sorted compact form, and if the matching $M$ is the greedy one, then, as we show, our reachability problem has an interesting structure that allows us to design an optimal parallel algorithm for the maximum independent set problem.
2. The algorithm
Consider a convex bipartite graph $G = (V, W, E)$ given in the sorted compact form. For simplicity of further considerations we assume that both $V$ and $W$ are given as the sequences of integers 1 to $|V|$ and 1 to $|W|$, respectively. Let $M$ be the greedy matching in $G$.
Direct every edge $e \in M$ from $W$ to $V$ and every edge $e \in E \setminus M$ from $V$ to $W$. (We do not assign the orientation to every single edge, as this would require $O(|E|)$ work, but simply add the corresponding information to the compact representation of the graph.) A directed edge going from a vertex $u$ to a vertex $w$ will be denoted by $u \to w$.
For a set $X \subseteq V \cup W$ let $N(X)$ denote the set of "outgoing" neighbors of the vertices in $X$, i.e.,
$N(X) = \{w \mid u \in X \text{ and } u \to w \text{ is a directed edge in } G\}$.
For every integer $k \ge 1$ let $R^k(X)$ be the set of vertices defined as follows:
$R^k(X) = N(X)$ if $k = 1$, and $R^k(X) = N(N(R^{k-1}(X))) \cup R^{k-1}(X)$ if $k > 1$.
Let $R(X)$ denote the set of all vertices reachable from $X$ in an odd number of steps. Observe that if $u$ is reachable from $X$ then it is reachable in at most $n - 1$ steps. Thus $R(X) = \bigcup_{k=1}^{\lceil n/2 \rceil} R^k(X)$.
For simplicity we will write $N(i)$, $R^k(i)$, $R(i)$ instead of $N(\{i\})$, $R^k(\{i\})$, $R(\{i\})$, respectively.
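To make the sets $R^k(i)$ concrete, the following sketch iterates the definition on a compact representation with a given matching until the set stops growing. It works with explicit sets, so it costs far more than the algorithm developed in this section; the function names and the indexing (0-based $V$, 1-based $W$) are assumptions for illustration only.

```python
def out_neighbors(v, beg, end, mate):
    """N(v) for v in V: all neighbors in W except the matched one."""
    return {w for w in range(beg[v], end[v] + 1) if w != mate[v]}


def reach_levels(i, beg, end, mate):
    """Return [R^1(i), R^2(i), ...] up to the first level that adds nothing new."""
    mate_w = {w: v for v, w in enumerate(mate) if w is not None}
    current = out_neighbors(i, beg, end, mate)          # R^1(i) = N(i)
    levels = [current]
    while True:
        back = {mate_w[w] for w in current if w in mate_w}   # N(R^{k-1}(i)), a subset of V
        grown = set(current)
        for v in back:
            grown |= out_neighbors(v, beg, end, mate)        # N(N(R^{k-1}(i)))
        if grown == current:
            return levels
        levels.append(grown)
        current = grown


beg, end, mate = [1, 1, 2, 3], [2, 2, 3, 3], [1, 2, 3, None]
levels = reach_levels(3, beg, end, mate)
```

In this instance each level is a set of consecutive integers ending at `end[3]`, which is the interval structure established by the lemmas below.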
Let $V_0$ be the set of all unmatched vertices in $V$. Our goal, as pointed out in the Introduction, is to compute the sets $W_1 = R(V_0)$ and $V_1 = V_0 \cup N(W_1)$. It is sufficient to show how to compute $R(V_0)$, the set of all vertices in $W$ reachable from the unmatched vertices in $V$. (Given $W_1$, one can easily compute $V_1$ from $W_1$ and $M$ in constant time with $n$ processors.)
The basic idea of our approach is as follows. First, we show that for each $i \in V_0$, $R(i)$ is an interval whose second endpoint is $\mathrm{end}(i)$. The first endpoints for all $i$ can be computed in $O(1)$ time with $n$ processors. Then, given a sequence of intervals sorted with respect to their end values, we show how to compute a representation of the union of these intervals as a union of disjoint intervals. With this representation, we can decide whether $i \in R(V_0)$, for every vertex $i \in W$, in constant time with a linear number of processors.
In order to present our algorithm precisely we need the following lemmas. (Recall that the graph is given in the sorted compact form.)
Lemma 1. For every $i \in V_0$ and every integer $k \ge 1$ the elements of $R^k(i)$ form an interval $[b^k(i), e^k(i)]$ in $W$.
Proof. The proof is by induction on $k$.
If $k = 1$ then $R^1(i) = [b^1(i), e^1(i)]$, where $b^1(i) = \mathrm{beg}(i)$ and $e^1(i) = \mathrm{end}(i)$. Assume that $k > 1$ and that the lemma holds for $k - 1$. Let $R^{k-1}(i) = [b^{k-1}(i), e^{k-1}(i)]$. For $j \in N(R^{k-1}(i))$ let $m(j)$ be the vertex for which $(j, m(j)) \in M$. Then $N(j) = [\mathrm{beg}(j), \mathrm{end}(j)] \setminus \{m(j)\}$. Thus, by the inductive hypothesis, it follows that $R^k(i) = [b^k(i), e^k(i)]$, where
$b^k(i) = \min(\{b^{k-1}(i)\} \cup \{\mathrm{beg}(j) \mid j \in N(R^{k-1}(i))\})$, and
$e^k(i) = \max(\{e^{k-1}(i)\} \cup \{\mathrm{end}(j) \mid j \in N(R^{k-1}(i))\})$. □
Lemma 2. Let $i \in V_0$ and $k \ge 1$. If $j \in N(R^k(i))$ then $j < i$.
Proof. Induction on $k$.
$k = 1$: If $j \in N(R^1(i))$ then there is a unique vertex $w \in R^1(i)$ such that $(j, w) \in M$. Since $i$ is unmatched and $M$ is the greedy matching, $j < i$.
$k > 1$: Assume that the lemma holds for $k - 1$. Consider the set $N(R^k(i) \setminus R^{k-1}(i))$. If this set is empty then the lemma obviously holds for $k$. Suppose that it is non-empty and let $j \in N(R^k(i) \setminus R^{k-1}(i))$. Then there are $p \in R^{k-1}(i)$, $q \in N(R^{k-1}(i))$ and $r \in R^k(i) \setminus R^{k-1}(i)$ such that $(q, p) \in M$, $(q, r) \in E \setminus M$ and $(j, r) \in M$. By the induction hypothesis, $q < i$ and therefore $\mathrm{end}(q) \le \mathrm{end}(i)$. Since $R^{k-1}(i)$ is an interval in $W$, it follows that $r < p$. This observation and the fact that $M$ is the greedy matching imply $j < q$; otherwise the edge $(q, r)$ would belong to $M$ instead of $(q, p)$ and $(j, r)$. □
Lemma 3. If $i \in V_0$ then $R(i) = [b(i), e(i)]$, where $b(i) = b^{\lceil n/2 \rceil}(i)$ and $e(i) = \mathrm{end}(i)$.
Proof. The lemma follows immediately from Lemmas 1, 2 and the fact that $\mathrm{end}(j) \le \mathrm{end}(i)$ for every $j < i$. □
Our next step is to show how to compute $b(i)$ for every $i \in V_0$ (recall that $e(i) = \mathrm{end}(i)$).
For every $i \in V$ let $Q(i)$ be the set of all its neighbors (not only the outgoing ones) matched in $M$ and let $q(i)$ be a vertex in $N(Q(i))$ such that $\mathrm{beg}(q(i)) = \min(\{\mathrm{beg}(j) \mid j \in N(Q(i))\})$. Define a function $\mathrm{next} : V \to V$ as follows:
$\mathrm{next}(i) = i$ if $\mathrm{beg}(i) \le \mathrm{beg}(q(i))$, and $\mathrm{next}(i) = q(i)$ otherwise.
Informally, $\mathrm{next}(i)$ is the vertex of $V$ with the smallest $\mathrm{beg}$ value that can be reached from the vertex $i$ in at most 2 steps via its matched neighbors in $W$.
Let $\mathrm{next}^0(i) = i$, $\mathrm{next}^k(i) = \mathrm{next}(\mathrm{next}^{k-1}(i))$ for every $k \ge 1$, and let $\mathrm{next}^*(i) = j$ be such that $\mathrm{next}(j) = j$ and $\mathrm{next}^k(i) = j$ for some $k \ge 0$. Observe that if $i \ne \mathrm{next}(i)$ then $\mathrm{beg}(\mathrm{next}(i)) < \mathrm{beg}(i)$. Thus the pointers $\mathrm{next}$ are the parent pointers of a rooted forest. Furthermore, the function $\mathrm{next}$ has the following property.
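Lemma 4. For every $i \in V_0$ and every integer $k \ge 1$,
$R^k(i) = [\mathrm{beg}(\mathrm{next}^{k-1}(i)), \mathrm{end}(i)]$.
Proof. The proof is by induction on $k$.
Since $\mathrm{next}^0(i) = i$ and $R^1(i) = [\mathrm{beg}(i), \mathrm{end}(i)]$, the lemma holds for $k = 1$.
Assume that $k > 1$ and the lemma holds for every positive integer $l < k$. By the induction hypothesis,
$R^{k-1}(i) = [\mathrm{beg}(\mathrm{next}^{k-2}(i)), \mathrm{end}(i)]$.
It follows from the definition of the function $\mathrm{next}$ that $R^k(i) \supseteq [\mathrm{beg}(\mathrm{next}^{k-1}(i)), \mathrm{end}(i)]$. Suppose that there is $j \in R^k(i) \setminus [\mathrm{beg}(\mathrm{next}^{k-1}(i)), \mathrm{end}(i)]$. Then there are $p \in R^{k-1}(i)$, $t \in N(R^{k-1}(i))$ such that $(t, p) \in M$ and $(t, j) \in E \setminus M$. Notice that $\mathrm{beg}(t) \le j < \mathrm{beg}(\mathrm{next}^{k-1}(i))$. Let $k_0 \le k - 1$ be the smallest positive integer such that $p \in R^{k_0}(i)$. By the induction hypothesis,
$p \in [\mathrm{beg}(\mathrm{next}^{k_0-1}(i)), \mathrm{end}(i)]$
and
$p \notin [\mathrm{beg}(\mathrm{next}^{k_0-2}(i)), \mathrm{end}(i)]$.
Then
$\mathrm{beg}(\mathrm{next}^{k_0}(i)) \le \mathrm{beg}(t) \le j < \mathrm{beg}(\mathrm{next}^{k-1}(i))$.
Since
$\mathrm{beg}(\mathrm{next}^{k+1}(i)) \le \mathrm{beg}(\mathrm{next}^{k}(i))$
for every $k \ge 0$, we get a contradiction. □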
Corollary 5. For every $i \in V_0$,
$R(i) = [\mathrm{beg}(\mathrm{next}^*(i)), \mathrm{end}(i)]$.
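The definitions of $\mathrm{next}$ and $\mathrm{next}^*$ and the statement of Corollary 5 can be checked on a small instance with the sequential sketch below. It scans each interval directly and follows parent pointers with memoization, which is simpler but less efficient than the range-minimum and tree-contraction techniques used in the paper; all function names and the indexing (0-based $V$, 1-based $W$) are assumptions of the sketch.

```python
def next_pointers(beg, end, mate):
    """next(i): i itself, or the matched partner q(i) with the smallest beg if that is smaller."""
    mate_w = {w: v for v, w in enumerate(mate) if w is not None}
    nxt = list(range(len(beg)))
    for i in range(len(beg)):
        # N(Q(i)): vertices of V matched with neighbors of i
        partners = [mate_w[w] for w in range(beg[i], end[i] + 1) if w in mate_w]
        if partners:
            q = min(partners, key=lambda j: beg[j])
            if beg[q] < beg[i]:
                nxt[i] = q
    return nxt


def next_star(nxt):
    """Root of every vertex in the forest given by the parent pointers nxt."""
    root = [None] * len(nxt)
    for i in range(len(nxt)):
        path, j = [], i
        while root[j] is None and nxt[j] != j:
            path.append(j)
            j = nxt[j]
        r = j if root[j] is None else root[j]
        root[j] = r
        for p in path:          # memoize the answer along the walked path
            root[p] = r
    return root


beg, end, mate = [1, 1, 2, 3], [2, 2, 3, 3], [1, 2, 3, None]
nxt = next_pointers(beg, end, mate)
star = next_star(nxt)
# Interval R(i) for every unmatched vertex i, following Corollary 5.
reach = {i: (beg[star[i]], end[i]) for i, w in enumerate(mate) if w is None}
```

Here the only unmatched vertex reaches the whole of $W$, so $W \setminus W_1$ is empty and the independent set consists of all of $V$.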
If all $\mathrm{next}(j)$ are known then one can compute $\mathrm{next}^*(j)$ in $O(\log n)$ time with $n/\log n$ processors using the tree contraction technique. We must be careful here, as tree contraction algorithms are usually presented in the context of a tree where each internal node has an associated list of its children. Unfortunately, in our application this is not the case. For completeness of the presentation we show, in the Appendix, a technique that allows us to avoid this restriction.
We conclude that computing the intervals $R(i)$ reduces to computing the function $\mathrm{next}$. The function $\mathrm{next}$ can be computed in $O(\log n)$ time with $n/\log n$ processors as follows.
Let $\hat{W} = (w_1, \ldots, w_{|M|})$ be the increasing sequence of the vertices of $W$ matched in $M$. $\hat{W}$ is easy to compute using the prefix computation. Moreover, using prefix computations one can easily compute two tables $A[1..|W|]$ and $C[1..|W|]$ such that for every $w \in W$, $A[w]$ is the largest index $j$ such that $w_j \le w$ and $C[w]$ is the smallest index $j$ such that $w_j \ge w$. For every $i \in V$ let $a(i)$ and $c(i)$ be the smallest and largest indices such that $w_{a(i)} \ge \mathrm{beg}(i)$ and $w_{c(i)} \le \mathrm{end}(i)$. The indices $a(i)$ and $c(i)$ can be computed with linear work using the tables $A$ and $C$. Observe now that $q(i)$ is a vertex with the minimum $\mathrm{beg}$ value among all vertices in $V$ matched with the vertices $w_{a(i)}, w_{a(i)+1}, \ldots, w_{c(i)}$. In order to compute $q(i)$ (and hence $\mathrm{next}(i)$) for all $i \in V$, one can apply the algorithm for the range minimum searching problem [2]. It takes $O(\log n)$ time on an $n/\log n$-processor CREW PRAM.
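The reduction of $q(i)$ to range-minimum queries over the sequence of matched vertices of $W$ can be sketched sequentially as follows. A sparse table stands in for the parallel range-minimum algorithm of [2], and binary search stands in for the tables $A$ and $C$; both substitutions, the function names, and the indexing (0-based $V$, 1-based $W$) are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def build_sparse(keys):
    """table[l][j] = index of a minimum of keys[j : j + 2**l] (leftmost on ties)."""
    table = [list(range(len(keys)))]
    span = 1
    while 2 * span <= len(keys):
        prev = table[-1]
        table.append([min(prev[j], prev[j + span], key=keys.__getitem__)
                      for j in range(len(keys) - 2 * span + 1)])
        span *= 2
    return table


def range_argmin(table, keys, lo, hi):
    """Index of a minimum of keys[lo..hi] (inclusive) via two overlapping blocks."""
    level = (hi - lo + 1).bit_length() - 1
    left = table[level][lo]
    right = table[level][hi - (1 << level) + 1]
    return left if keys[left] <= keys[right] else right


def q_values(beg, end, mate):
    """q(i) for every i in V, or None when no neighbor of i is matched."""
    matched = sorted((w, v) for v, w in enumerate(mate) if w is not None)
    ws = [w for w, _ in matched]          # increasing sequence of matched vertices of W
    owners = [v for _, v in matched]      # their partners in V
    keys = [beg[v] for v in owners]       # beg values to minimize over
    table = build_sparse(keys)
    result = []
    for b, e in zip(beg, end):
        a = bisect_left(ws, b)            # smallest index with ws[a] >= beg(i)
        c = bisect_right(ws, e) - 1       # largest index with ws[c] <= end(i)
        result.append(owners[range_argmin(table, keys, a, c)] if a <= c else None)
    return result


beg, end, mate = [1, 1, 2, 3], [2, 2, 3, 3], [1, 2, 3, None]
q = q_values(beg, end, mate)
nxt = [i if qi is None or beg[i] <= beg[qi] else qi for i, qi in enumerate(q)]
```

The last line applies the definition of $\mathrm{next}$ to the computed values of $q$.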
Thus, we know how to compute a representation of $R(V_0)$ as a union of $n$ intervals sorted with respect to the second endpoint. Our final step is to simplify this representation to a union of non-intersecting intervals.
Lemma 6. Let $I_1, \ldots, I_p$, where $I_i = [b_i, e_i]$, be a sequence of intervals such that for $i < j$, $e_i \le e_j$. Then a set of intervals $I'_1, \ldots, I'_s$ such that $I'_i \cap I'_j = \emptyset$ for any $i \ne j$ and $I'_1 \cup \cdots \cup I'_s = I_1 \cup \cdots \cup I_p$ can be computed in $O(\log n)$ time with $n/\log n$ processors.
Proof. First, we eliminate every interval $I_j$ such that $I_j$ is contained in an interval $I_{j'}$ with $j' > j$. To find these intervals we consider the sequence of the first endpoints of the intervals $I_1, \ldots, I_p$. For each element in this sequence we find the closest dominating (i.e., not larger than the given element) successor. If such a successor exists, then the interval is eliminated. This can be done in $O(\log n)$ time with $n/\log n$ processors [2]. In this way we obtain a sequence of intervals sorted with respect to both endpoints. Now, the intervals $I'_1, \ldots, I'_s$ can be computed using the list ranking technique. □
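The output described by Lemma 6 can be produced sequentially by a single stack-based sweep, sketched below. This is a plain linear-time substitute for the parallel procedure in the proof (nearest dominating successors followed by list ranking), and the function name is an assumption of the sketch.

```python
def disjoint_union(intervals):
    """Merge closed integer intervals, given sorted by right endpoint,
    into pairwise disjoint intervals with the same union."""
    merged = []
    for b, e in intervals:
        # Every interval on the stack ends at or before e, so it overlaps
        # [b, e] exactly when its right endpoint is at least b.
        while merged and merged[-1][1] >= b:
            b = min(b, merged[-1][0])
            merged.pop()
        merged.append((b, e))
    return merged


pieces = disjoint_union([(3, 3), (1, 4), (6, 7), (5, 8), (10, 10)])
```

With the disjoint pieces in hand, membership of a vertex of $W$ in the union can be decided by locating the piece whose right endpoint is the first one not smaller than that vertex.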
Thus we can conclude the paper with the following theorem.
Theorem 7. Given (a sorted compact representation of) a convex bipartite graph $G = (V, W, E)$ of size $n$ and the greedy matching $M$, one can compute a maximum independent set of vertices in $G$ in $O(\log n)$ time using $n/\log n$ processors of a CRCW PRAM.
Appendix A. Tree contraction in the absence of an Euler tour
In this Appendix, we show how to solve optimally the rooting problem in a forest. We are given a forest $F$ with nodes $\{1, \ldots, n\}$ defined by the parent relation (i.e., each node $v$ has a pointer $p(v)$ to its parent). A node $v$ is a root if $p(v) = v$. The rooting problem is to find for each node $v$ the root $r(v)$ of the tree it belongs to.
Given an Euler tour of each tree (or given for every node the list of its children), the rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors using standard tree-contraction algorithms [10,1]. However, this technique cannot deal with unbounded degree trees when no Euler tour is given. We show an approach that circumvents this assumption and design an $O(\log n)$-time algorithm for the rooting problem that employs $O(n)$ operations.
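For orientation, the rooting problem itself can be solved by plain pointer jumping, simulated round by round in the sketch below. This baseline needs only parent pointers and a logarithmic number of rounds, but it performs $O(n \log n)$ operations, so it is not the work-optimal algorithm of this Appendix; the function name and 0-based node numbering are assumptions of the sketch.

```python
def find_roots(parent):
    """Simulate synchronous pointer jumping on a forest of parent pointers.

    parent[u] is the parent of node u, and roots satisfy parent[u] == u.
    Each round replaces every pointer by the pointer of its target, so the
    distance to the root halves; the loop stops once nothing changes.
    """
    ptr = list(parent)
    while True:
        jumped = [ptr[ptr[u]] for u in range(len(ptr))]   # one parallel round
        if jumped == ptr:
            return ptr
        ptr = jumped


roots = find_roots([0, 0, 1, 2, 4, 4])
```

Each list comprehension corresponds to one PRAM step in which every node reads its grandparent pointer concurrently.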
Our algorithm consists of three phases. First, we reduce the problem of size $n$ to a problem of size $n/\log n$. Then we solve the smaller problem using the non-optimal algorithm of Miller and Reif [10], and finally we extend the information from the smaller problem to all other nodes in the forest.
A.1. The first phase
The first phase reduces the number of vertices in the forest to at most $n/\log n$. Let $P = n/\log n$, let $A$ be the array containing all the nodes of $F$, let $B$ be an empty array of length $n$, and for each node $v$ let $succ(v) = p(v)$.
We repeat the following process until the size of $A$ is smaller than $n/\log n$.
Finding all leaves and chains. Split the nodes in $A$ into three groups: (i) the leaves $L$, that is, all nodes $v \in A$ such that for no $u \in A$, $succ(u) = v$; (ii) the nodes on chains $C$, that is, the nodes $v \in A - L$ such that for exactly one $u \in A$, $succ(u) = v$; and (iii) the other nodes. One can easily verify which node belongs to which of these sets in $O(|A|/P)$ time with $P$ processors, and then, using the prefix sums algorithm of Cole and Vishkin [3], rearrange them to be stored in consecutive parts of $A$ in time $O(|A|/P + \log|A|/\log\log|A|)$ with $P$ processors.
Removing all leaves. If $v \in L$ then we set the pointer to the node which will find the root of $v$, $PT(v) = succ(v)$, and remove $v$ from the array $A$. All leaves are stored at the first free entries of the array $B$.
Halving the chains. $C$ is a collection of lists. Using the algorithm of Cole and Vishkin [3], find a maximal independent set $MIS$ in $C$ in $O(|C|/P + \log|C|/\log\log|C|)$ time with $P$ processors. Additionally we require that the last element from each list belongs to $MIS$.
• For each node $v \in C - MIS$, if $succ(v) \in MIS$, then $PT(v) = succ(v)$, and otherwise $PT(v) = succ(succ(v))$. Remove $v$ from $A$ and store all nodes from $C - MIS$ at the first free entries of $B$.
• For each node $v$ from $MIS$ that is not the last vertex on a chain, if $succ(succ(v)) \in MIS$ then set $succ(v) = succ(succ(v))$, and otherwise set $succ(v) = succ(succ(succ(v)))$.
Fact 8. Phase 1 can be performed in $O(\log n)$ time with $P = n/\log n$ processors.
Proof. Let $N_t$ denote the size of $A$ before iteration $t$. Standard arguments (see e.g. [10,1]) can be applied to show that $N_{t+1} \le c N_t$ for a constant $c < 1$. Hence $N_{t+1} \le n c^t$ and there are $O(\log\log n)$ iterations of the loop.
The running time of iteration $t$ is $O(N_t/P + \log N_t/\log\log N_t)$ with $P$ processors. Summing this together we get the running time bounded by
$\sum_{t=1}^{O(\log\log n)} O\left(\frac{N_t}{P} + \frac{\log N_t}{\log\log N_t}\right) \le \frac{1}{P} \sum_{t=1}^{O(\log\log n)} O(N_t) + \log\log n \cdot O\left(\frac{\log n}{\log\log n}\right) = O(\log n)$. □
A.2. The second phase
Now we perform a standard tree-contraction algorithm (e.g. [10]) for the forest defined by the relation $succ$ in the array $A$. It runs in $O(\log N)$ time and uses $N$ processors, where $N$ is the number of vertices in the forest. Since in our case $N = n/\log n$, this yields an $O(\log n)$-time, $n/\log n$-processor algorithm.
A.3. The third phase
In this step we have to combine the information computed for the nodes that were left after Phase 1 to obtain the pointers to the root for all the nodes in $F$. Observe that the nodes stored in $B$ are ordered with respect to the time when they were removed from $A$. This gives us a partition of $B$ into blocks of nodes that were removed at the same iteration. Since there are $O(\log\log n)$ blocks, we can analyze them successively, one by one, in the reverse order of the time when the nodes from the given block were removed. Then, using the information about the roots of all the nodes in the already analyzed part of $B$, we can compute $r(v)$ using the pointer $PT(v)$. If $B_i$ denotes the size of the $i$th block, then each block can be analyzed in constant time using $B_i$ processors, or in $O(B_i/P)$ time using $P$ processors. Summing this over all blocks we get the $O(\log n)$ running time of the third phase with $P = n/\log n$ processors.
This finally leads to the following theorem.
Theorem 9. The rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors on a CRCW PRAM.
References
[1] K. Abrahamson, N. Dadoun, D.G. Kirkpatrick and T. Przytycka, A simple parallel tree contraction algorithm, J. Algorithms 10 (1989) 287-302.
[2] O. Berkman, B. Schieber and U. Vishkin, Optimal doubly logarithmic parallel algorithms based on finding all nearest smaller values, J. Algorithms 14 (1993).
[3] R. Cole and U. Vishkin, Faster optimal parallel prefix sums and list ranking, Inform. and Comput. 81 (3) (1989) 334-352.
[4] E. Dekel and S. Sahni, A parallel matching algorithm for convex bipartite graphs and applications to scheduling, J. Parallel Distributed Comput. 1 (1984) 185-205.
[5] H.N. Gabow and R.E. Tarjan, A linear-time algorithm for a special case of disjoint set union, J. Comput. System Sci. 30 (1985) 209-221.
[6] F. Gavril, Testing for equality between maximum matching and minimum node covering, Inform. Process. Lett. 6 (1977) 199-202.
[7] F. Glover, Maximum matching in a convex bipartite graph, Naval Res. Logist. Quart. 14 (1967) 313-316.
[8] R.M. Karp and V. Ramachandran, A survey of parallel algorithms for shared-memory machines, in: J. van Leeuwen, ed., Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity (Elsevier, Amsterdam, 1990) Chapter 17, pp. 869-941.
[9] W. Lipski and F.P. Preparata, Efficient algorithms for finding maximum matchings in convex bipartite graphs and related problems, Acta Inform. 15 (1981) 329-346.
[10] G.L. Miller and J.H. Reif, Parallel tree contraction, in: Proc. 26th IEEE Symp. on Foundations of Computer Science (1985) 478-489.