SOME OBSERVATIONS ON THE RELATION SCHEMES
VU DUC THI
Abstract. In this paper, we introduce the new concept of the maximal family of a relation scheme. The time complexity of finding this family is also presented.
Tóm tắt (Summary). In this paper, we present the maximal family of a relation scheme.
The relational data model, which was introduced by E. F. Codd, is one of the most powerful database models. This paper gives some results about computational problems related to relation schemes. Let us give some necessary definitions and results that are used in the next section. The concepts given in this section can be found in [1,2,4,6,7,8].
Let $R = \{a_1, \ldots, a_n\}$ be a nonempty finite set of attributes. A functional dependency (FD) is a statement of the form $A \to B$, where $A, B \subseteq R$. The FD $A \to B$ holds in a relation $r = \{h_1, \ldots, h_m\}$ over $R$ if for all $h_i, h_j \in r$, $h_i(a) = h_j(a)$ for all $a \in A$ implies $h_i(b) = h_j(b)$ for all $b \in B$. We also say that $r$ satisfies the FD $A \to B$.
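The definition of an FD holding in a relation can be checked directly by comparing every pair of rows. The following is a minimal sketch; the representation of rows as Python dicts and the name `fd_holds` are assumptions made for illustration and do not come from the paper.

```python
from itertools import combinations

def fd_holds(relation, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in `relation`.

    `relation` is a list of rows; each row is a dict attribute -> value.
    """
    for h_i, h_j in combinations(relation, 2):
        agree_on_lhs = all(h_i[a] == h_j[a] for a in lhs)
        agree_on_rhs = all(h_i[b] == h_j[b] for b in rhs)
        if agree_on_lhs and not agree_on_rhs:
            return False
    return True

# Hypothetical three-row relation over R = {a, b, c}.
r = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 2, "c": 1},
]
print(fd_holds(r, {"a"}, {"b"}))
print(fd_holds(r, {"a"}, {"c"}))
```

In this toy relation the first two rows agree on `a` and on `b` but differ on `c`, so `a -> b` holds while `a -> c` does not.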
Let $F_r$ be the family of all FDs that hold in $r$. Then $F = F_r$ satisfies
(1) $A \to A \in F$,
(2) $(A \to B \in F,\ B \to C \in F) \Rightarrow (A \to C \in F)$,
(3) $(A \to B \in F,\ A \subseteq C,\ D \subseteq B) \Rightarrow (C \to D \in F)$,
(4) $(A \to B \in F,\ C \to D \in F) \Rightarrow (A \cup C \to B \cup D \in F)$.
A family of FDs satisfying (1) - (4) is called an f-family (sometimes it is called the full family) over $R$.
Clearly, $F_r$ is an f-family over $R$. It is known that if $F$ is an arbitrary f-family, then there is a relation $r$ over $R$ such that $F_r = F$.
Given a family $F$ of FDs, there exists a unique minimal f-family $F^+$ that contains $F$. It can be seen that $F^+$ contains all FDs which can be derived from $F$ by the rules (1) - (4).
A relation scheme $s$ is a pair $(R, F)$, where $R$ is a set of attributes and $F$ is a set of FDs over $R$.
Denote $A^+ = \{a : A \to \{a\} \in F^+\}$. $A^+$ is called the closure of $A$ over $s$. It is clear that $A \to B \in F^+$ iff $B \subseteq A^+$.
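The closure $A^+$ can be computed by the usual fixpoint iteration over the FDs, and membership of $A \to B$ in $F^+$ then reduces to the test $B \subseteq A^+$. The sketch below assumes FDs are given as pairs of Python sets; the names `closure` and `implies` are illustrative only.

```python
def closure(attrs, fds):
    """Compute A+ for A = attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)

def implies(fds, lhs, rhs):
    """Test whether lhs -> rhs is in F+, using the criterion rhs ⊆ lhs+."""
    return set(rhs) <= closure(lhs, fds)

# Hypothetical scheme over R = {a, b, c} with F = {a -> b, b -> c}.
F = [({"a"}, {"b"}), ({"b"}, {"c"})]
print(sorted(closure({"a"}, F)))
print(implies(F, {"a"}, {"c"}))
```

This naive loop is quadratic in the size of $F$ in the worst case; it is meant to mirror the definition, not to be the fastest known closure algorithm.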
Clearly, if $s = (R, F)$ is a relation scheme, then there is a relation $r$ over $R$ such that $F_r = F^+$ (see [1]). Such a relation is called an Armstrong relation of $s$.
Let $R$ be a nonempty finite set of attributes and $P(R)$ its power set. The mapping $H : P(R) \to P(R)$ is called a closure operation over $R$ if for $A, B \in P(R)$ the following conditions are satisfied:
(1) $A \subseteq H(A)$,
(2) $A \subseteq B$ implies $H(A) \subseteq H(B)$,
(3) $H(H(A)) = H(A)$.
Let $s = (R, F)$ be a relation scheme. Set $H_s(A) = \{a : A \to \{a\} \in F^+\}$. We can see that $H_s$ is a closure operation over $R$.
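For a small attribute set, the three closure-operation conditions can be verified exhaustively for $H_s$. The following sketch does this by brute force over all subsets of $R$; the helper names (`subsets`, `is_closure_operation`) and the example scheme are assumptions for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure A+ under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)

def subsets(R):
    """All subsets of R as frozensets."""
    items = sorted(R)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def is_closure_operation(H, R):
    """Check conditions (1)-(3) for H on every subset of R."""
    all_sets = subsets(R)
    extensive = all(A <= H(A) for A in all_sets)
    monotone = all(H(A) <= H(B) for A in all_sets for B in all_sets if A <= B)
    idempotent = all(H(H(A)) == H(A) for A in all_sets)
    return extensive and monotone and idempotent

R = {"a", "b", "c"}
F = [({"a"}, {"b"}), ({"b"}, {"c"})]
H_s = lambda A: closure(A, F)
print(is_closure_operation(H_s, R))
```

The check enumerates $2^{|R|}$ subsets, so it is only practical for very small $R$.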
Let $r$ be a relation and $s = (R, F)$ a relation scheme. Then $A$ is a key of $r$ (a key of $s$) if $A \to R \in F_r$ ($A \to R \in F^+$). $A$ is a minimal key of $r$ ($s$) if $A$ is a key of $r$ ($s$) and no proper subset of $A$ is a key of $r$ ($s$).
Denote by $K_r$ ($K_s$) the set of all minimal keys of $r$ ($s$).
Clearly, $K_r$ and $K_s$ are Sperner systems over $R$, i.e. $A, B \in K_r$ implies $A \not\subset B$.
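As an illustration of $K_s$, the minimal keys of a small scheme can be enumerated by testing candidate sets in order of increasing size and skipping any candidate that already contains a key found earlier. This is a brute-force sketch with exponential running time, not an algorithm taken from the paper; `minimal_keys` is a hypothetical name.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure A+ under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)

def minimal_keys(R, fds):
    """Return K_s: all minimal A with A+ = R, found smallest-first."""
    R = frozenset(R)
    keys = []
    for k in range(len(R) + 1):
        for cand in combinations(sorted(R), k):
            A = frozenset(cand)
            if any(key <= A for key in keys):
                continue  # A contains a smaller key, so it is not minimal
            if closure(A, fds) == R:
                keys.append(A)
    return set(keys)

# Hypothetical scheme: a -> bc and bc -> a over R = {a, b, c}.
F = [({"a"}, {"b", "c"}), ({"b", "c"}, {"a"})]
K_s = minimal_keys({"a", "b", "c"}, F)
print(sorted(sorted(k) for k in K_s))
```

Because candidates are visited by increasing size, every key that survives the containment check has no proper subset that is a key, which is exactly the Sperner property stated above.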
Let $K$ be a Sperner system over $R$. We define the set of antikeys of $K$, denoted by $K^{-1}$, as follows:
$$K^{-1} = \{A \subset R : (B \in K) \Rightarrow (B \not\subseteq A) \text{ and } (A \subset C) \Rightarrow (\exists B \in K)(B \subseteq C)\}.$$
It is easy to see that $K^{-1}$ is also a Sperner system over $R$.
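Read operationally, the antikeys are the maximal subsets of $R$ that contain no member of $K$. A minimal brute-force sketch under that reading follows; the function name `antikeys` and the example data are illustrative assumptions.

```python
from itertools import combinations

def antikeys(R, K):
    """Return K^{-1}: the maximal subsets of R containing no member of K."""
    items = sorted(R)
    K = [frozenset(B) for B in K]
    all_sets = [frozenset(c) for k in range(len(items) + 1)
                for c in combinations(items, k)]
    # Sets that contain no key at all.
    key_free = [A for A in all_sets if not any(B <= A for B in K)]
    # Keep only those with no key-free proper superset.
    return {A for A in key_free if not any(A < C for C in key_free)}

# Hypothetical Sperner system K = {{a}, {b, c}} over R = {a, b, c}.
K = [{"a"}, {"b", "c"}]
print(sorted(sorted(A) for A in antikeys({"a", "b", "c"}, K)))
```

A key-free set is maximal exactly when every proper superset contains some member of $K$, which matches the second condition in the definition.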
It is known [5] that if $K$ is an arbitrary Sperner system over $R$, then there is a relation scheme $s$ such that $K_s = K$.
In this paper we always assume that if a Sperner system plays the role of the set of minimal keys (antikeys), then this Sperner system is not empty (does not contain $R$). We consider the comparison of two attributes as an elementary step of algorithms. Thus, if we assume that subsets of $R$ are represented as sorted lists of attributes, then a Boolean operation on two subsets of $R$ requires at most $|R|$ elementary steps.
Let $L \subseteq P(R)$. $L$ is called a meet-irreducible family over $R$ (sometimes it is called a family of members which are not intersections of two other members) if for all $A, B, C \in L$, $A = B \cap C$ implies $A = B$ or $A = C$.
Let $I \subseteq P(R)$ with $R \in I$ and $A, B \in I \Rightarrow A \cap B \in I$. Then $I$ is called a meet-semilattice over $R$. Let $M \subseteq P(R)$. Denote $M^+ = \{\cap M' : M' \subseteq M\}$. We say that $M$ is a generator of $I$ if $M^+ = I$. Note that $R \in M^+$ but not in $M$, since by convention it is the intersection of the empty collection of sets. Denote $N = \{A \in I : A \ne \cap\{A' \in I : A \subset A'\}\}$.
In [5] it is proved that $N$ is the unique minimal generator of $I$.
It can be seen that $N$ is a family of members which are not intersections of two other members. Let $H$ be a closure operation over $R$. Denote $Z(H) = \{A : H(A) = A\}$ and $N(H) = \{A \in Z(H) : A \ne \cap\{A' \in Z(H) : A \subset A'\}\}$. $Z(H)$ is called the family of closed sets of $H$. We say that $N(H)$ is the minimal generator of $H$.
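The families $Z(H)$ and $N(H)$ can be computed literally from their definitions when $H$ is the closure operation of a small scheme. The sketch below assumes $H = H_s$ for a hypothetical scheme; the names `closed_sets` and `minimal_generator` are not from the paper.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure A+ under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)

def closed_sets(R, fds):
    """Z(H): the closures of all subsets of R."""
    items = sorted(R)
    return {closure(c, fds) for k in range(len(items) + 1)
            for c in combinations(items, k)}

def minimal_generator(Z, R):
    """N(H): closed sets that differ from the meet of their proper closed supersets."""
    R = frozenset(R)
    N = set()
    for A in Z:
        meet = R  # intersection of the empty collection is R by convention
        for B in Z:
            if A < B:
                meet &= B
        if meet != A:
            N.add(A)
    return N

R = {"a", "b", "c"}
F = [({"a"}, {"b"})]  # hypothetical scheme with the single FD a -> b
Z = closed_sets(R, F)
N = minimal_generator(Z, R)
print(sorted(sorted(A) for A in Z))
print(sorted(sorted(A) for A in N))
```

With the empty-intersection convention, $R$ itself is never placed in $N(H)$, in line with the note above.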
It is shown [5] that if $L$ is a meet-irreducible family then $L$ is the minimal generator of some closure operation over $R$. It is known [1] that there is a one-to-one correspondence between these families and f-families.
Let $r$ be a relation over $R$. Denote $E_r = \{E_{ij} : 1 \le i < j \le |r|\}$, where $E_{ij} = \{a \in R : h_i(a) = h_j(a)\}$. Then $E_r$ is called the equality set of $r$.
Let $T_r = \{A \in P(R) : \exists E_{ij} = A,\ \nexists E_{pq} : A \subset E_{pq}\}$. We say that $T_r$ is the maximal equality system of $r$.
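Both $E_r$ and $T_r$ are obtained from pairwise row comparisons, so they are cheap to compute for a given relation. The following sketch assumes rows are Python dicts; `equality_sets` and `maximal_equality_system` are hypothetical names.

```python
from itertools import combinations

def equality_sets(relation, R):
    """E_r: for each pair of rows, the set of attributes on which they agree."""
    return {frozenset(a for a in R if h_i[a] == h_j[a])
            for h_i, h_j in combinations(relation, 2)}

def maximal_equality_system(relation, R):
    """T_r: the members of E_r that are maximal under inclusion."""
    E = equality_sets(relation, R)
    return {A for A in E if not any(A < B for B in E)}

# Hypothetical three-row relation over R = {a, b, c}.
R = ["a", "b", "c"]
r = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 2, "c": 1},
]
print(sorted(sorted(A) for A in equality_sets(r, R)))
print(sorted(sorted(A) for A in maximal_equality_system(r, R)))
```

The number of row pairs is quadratic in $|r|$, so this computation is polynomial in the size of the relation.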
Let $r$ be a relation and $K$ a Sperner system over $R$. We say that $r$ represents $K$ if $K_r = K$.
The following theorem is known [7,10].
Theorem 1.1. Let $K$ be a non-empty Sperner system and $r$ a relation over $R$. Then $r$ represents $K$ iff $K^{-1} = T_r$, where $T_r$ is the maximal equality system of $r$.
Let $s = (R, F)$ be a relation scheme over $R$ and $K_s$ the set of all minimal keys of $s$. Denote by $K_s^{-1}$ the set of all antikeys of $s$.
From Theorem 1.1 we obtain the following corollary.
Corollary 1.2. Let $s = (R, F)$ be a relation scheme and $r$ a relation over $R$. We say that $r$ represents $s$ if $K_r = K_s$. Then $r$ represents $s$ iff $K_s^{-1} = T_r$, where $T_r$ is the maximal equality system of $r$.
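In [6] we proved the following theorem.
Theorem 1.3. Let $r = \{h_1, \ldots, h_m\}$ be a relation, and $F$ an f-family over $R$. Then $F_r = F$ iff for every $A \subseteq R$
$$H_F(A) = \begin{cases} \bigcap_{A \subseteq E_{ij}} E_{ij} & \text{if } \exists E_{ij} \in E_r : A \subseteq E_{ij}, \\ R & \text{otherwise,} \end{cases}$$
where $H_F(A) = \{a \in R : A \to \{a\} \in F\}$ and $E_r$ is the equality set of $r$.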
Theorem 1.4. [3] Let $K = \{K_1, \ldots, K_m\}$ be a Sperner system over $R$. Set $s = (R, F)$ with $F = \{K_1 \to R, \ldots, K_m \to R\}$. Then $K_s = K$.
2 MAXIMAL FAMILY OF A RELATION SCHEME
In this section we introduce the new concept of the maximal family of a relation scheme. We show that the time complexity of finding a maximal family of a given relation scheme is exponential in the number of attributes.
Now we prove that the time complexity of finding the set of antikeys of a relation scheme is exponential in the number of attributes. We show that finding a maximal family of a relation scheme can be polynomially transformed to this problem.
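Definition 2.1. Let $s = (R, F)$ be a relation scheme. Set $H_s(A) = A^+$ for all $A \subseteq R$. Put
$$Z(s) = \{A \subseteq R : H_s(A) = A\}.$$
Set $M(s) = \{(A, \{a\}) : a \notin A,\ A \in Z(s), \text{ and } B \in Z(s),\ a \notin B,\ A \subseteq B \text{ imply } A = B\}$. Then we say that $M(s)$ is a maximal family of $s$.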
For each $a \in R$, put $T_a = \{(A, \{a\}) \in M(s)\}$ and $L(T_a) = \{A : (A, \{a\}) \in T_a\}$.
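In words, $M(s)$ pairs each attribute $a$ with the maximal closed sets that do not contain $a$. The sketch below computes $M(s)$, $T_a$ and $L(T_a)$ by brute force for a hypothetical scheme; a pair $(A, \{a\})$ is stored as the tuple `(A, a)`, and all function names are assumptions for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure A+ under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)

def maximal_family(R, fds):
    """M(s) as a set of tuples (A, a): A is a maximal closed set avoiding a."""
    items = sorted(R)
    Z = {closure(c, fds) for k in range(len(items) + 1)
         for c in combinations(items, k)}
    M = set()
    for a in items:
        avoiding = [A for A in Z if a not in A]
        for A in avoiding:
            if not any(A < B for B in avoiding):
                M.add((A, a))
    return M

def L_of_T(M, a):
    """L(T_a): the first components of the pairs in M(s) whose attribute is a."""
    return {A for (A, b) in M if b == a}

R = {"a", "b", "c"}
F = [({"a"}, {"b"})]  # hypothetical scheme with the single FD a -> b
M = maximal_family(R, F)
for a in sorted(R):
    print(a, sorted(sorted(A) for A in L_of_T(M, a)))
```

The enumeration of $Z(s)$ visits all $2^{|R|}$ subsets, so this sketch is exponential in the number of attributes.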
Let $s = (R, F)$ be a relation scheme over $R$. From $s$ we construct $Z(s)$ and compute the minimal generator $N_s$ of $Z(s)$. We put
$$T_s = \{A \in N_s : \nexists B \in N_s : A \subset B\}.$$
It is known [1] that for a given relation scheme $s$ there is a relation $r$ such that $r$ is an Armstrong relation of $s$. On the other hand, by Corollary 1.2 and Theorem 1.3 the following proposition is clear.
Proposition 2.2. Let $s = (R, F)$ be a relation scheme over $R$. Then
$$K_s^{-1} = T_s.$$
It is shown [7] that the problem of finding all antikeys of a relation is solved by a polynomial time algorithm. For a relation scheme we have the following theorem.
Theorem 2.3. The time complexity of finding the set of all antikeys of a given relation scheme is exponential in the number of attributes.
Proof. We have to prove that:
(1) There is an algorithm which finds the set of all antikeys of a given relation scheme in exponential time in the number of attributes.
(2) There exists a relation scheme $s = (R, F)$ such that the number of elements of $K_s^{-1}$ is exponential in the number of attributes (in our example $|K_s^{-1}|$ is exponential not only in the number of attributes, but also in the number of elements of $F$).
For (1), we construct the following algorithm.
Let $s = (R, F)$ be a relation scheme over $R$.
Step 1: For every $A \subseteq R$ compute $A^+$, and set $Z(s) = \{A^+ : A \subseteq R\}$.
Step 2: Construct the minimal generator $N_s$ of $Z(s)$.
Step 3: Compute the set $T_s$ from $N_s$.
According to Proposition 2.2 we have $T_s = K_s^{-1}$.
Clearly, the time complexity of this algorithm is exponential in $|R|$.
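The three steps above can be sketched as follows. This is a direct brute-force transcription under the assumption that $T_s$ consists of the inclusion-maximal members of $N_s$; the function name `antikeys_of_scheme` and the example scheme are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure A+ under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)

def antikeys_of_scheme(R, fds):
    R = frozenset(R)
    items = sorted(R)
    # Step 1: Z(s) = {A+ : A subset of R}, enumerating all 2^|R| subsets.
    Z = {closure(c, fds) for k in range(len(items) + 1)
         for c in combinations(items, k)}
    # Step 2: minimal generator N_s of Z(s).
    N = set()
    for A in Z:
        meet = R
        for B in Z:
            if A < B:
                meet &= B
        if meet != A:
            N.add(A)
    # Step 3: T_s = maximal members of N_s (assumed reading of T_s).
    return {A for A in N if not any(A < B for B in N)}

# Hypothetical scheme: a -> bc and bc -> a, with minimal keys {a} and {b, c}.
F = [({"a"}, {"b", "c"}), ({"b", "c"}, {"a"})]
print(sorted(sorted(A) for A in antikeys_of_scheme({"a", "b", "c"}, F)))
```

Step 1 alone touches every subset of $R$, which is where the exponential bound in $|R|$ comes from.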
As to (2): Let us take a partition $R = X_1 \cup \cdots \cup X_m \cup W$, where $m = [n/3]$, $|R| = n$ and $|X_i| = 3$ $(1 \le i \le m)$.
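Set
$K = \{B : |B| = 2,\ B \subseteq X_i \text{ for some } i\}$ if $|W| = 0$,
$K = \{B : |B| = 2,\ B \subseteq X_i \text{ for some } i : 1 \le i \le m-1, \text{ or } B \subseteq X_m \cup W\}$ if $|W| = 1$,
$K = \{B : |B| = 2,\ B \subseteq X_i \text{ for some } i : 1 \le i \le m, \text{ or } B = W\}$ if $|W| = 2$.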
It is easy to see that
$K^{-1} = \{A : |A \cap X_i| = 1,\ \forall i\}$ if $|W| = 0$,
$K^{-1} = \{A : |A \cap X_i| = 1\ (1 \le i \le m-1) \text{ and } |A \cap (X_m \cup W)| = 1\}$ if $|W| = 1$,
$K^{-1} = \{A : |A \cap X_i| = 1\ (1 \le i \le m) \text{ and } |A \cap W| = 1\}$ if $|W| = 2$.
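Let $f : N \to N$ ($N$ is the set of natural numbers) be the function defined as follows:
$$f(n) = \begin{cases} 3^{n/3} & \text{if } n \equiv 0 \pmod 3, \\ 4 \cdot 3^{[n/3]-1} & \text{if } n \equiv 1 \pmod 3, \\ 2 \cdot 3^{[n/3]} & \text{if } n \equiv 2 \pmod 3. \end{cases}$$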
It can be seen that $f(n) = |K^{-1}|$ and $3^{[n/4]} < f(n)$.
It is clear that $n - 1 \le |K| \le n + 2$ and $3^{[n/4]} < |K^{-1}|$.
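The counts claimed for this construction can be checked numerically for small $n$. The sketch below builds $K$ for attributes numbered $0, \ldots, n-1$, computes $K^{-1}$ by brute force, and compares its size with $f(n)$. It assumes $[x]$ denotes the floor of $x$ and $n \ge 3$; all function names are hypothetical.

```python
from itertools import combinations

def build_sperner_system(n):
    """Build (R, K) from the construction above for n >= 3 attributes 0..n-1."""
    R = list(range(n))
    m = n // 3                      # assumption: [n/3] is read as floor(n/3)
    X = [R[3 * i:3 * i + 3] for i in range(m)]
    W = R[3 * m:]
    if len(W) == 1:
        groups = X[:-1] + [X[-1] + W]   # the last block is merged with W
    elif len(W) == 2:
        groups = X + [W]                # W itself is the extra 2-element key
    else:
        groups = X
    K = {frozenset(pair) for g in groups for pair in combinations(g, 2)}
    return R, K

def antikeys(R, K):
    """Maximal subsets of R that contain no member of K (brute force)."""
    all_sets = [frozenset(c) for k in range(len(R) + 1)
                for c in combinations(R, k)]
    key_free = [A for A in all_sets if not any(B <= A for B in K)]
    return {A for A in key_free if not any(A < C for C in key_free)}

def f(n):
    m, rem = divmod(n, 3)
    if rem == 0:
        return 3 ** m
    if rem == 1:
        return 4 * 3 ** (m - 1)
    return 2 * 3 ** m

for n in range(3, 9):
    R, K = build_sperner_system(n)
    print(n, len(K), len(antikeys(R, K)), f(n))
```

Each printed row lists $n$, $|K|$, the brute-force value of $|K^{-1}|$ and $f(n)$, so the last two columns can be compared directly.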
Thus, if we denote the elements of $K$ by $K_1, \ldots, K_t$, then we set $s = (R, F)$, where $F = \{K_1 \to R, \ldots, K_t \to R\}$. By Theorem 1.4, $K^{-1}$ is the set of all antikeys of $s$. Consequently, for an arbitrary set of attributes we can always construct a relation scheme $s = (R, F)$ such that $|F| \le |R| + 2$, but the number of antikeys of $s$ is exponential not only in the number of attributes, but also in the number of elements of $F$. The theorem is proved.
According to Proposition 2.2 we show that finding a maximal family $M(s)$ can be polynomially transformed to the problem of finding all antikeys of a given relation scheme.
Algorithm 2.4.
Input: a relation scheme $s = (R, F)$.
Output: $K_s^{-1}$.
Step 1: For each $a \in R$ we construct $T_a$.
Step 2: Set
$$N_s = \bigcup_{a \in R} L(T_a).$$
Step 3: Put
$$K_s^{-1} = \{A \in N_s : \nexists B \in N_s : A \subset B\}.$$
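Assuming $M(s)$ is already available, Algorithm 2.4 amounts to grouping, a union, and a maximality filter. In the sketch below a pair $(A, \{a\})$ is stored as the tuple `(A, a)`; the function name and the literal example family are assumptions for illustration.

```python
def antikeys_from_maximal_family(R, M):
    """Sketch of Algorithm 2.4: derive the antikeys from a given M(s)."""
    # Step 1: T_a = pairs of M(s) whose second component is a.
    T = {a: {(A, b) for (A, b) in M if b == a} for a in R}
    # Step 2: N_s = union of L(T_a) over all attributes a.
    N = set()
    for a in R:
        N |= {A for (A, _) in T[a]}
    # Step 3: keep the inclusion-maximal members of N_s.
    return {A for A in N if not any(A < B for B in N)}

# Hypothetical M(s) for the scheme with R = {a, b, c} and the single FD a -> b.
R = {"a", "b", "c"}
M = {
    (frozenset({"b", "c"}), "a"),
    (frozenset({"c"}), "b"),
    (frozenset({"a", "b"}), "c"),
}
print(sorted(sorted(A) for A in antikeys_from_maximal_family(R, M)))
```

For this example the only minimal key is $\{a, c\}$, and the two sets returned are exactly the maximal subsets of $R$ that do not contain it.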
Clearly, steps 2 and 3 of this algorithm require polynomial time in the number of attributes.
On the other hand, according to Theorem 2.3 we have the following.
Corollary 2.5. Let $s = (R, F)$ be a relation scheme. Then the time complexity of finding the family $M(s)$ is exponential in the number of attributes.
[1] Armstrong W. W., Dependency structures of database relationships, Information Processing, North-Holland Publ. Co., 74 (1974) 580-583
[2] Beeri C., Bernstein P. A., Computational problems related to the design of normal form relational schemas, ACM Trans. on Database Syst. 4 (1) (1979) 30-5
[3] Beeri C., Dowd M., Fagin R., Statman R., On the structure of Armstrong relations for functional dependencies, J. ACM 31 (1) (1984) 3 -46
[4] Bekessy A., Demetrovics J., Contribution to the theory of database relations, Discrete Math. 27 (19 9) 1-10
[5] Demetrovics J., Logical and structural investigation of relational datamodel, MTA-SZTAKI Tanulmanyok, Budapest 114 (1980) 1-97 (Hungarian)
[6] Demetrovics J., Thi V. D., Some results about functional dependencies, Acta Cybernetica 8 (3) (1988) 273-278
[7] Demetrovics J., Thi V. D., Relations and minimal keys, Acta Cybernetica 8 (3) (1988) 279-285
[8] Demetrovics J., Thi V. D., On keys in the relational datamodel, Inform. Process. Cybern. EIK 24 (10) (1988) 515-5 9
[9] Demetrovics J., Thi V. D., On algorithm for generating Armstrong relations and inferring functional dependencies in the relational datamodel, Computers and Mathematics with Applications 26 (4) (19 3) 43-55
[10] Demetrovics J., Thi V. D., Armstrong relation, functional dependencies and strong dependencies, Comput. and AI 14 (3) (1995) 279-2 8
[11] Mannila H., Raiha K. J., Design by example: an application of Armstrong relations, J. Comput. Syst. Scien. 33 (1986) 126-141
[12] Osborn S. L., Testing for existence of a covering Boyce-Codd normal form, Infor. Proc. Lett. 8 (1) (1979) 11-14
[13] Thi V. D., Investigations on combinatorial characterizations related to functional dependencies in the relational datamodel, MTA-SZTAKI Tanulmanyok, Budapest 191 (19 6) 1-1 7, Ph.D. Dissertation (Hungarian)
[14] Thi V. D., Minimal keys and antikeys, Acta Cybernetica 7 (4) (198 ) 361-371
[15] Thi V. D., On the antikeys in the relational datamodel, Alkalmazott Matematikai Lapok 12 (1986) 111-124 (Hungarian)
[16] Thi V. D., Logical dependencies and irredundant relations, Computer and Artificial Intelligence 7 (2) (1988) 16 -184
[17] Thi V. D., Demetrovics J., Some results about normal forms for functional dependency in the relational datamodel, J. Discrete Applied Mathematics, North Holland 69 (1996) 6 -7
[18] Thi V. D., Demetrovics J., Describing Candidate Keys by hypergraphs, J. Computer and Artificial Intelligence 18 (2) (1999) 19 -207
[19] Thi V. D., Demetrovics J., Some computational problems related to Boyce-Codd normal form, Annales Univer. Sci. Budapest, Sect. Comp. No. 19 (2000) 119-132
[20] Yu C. T., Johnson D. T., On the complexity of finding the set of candidate keys for a given set of functional dependencies, IPL 5 (4) (1976) 100-101
Received May 16, 2000
Institute of Information Technology