An Introduction to Database Systems, 8th Edition - C. J. Date - Solutions Manual


A little more science! The Principle of Orthogonal Design: Let A and B be any two base relvars* in the database. Then there must not exist nonloss decompositions of A and B into A1, ..., Am and B1, ..., Bn (respectively) such that some projection Ai in the set A1, ..., Am and some projection Bj in the set B1, ..., Bn have overlapping meanings. (This version of the principle subsumes the simpler version, because one nonloss decomposition that always exists for relvar R is the identity projection of R, i.e., the projection of R over all of its attributes.)

──────────

* Recall that, from the user's point of view, all relvars are base ones (apart from views defined as mere shorthands); i.e., the principle applies to the design of all "expressible" databases, not just to the "real" database (The Principle of Database Relativity at work once again). Of course, analogous remarks apply to the principles of normalization also.

──────────

It's predicates, not names, that represent data semantics.

Mention "orthogonal decomposition" (this will be relevant when we get to distributed databases in Chapter 21).

Violating The Principle of Orthogonal Design in fact violates The Information Principle! The principle is just formalized common sense, of course (like the principles of further normalization). Remind students of the relevance of the principle to updating union, intersection, and difference views (Chapter 10).

13.7 Other Normal Forms

You're welcome to skip this section. If you do cover it, note that there's some confusion in the literature over exactly what DK/NF is (see, e.g., "The Road to Normalization," by Douglas W. Hubbard and Joe Celko, DBMS, April 1994). Note: After I first wrote these notes, the topic of DK/NF came up on the website www.dbdebunk.com. I've attached my response to that question as an appendix to this chapter of the manual.

References and Bibliography


Reference [13.15] is a classic and should be distributed to students if at all possible.

The annotation to reference [13.14] says this: "The two embedded MVDs [in relvar CTXD] would have to be stated as additional, explicit constraints on the relvar. The details are left as an exercise." Answer:

CONSTRAINT EMVD_ON_CTXD

CTXD { COURSE, TEACHER, TEXT } =

CTXD { COURSE, TEACHER } JOIN CTXD { COURSE, TEXT } ;

Note that this constraint is much harder to state in SQL, because SQL doesn't support relational comparisons! Here it is in SQL:

   CREATE ASSERTION EMVD_ON_CTXD CHECK
      ( NOT EXISTS ( SELECT DISTINCT COURSE, TEACHER, TEXT
                     FROM   CTXD AS CTXD1
                     WHERE  NOT EXISTS
                          ( SELECT DISTINCT COURSE, TEACHER, TEXT
                            FROM ( ( SELECT DISTINCT COURSE, TEACHER
                                     FROM   CTXD ) AS POINTLESS1
                                     NATURAL JOIN
                                   ( SELECT DISTINCT COURSE, TEXT
                                     FROM   CTXD ) AS POINTLESS2 ) AS CTXD2
                            WHERE  CTXD1.COURSE  = CTXD2.COURSE
                            AND    CTXD1.TEACHER = CTXD2.TEACHER
                            AND    CTXD1.TEXT    = CTXD2.TEXT ) )
        AND
        NOT EXISTS ( SELECT DISTINCT COURSE, TEACHER, TEXT
                     FROM ( ( SELECT DISTINCT COURSE, TEACHER
                              FROM   CTXD ) AS POINTLESS1
                              NATURAL JOIN
                            ( SELECT DISTINCT COURSE, TEXT
                              FROM   CTXD ) AS POINTLESS2 ) AS CTXD2
                     WHERE  NOT EXISTS
                          ( SELECT DISTINCT COURSE, TEACHER, TEXT
                            FROM   CTXD AS CTXD1
                            WHERE  CTXD1.COURSE  = CTXD2.COURSE
                            AND    CTXD1.TEACHER = CTXD2.TEACHER
                            AND    CTXD1.TEXT    = CTXD2.TEXT ) ) ) ;

The first NOT EXISTS says that every tuple of the projection appears in the join; the second says that every tuple of the join appears in the projection. Together they simulate the relational equality comparison. You might want to discuss this SQL formulation in detail.
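For classroom use, the embedded MVD on CTXD can also be checked against sample data. The following Python sketch is only an illustration, not part of the book's answer: it models a relation as a set of tuples, and it assumes that the fourth attribute of CTXD is called DAYS and that the sample rows are made up. The helper names (relation, project, natural_join, emvd_on_ctxd) are hypothetical.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}

def project(rel, attrs):
    """Projection of rel over attrs (duplicates vanish because the result is a set)."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def natural_join(r1, r2):
    """Natural join: combine every pair of tuples that agree on their common attributes."""
    result = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                result.add(frozenset({**d1, **d2}.items()))
    return result

def emvd_on_ctxd(ctxd):
    """CTXD{COURSE,TEACHER,TEXT} = CTXD{COURSE,TEACHER} JOIN CTXD{COURSE,TEXT}."""
    lhs = project(ctxd, {"COURSE", "TEACHER", "TEXT"})
    rhs = natural_join(project(ctxd, {"COURSE", "TEACHER"}),
                       project(ctxd, {"COURSE", "TEXT"}))
    return lhs == rhs

# Hypothetical sample data; the attribute name DAYS is an assumption.
good_ctxd = relation(
    {"COURSE": "Physics", "TEACHER": "Green", "TEXT": "Basic Mechanics", "DAYS": 5},
    {"COURSE": "Physics", "TEACHER": "Green", "TEXT": "Optics", "DAYS": 5},
    {"COURSE": "Physics", "TEACHER": "Brown", "TEXT": "Basic Mechanics", "DAYS": 6},
    {"COURSE": "Physics", "TEACHER": "Brown", "TEXT": "Optics", "DAYS": 5},
)
# Dropping the (Brown, Optics) tuple breaks the embedded MVD.
bad_ctxd = {t for t in good_ctxd
            if not (dict(t)["TEACHER"] == "Brown" and dict(t)["TEXT"] == "Optics")}

print(emvd_on_ctxd(good_ctxd))  # True
print(emvd_on_ctxd(bad_ctxd))   # False
```

The single equality test in emvd_on_ctxd corresponds to the two NOT EXISTS halves of the SQL assertion, which is one way to show students what the lack of relational comparison costs.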

Answers to Exercises

13.1 Here first is the MVD for relvar CTX (algebraic version):

   CONSTRAINT CTX_MVD CTX = CTX { COURSE, TEACHER } JOIN
                            CTX { COURSE, TEXT } ;

Calculus version:

   CONSTRAINT CTX_MVD CTX =
      { CTXX.COURSE, CTXX.TEACHER, CTXY.TEXT }
        WHERE CTXX.COURSE = CTXY.COURSE ;

CTXX and CTXY are range variables ranging over CTX.

Second, here is the JD for relvar SPJ (algebraic version):

   CONSTRAINT SPJ_JD SPJ = SPJ { S#, P# } JOIN
                           SPJ { P#, J# } JOIN
                           SPJ { J#, S# } ;

Calculus version:

   CONSTRAINT SPJ_JD SPJ =
      { SPJX.S#, SPJY.P#, SPJZ.J# }
        WHERE SPJX.P# = SPJY.P#
        AND   SPJY.J# = SPJZ.J#
        AND   SPJZ.S# = SPJX.S# ;

SPJX, SPJY, and SPJZ are range variables ranging over SPJ.
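The calculus versions translate almost word for word into set comprehensions, with one loop variable per range variable. The Python sketch below is an illustration only; the namedtuple field names and the sample tuples are assumptions made for the example.

```python
from collections import namedtuple

CTX = namedtuple("CTX", ["course", "teacher", "text"])
SPJ = namedtuple("SPJ", ["s", "p", "j"])

def ctx_mvd_holds(ctx):
    """CTX = { CTXX.COURSE, CTXX.TEACHER, CTXY.TEXT } WHERE CTXX.COURSE = CTXY.COURSE."""
    derived = {(x.course, x.teacher, y.text)
               for x in ctx for y in ctx
               if x.course == y.course}
    return set(ctx) == derived

def spj_jd_holds(spj):
    """SPJ = { SPJX.S#, SPJY.P#, SPJZ.J# } WHERE the three range variables agree pairwise."""
    derived = {(x.s, y.p, z.j)
               for x in spj for y in spj for z in spj
               if x.p == y.p and y.j == z.j and z.s == x.s}
    return set(spj) == derived

# Hypothetical sample values.
ctx_ok = {CTX("Physics", "Green", "Optics"), CTX("Physics", "Brown", "Optics")}
ctx_bad = {CTX("Physics", "Green", "Optics"), CTX("Physics", "Brown", "Mechanics")}

spj_bad = {SPJ("S1", "P1", "J2"), SPJ("S1", "P2", "J1"), SPJ("S2", "P1", "J1")}
spj_ok = spj_bad | {SPJ("S1", "P1", "J1")}

print(ctx_mvd_holds(ctx_ok), ctx_mvd_holds(ctx_bad))  # True False
print(spj_jd_holds(spj_ok), spj_jd_holds(spj_bad))    # True False
```

In spj_bad the three binary projections jointly imply the tuple (S1, P1, J1), so the constraint fails until that tuple is added.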

13.2 Note first that R contains every a value paired with every b value, and further that the set of all a values in R, S say, is the same as the set of all b values in R. Loosely speaking, therefore, the body of R is equal to the Cartesian product of set S with itself; more precisely, R is equal to the Cartesian product of its projections R{A} and R{B}. R thus satisfies the following MVDs (which are not trivial, please note, since they're certainly not satisfied by all binary relvars):

   { } →→ A | B

Equivalently, R satisfies the JD *{A,B} (remember that join degenerates to Cartesian product when there are no common attributes). It follows that R isn't in 4NF, and it can be nonloss-decomposed into its projections on A and B.* R is, however, in BCNF (it's all key), and it satisfies no nontrivial FDs.

──────────

* Those projections will have identical bodies, of course. For that reason, it might be better to define just one of them as a base relvar, and define R as a view over that base relvar (the Cartesian product of that base relvar with itself, loosely speaking).

──────────

Note: R also satisfies the MVDs

   A →→ B | { }

and

   B →→ A | { }

However, these MVDs are trivial, since they're satisfied by every binary relvar R with attributes A and B.
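A small check makes the nontrivial MVD { } →→ A | B concrete: it holds exactly when the body of R equals the Cartesian product of its two projections. The following Python sketch is an illustration under the assumption that R is modeled as a set of (a, b) pairs; the function name and sample values are hypothetical.

```python
from itertools import product

def is_product_of_projections(r):
    """True if and only if r equals the Cartesian product of its projections on A and B."""
    a_values = {a for a, _ in r}
    b_values = {b for _, b in r}
    return set(r) == set(product(a_values, b_values))

s = {1, 2, 3}
r = set(product(s, s))        # every a value paired with every b value
diagonal = {(1, 1), (2, 2)}   # a binary relation that does not satisfy the MVD

print(is_product_of_projections(r))         # True
print(is_product_of_projections(diagonal))  # False
```

The second relation shows why the MVD is not trivial: it is a perfectly good binary relation that fails the test.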

13.3 First we introduce three relvars

   REP     { REP#, ... }
           KEY { REP# }

   AREA    { AREA#, ... }
           KEY { AREA# }

   PRODUCT { PROD#, ... }
           KEY { PROD# }

with the obvious interpretation. Second, we can represent the relationship between sales representatives and sales areas by a relvar

   RA { REP#, AREA# }
      KEY { REP#, AREA# }

and the relationship between sales representatives and products by a relvar

   RP { REP#, PROD# }
      KEY { REP#, PROD# }

(both of these relationships are many-to-many).

Next, we're told that every product is sold in every area. So if we introduce a relvar

   AP { AREA#, PROD# }
      KEY { AREA#, PROD# }

to represent the relationship between areas and products, then we have the constraint (let's call it C) that

   AP = AREA { AREA# } JOIN PRODUCT { PROD# }

Notice that constraint C implies that relvar AP isn't in 4NF (see Exercise 13.2). In fact, relvar AP doesn't give us any information that can't be obtained from the other relvars; to be precise, we have

   AP { AREA# } = AREA { AREA# }

and

   AP { PROD# } = PRODUCT { PROD# }

But let's assume for the moment that relvar AP is included in our design anyway.

No two representatives sell the same product in the same area. In other words, given an {AREA#,PROD#} combination, there's exactly one responsible sales representative (REP#), so we can introduce a relvar

   APR { AREA#, PROD#, REP# }

in which (to make the FD explicit)

   { AREA#, PROD# } → REP#

(of course, specification of the combination {AREA#,PROD#} as a key is sufficient to express this FD). Now, however, relvars RA, RP, and AP are all redundant, since they're all projections of APR; they can therefore all be dropped. In place of constraint C, we now need constraint C1:

   APR { AREA#, PROD# } = AREA { AREA# } JOIN PRODUCT { PROD# }

This constraint must be stated separately and explicitly (it isn't "implied by keys").

Also, since every representative sells all of that representative's products in all of that representative's areas, we have the additional constraint C2 on relvar APR:

   REP# →→ AREA# | PROD#

(a nontrivial MVD; relvar APR isn't in 4NF). Again the constraint must be stated separately and explicitly.


Thus the final design consists of the relvars REP, AREA, PRODUCT, and APR, together with the constraints C1 and C2:

   CONSTRAINT C1 APR { AREA#, PROD# } =
                 AREA { AREA# } JOIN PRODUCT { PROD# } ;

   CONSTRAINT C2 APR =
                 APR { REP#, AREA# } JOIN APR { REP#, PROD# } ;

This exercise illustrates very clearly the point that, in general, the normalization discipline is adequate to represent some semantic aspects of a given problem (basically, dependencies that are implied by keys, where by "dependencies" we mean FDs, MVDs, or JDs), but explicit statement of additional dependencies might also be needed for other aspects, and some aspects can't be represented in terms of such dependencies at all. It also illustrates the point (once again) that it isn't always desirable to normalize "all the way" (relvar APR is in BCNF but not in 4NF).

Note: As a subsidiary exercise, you might like to consider whether a design involving RVAs might be appropriate for the problem under consideration. Might such a design mean that some of the comments in the previous paragraph no longer apply?
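To let students experiment with the final design, the key constraint and the two explicit constraints C1 and C2 on APR can be tested against sample data. The Python sketch below is only an illustration; the namedtuple field names, the function names, and the sample areas, products, and representatives are assumptions.

```python
from collections import namedtuple
from itertools import product

APR = namedtuple("APR", ["area", "prod", "rep"])

def key_holds(apr):
    """KEY { AREA#, PROD# }: no two tuples share the same (area, product) pair."""
    pairs = [(t.area, t.prod) for t in apr]
    return len(pairs) == len(set(pairs))

def c1_holds(apr, areas, products):
    """C1: APR { AREA#, PROD# } = AREA { AREA# } JOIN PRODUCT { PROD# }."""
    return {(t.area, t.prod) for t in apr} == set(product(areas, products))

def c2_holds(apr):
    """C2: APR = APR { REP#, AREA# } JOIN APR { REP#, PROD# }."""
    derived = {(x.area, y.prod, x.rep)
               for x in apr for y in apr
               if x.rep == y.rep}
    return set(apr) == derived

# Hypothetical sample data.
areas = {"A1", "A2"}
products = {"P1", "P2"}
good = {APR("A1", "P1", "R1"), APR("A1", "P2", "R1"),
        APR("A2", "P1", "R2"), APR("A2", "P2", "R2")}
# R1 now sells P2 in A2 but not P1 in A2, which violates C2 only.
bad = {APR("A1", "P1", "R1"), APR("A1", "P2", "R1"),
       APR("A2", "P1", "R2"), APR("A2", "P2", "R1")}

print(key_holds(good), c1_holds(good, areas, products), c2_holds(good))  # True True True
print(key_holds(bad), c1_holds(bad, areas, products), c2_holds(bad))     # True True False
```

The second data set satisfies the key constraint and C1 but not C2, which illustrates why C2 has to be stated explicitly.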

13.4 The revision is straightforward: all that's necessary is to replace the references to FDs and BCNF by analogous references to MVDs and 4NF, thus:

   1. Initialize D to contain just R.

   2. For each non4NF relvar T in D, execute Steps 3 and 4.

   3. Let X →→ Y be an MVD for T that violates the requirements for 4NF.

   4. Replace T in D by two of its projections, that over X and Y and that over all attributes except those in Y.
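The four steps above can be sketched at the level of headings (sets of attribute names). The Python code below is a simplified illustration, not a full implementation: it assumes the caller supplies the MVDs that hold in R, and it assumes that the determinant X of each supplied MVD is not a superkey of any relvar to which the MVD applies nontrivially, so that every nontrivial supplied MVD counts as a 4NF violation. The function names are hypothetical.

```python
def find_violation(t, mvds):
    """Step 3: return an MVD (X, Y) that applies nontrivially to heading t, or None.

    Assumption: X is never a superkey of t, so nontrivial means violating.
    """
    for x, y in mvds:
        if x <= t:
            y_in_t = (y & t) - x            # the part of Y that survives in t
            if y_in_t and (t - x - y_in_t): # nontrivial: both sides nonempty
                return x, y_in_t
    return None

def decompose_4nf(attrs, mvds):
    """Decompose heading attrs into a list of headings with no remaining violations."""
    mvds = [(frozenset(x), frozenset(y)) for x, y in mvds]
    d = [frozenset(attrs)]                  # Step 1
    while True:
        for t in d:                         # Step 2
            violation = find_violation(t, mvds)
            if violation is not None:
                x, y = violation
                d.remove(t)                 # Step 4: replace t by two projections
                d.extend([x | y, t - y])
                break
        else:
            return d                        # no relvar in d has a violation left

ctx = decompose_4nf({"COURSE", "TEACHER", "TEXT"},
                    [({"COURSE"}, {"TEACHER"})])
print(sorted(sorted(h) for h in ctx))
# [['COURSE', 'TEACHER'], ['COURSE', 'TEXT']]
```

Each replacement produces two strictly smaller headings, so the loop terminates. A real design tool would also need the keys of each projection in order to decide whether an MVD genuinely violates 4NF; that test is deliberately left out of this sketch.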

13.5 This is a "cyclic constraint" example. The following design is suitable:

   REP     { REP#, ... }
           KEY { REP# }

   AREA    { AREA#, ... }
           KEY { AREA# }

   PRODUCT { PROD#, ... }
           KEY { PROD# }

   RA { REP#, AREA# }
      KEY { REP#, AREA# }

   AP { AREA#, PROD# }
      KEY { AREA#, PROD# }

   PR { PROD#, REP# }
      KEY { PROD#, REP# }

Also, the user needs to be informed that the join of RA, AP, and PR does not involve any "connection trap":

   CONSTRAINT NO_TRAP
      ( RA JOIN AP JOIN PR ) { REP#, AREA# }  = RA AND
      ( RA JOIN AP JOIN PR ) { AREA#, PROD# } = AP AND
      ( RA JOIN AP JOIN PR ) { PROD#, REP# }  = PR ;

Note: As with Exercise 13.3, you might like to consider whether a design involving RVAs might be appropriate for the problem under consideration.

13.6 Perhaps surprisingly, the design does conform to normalization principles! First, SX and SY are both in 5NF. Second, the original suppliers relvar can be reconstructed by joining SX and SY back together. Third, neither SX nor SY is redundant in that reconstruction process. Fourth, SX and SY are independent in Rissanen's sense.

Despite the foregoing observations, the design is very bad, of course; to be specific, it involves some obviously undesirable redundancy. But the design isn't bad because it violates the principles of normalization; rather, it's bad because it violates The Principle of Orthogonal Design, as explained in Section 13.6. Thus, we see that following the principles of normalization is necessary but not sufficient to ensure a good design. We also see that (as stated in Section 13.6) the principles of normalization and The Principle of Orthogonal Design complement each other, in a sense.

Appendix (DK/NF)

This appendix consists (apart from this introductory paragraph) of the text (slightly edited here) of a message posted on the website www.dbdebunk.com in May 2003. It's my response to a question from someone I'll refer to here as Victor.

(Begin quote)


Victor has "trouble understanding domain-key normal form (DK/NF)." I don't blame him; there's certainly been some serious nonsense published on this topic in the trade press and elsewhere. Let me see if I can clarify matters.

DK/NF is best thought of as a straw man (sorry, straw person). It was introduced by Ron Fagin in his paper "A Normal Form for Relational Databases that Is Based on Domains and Keys," ACM TODS 6, No. 3 (September 1981). As Victor says (more or less), Fagin defines a relvar R to be in DK/NF if and only if every constraint on R is a logical consequence of what he (Fagin) calls the domain constraints and key constraints on R. Here:

• A domain constraint (better called an attribute constraint) is simply a constraint to the effect that a given attribute A of R takes its values from some given domain D.

• A key constraint is simply a constraint to the effect that a given set A, B, ..., C of attributes of R constitutes a key for R.

Thus, if R is in DK/NF, then it is sufficient to enforce the domain and key constraints for R, and all constraints on R will be enforced automatically. And enforcing those domain and key constraints is, of course, very simple (most DBMS products do it already). To be specific, enforcing domain constraints just means checking that attribute values are always values from the applicable domain (i.e., values of the right type); enforcing key constraints just means checking that key values are unique.

The trouble is, lots of relvars aren't in DK/NF in the first place. For example, suppose there's a constraint on R to the effect that R must contain at least ten tuples. Then that constraint is certainly not a consequence of the domain and key constraints that apply to R, and so R isn't in DK/NF. The sad fact is, not all relvars can be reduced to DK/NF; nor do we know the answer to the question "Exactly when can a relvar be so reduced?"
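The distinction can be made concrete with a small sketch. The Python code below is an illustration only: it checks a domain constraint (values of the right type) and a key constraint (uniqueness) on a sample relation, then shows that passing both says nothing about a cardinality constraint such as "at least ten tuples." The relvar, attribute names, types, and data are all hypothetical.

```python
def satisfies_domains(rel, domains):
    """Domain constraints: every attribute value is of the declared type."""
    return all(isinstance(t[attr], typ)
               for t in rel
               for attr, typ in domains.items())

def satisfies_key(rel, key):
    """Key constraint: no two tuples agree on all of the key attributes."""
    seen = set()
    for t in rel:
        k = tuple(t[attr] for attr in key)
        if k in seen:
            return False
        seen.add(k)
    return True

def at_least_ten_tuples(rel):
    """A constraint that is not a consequence of any domain or key constraint."""
    return len(rel) >= 10

# Hypothetical relvar value with three tuples.
domains = {"EMP#": str, "DEPT#": str, "SALARY": int}
emp = [
    {"EMP#": "E1", "DEPT#": "D1", "SALARY": 50000},
    {"EMP#": "E2", "DEPT#": "D1", "SALARY": 45000},
    {"EMP#": "E3", "DEPT#": "D2", "SALARY": 60000},
]

print(satisfies_domains(emp, domains))  # True
print(satisfies_key(emp, ["EMP#"]))     # True
print(at_least_ten_tuples(emp))         # False
```

The sample value satisfies every domain and key constraint yet violates the cardinality constraint, so enforcing domains and keys alone would not be enough for such a relvar.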

Now, it's true that Fagin proves in his paper that if relvar R is in DK/NF, then R is automatically in 5NF (and hence 4NF, BCNF, etc.) as well. However, it's wrong to think of DK/NF as another step in the progression from 1NF to 2NF to ... to 5NF, because 5NF is always achievable, but DK/NF is not.

It's also wrong to say there are "no normal forms higher than DK/NF." In recent work of my own, documented in the book Temporal Data and the Relational Model, by myself with Hugh Darwen and Nikos Lorentzos (Morgan Kaufmann, 2003), my coworkers and I have come up with a new sixth normal form, 6NF. 6NF is higher than 5NF (all 6NF relvars are in 5NF, but the converse isn't true); moreover, 6NF is always achievable, but it isn't implied by DK/NF. In other words, there are relvars in DK/NF that aren't in 6NF. A trivial example is:

   EMP { EMP#, DEPT#, SALARY } KEY { EMP# }

(with the obvious semantics).

Victor also asks: "If a [relvar] has an atomic primary key and is in 3NF, is it automatically in DK/NF?" No. If the EMP relvar just shown is subject to the constraint that there must be at least ten employees, then EMP is in 3NF (and in fact 5NF) but not DK/NF. (Incidentally, this example also answers another of Victor's questions: "Can [we] give an example of a [relvar] that's in 5NF but not in DK/NF?") Note: I'm assuming here that the term "atomic key" means what would more correctly be called a simple key (meaning it doesn't involve more than one attribute). I'm also assuming that the relvar in question has just one key, which we might harmlessly regard as the "primary" key. If either of these assumptions is invalid, the answer to the original question is probably "no" even more strongly!

The net of all of the above is that DK/NF is (at least at the time of writing) a concept that's of some considerable theoretical interest but not yet of much practical ditto. The reason is that, while it would be nice if all relvars in the database were in DK/NF, we know that goal is impossible to achieve in general, nor do we know when it is possible. For practical purposes, stick to 5NF (and 6NF). Hope this helps!

(End quote)


Chapter 14

Principal Sections

• The overall approach

• The E/R model

• E/R diagrams

• DB design with the E/R model

• A brief analysis

General Remarks

The field of "semantic modeling" encompasses more than just database design, but for obvious reasons the emphasis in this chapter is on database design aspects (though the first two sections do consider the wider perspective briefly, and so does the annotation to several of the references at the end of the chapter). The chapter shouldn't be skipped, but portions of it might be skipped. You could also beef up the treatment of "E/R modeling" if you like.

Let me repeat the following remarks from the preface to this manual:

   You could also read Chapter 14 earlier if you like, possibly right after Chapter 4. Many instructors like to treat the entity/relationship material much earlier than I do. For that reason I've tried to make Chapter 14 more or less self-contained, so that it can be read "early" if you like.

And the expanded version of these remarks from the preface to the book itself:

   Some reviewers of earlier editions complained that database design issues were treated too late. But it's my feeling that students aren't ready to design databases properly or to appreciate design issues fully until they have some understanding of what databases are and how they're used; in other words, I believe it's important to spend some time on the relational model and related matters before exposing the student to design questions. Thus, I still believe Part III is in the right place. (That said, I do recognize that many instructors prefer to treat the entity/relationship material much earlier. To that end, I've tried to make Chapter 14 more
