R"──assuming the INSERTs and DELETEs all succeed, of
course.)
■ Again, if we decide to treat join views in some special way, then consistency dictates that we treat EACH AND EVERY relational operator in its own special way──special rules for union, special rules for divide, and so on. Everything becomes a special case (in fact, consistency dictates
inconsistency!). This surely can't be a good idea. Of course, it's essentially what today's DBMSs all do, insofar
as they address the problem at all.
The net of all this is that one simple rule that applies in all cases is surely the right way to go. Especially since, in the example of S JOIN SP, we can achieve the desired DELETE behavior
by applying the DELETE direct to relvar SP instead of to the join view!
Of course, nothing in the foregoing argument precludes the possibility of placing logic in application code (sitting on top
of the DBMS) that (a) allows the join to be displayed as a single table on the screen, (b) allows the end user to remove a row from that table somehow, and (c) implements that removal by doing a DELETE on relvar SP (only) under the covers. But we must avoid any suggestion that what the end user would be doing in such a scenario is a relational DELETE. It's a different operation (and the user would need to understand that fact, in general), it has different semantics, and it should be given a different name.
10.20 The relational model consists of five components:
1. An open-ended collection of scalar types (including in
particular the type boolean or truth value)
Comment: The scalar types can be system- or user-defined, in general; thus, a means must be available for users to define their own types (this requirement is implied, partly, by that
"open-ended"). A means must therefore also be available for users to define their own operators, since types without
operators are useless. The only built-in (i.e.,
system-defined) type we insist on is type BOOLEAN, but a real system will surely support integers, strings, etc., as well.
2. A relation type generator and an intended interpretation for
relations of types generated thereby
Comment: The relation type generator allows users to define
their own relation types (in Tutorial D, the definition of a
given relation type is, typically, bundled in with the
definition of a relation variable of that type──there's no
separate "define relation type" operator, for reasons
explained in detail in reference [3.3]). The intended
interpretation for a given relation type is the predicate
stuff.
3. Facilities for defining relation variables of such generated
relation types
Comment: Of course! Note that relation variables are the
only variables allowed inside a relational database (The
Information Principle, in effect).
4. A relational assignment operation for assigning relation
values to such relation variables
Comment: Variables are updatable by definition (that's what
"variable" means); hence, every kind of variable is subject to assignment (that's how updating is done), and relation
variables are no exception. Of course, INSERT, UPDATE, and DELETE shorthands are legal and indeed useful, but strictly
speaking they are only shorthands.
5. An open-ended collection of generic relational operators for
deriving relation values from other relation values
Comment: These operators make up the relational algebra, and they're therefore built-in (though there's no inherent reason why users shouldn't be able to define additional ones). Note
that the operators are generic──i.e., they apply to all
possible relations, loosely speaking.
P A R T   I I I
The database design problem can be stated as follows: Given some body of data to be represented in a database, how do we decide on
a suitable logical structure for that data? In other words, how
do we decide what relvars should exist and what attributes they
should have? (Of course, "design" here means logical or
conceptual design specifically. The "right" way to do database design is to do a clean logical design first, and then, as a
separate and subsequent step, to map that logical design into
whatever physical structures the target DBMS happens to support. Logical design is a fit subject for a book of this nature, but physical design──though important──isn't.)
One significant point of difference between the treatment of design issues in this book and that found in some other books is
the heavy emphasis on data integrity (the predicate stuff once
again).
Database design is, sadly, still more of an art than a
science. It's true that there are some scientific principles that can be brought to bear on the problem, and those principles are the subject of Chapters 11-13; unfortunately, however, there are numerous design issues that those principles just don't address at all. As a consequence, various design methodologies──some of them
fairly rigorous, others less so, but all of them ad hoc to a
degree──have been proposed, and such methodologies are the general subject of Chapter 14. (In fact, the principal focus of that
chapter is on "E/R modeling," since that particular methodology is the one most widely used in practice──despite the fact that, at least in my opinion, it suffers from a variety of serious
shortcomings. Some of those shortcomings are identified in the chapter.)
Note: See the preface for a discussion of my reasons for
deferring the design chapters to what some might think is a fairly late part of the book.* Basically, I believe students aren't
ready to design databases properly, or to appreciate design issues fully, until they have some understanding of what databases are all about and how they're meant to be used.
──────────
* On the other hand, one reviewer of the previous edition
suggested that Part III should be omitted entirely and made into a whole new book!
──────────
None of the chapters in this part of the book has a "SQL
Facilities" section, for fairly obvious reasons.
***
Chapter 11
F u n c t i o n a l   D e p e n d e n c i e s
Principal Sections
• Basic definitions
• Trivial and nontrivial FDs
• Closure of a set of FDs
• Closure of a set of attributes
• Irreducible sets of FDs
General Remarks
This is the most formal chapter in the book. But it isn't very
formal, and it isn't very long, and it can probably just be
skimmed if the instructor doesn't want to get too deeply into
formal proofs and the like. Indeed, the chapter is included, in part, just to show that there really is some mathematical rigor underlying relational database theory. But the focus of the book
in general is, as noted in the preface, on insight and
understanding, not on formalisms and algorithms (the latter can always be found in the references). Observe in particular that the book deliberately doesn't cover the theory of MVDs and JDs anywhere near as thoroughly as it does that of FDs.
Be that as it may, the proofs (etc.) in this chapter aren't really difficult, though we all know that formalism and precise terminology can be a little daunting to the average reader.
However, the following ideas, at least, do need to be explained:
• What an FD is, and the fact that the interesting ones are those that hold "for all time," meaning they're integrity
constraints (in fact, of course, the term "FD" is usually
taken to refer to this latter case specifically).
• The left and right sides of an FD are sets of attributes.
• If K is a candidate key for R, then K → A holds for all
attributes A of R.
• If R satisfies X → A and X is not a candidate key, then R
will probably involve some redundancy (a hint that the FD
notion might have a role to play in logical database
design──we'll be wanting to get rid of redundancy and
therefore we'll be wanting to find ways to get rid of certain FDs).
• Some FDs imply others.
• Given a set of FDs, the complete set of FDs implied by the
given set can be found by means of Armstrong's inference rules
or axioms (the rules should at least be mentioned, and perhaps
briefly illustrated, but they don't need to be exhaustively discussed).
11.2 Basic Definitions / 11.3 Trivial and Nontrivial FDs / 11.4 Closure of a Set of FDs / 11.5 Closure of a Set of Attributes / 11.6 Irreducible Sets of FDs
The material of these sections can be summarized as follows:
• First of all, every relvar necessarily satisfies certain
trivial FDs (an FD is trivial if and only if the right side is
a subset──not necessarily a proper subset, of course──of the left side).
• Given a set S of FDs, the closure S+ of that set is the set
of all FDs implied by the FDs in S. Armstrong's inference
rules provide a sound and complete basis for computing S+ from
S (though we usually don't actually perform that computation). Several other useful rules can easily be derived from
Armstrong's rules (see the exercises).
• Given a set Z of attributes of relvar R and a set S of FDs
that hold for R, the closure Z+ of Z under S is the set of all attributes A of R such that the FD Z → A is a member of S+
(i.e., such that the FD Z → A is implied by the FDs in S).
If and only if Z+ is all of the attributes of R, Z is a
superkey for R (and a candidate key is an irreducible
superkey). There's a simple algorithm for computing Z+ from Z and S, and hence a simple way of determining whether a given
FD X → Y is a member of S+ (X → Y is a member of S+ if and
only if Y is a subset of X+).
• Two sets of FDs S1 and S2 are equivalent if and only if
they're covers for each other, i.e., if and only if S1+ = S2+.
Every set of FDs is equivalent to at least one irreducible
set. A set of FDs is irreducible if and only if all three of the following are true:
a. Every FD in the set has a singleton right side.
b. No FD in the set can be discarded without changing the
closure of the set.
c. No attribute can be discarded from the left side of any FD
in the set without changing the closure of the set.
If I is an irreducible set equivalent to S, enforcing the FDs
in I will automatically enforce the FDs in S.
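The attribute-closure computation summarized above can be sketched in a few lines of Python. This is a minimal illustration only, not a transcription of the algorithm in Fig. 11.2: the function names (closure, implies, is_superkey) and the representation of an FD as a (left, right) pair of sets are assumptions made here, and the SCP heading is assumed to consist of just the four attributes that appear in the two FDs stated for that relvar.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs in `fds`.

    Each FD is a (left, right) pair of sets of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # If the left side is already determined, so is the right side.
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return frozenset(result)


def implies(fds, x, y):
    """X -> Y is implied by `fds` iff Y is a subset of the closure of X."""
    return set(y) <= closure(x, fds)


def is_superkey(attrs, heading, fds):
    """`attrs` is a superkey iff its closure is the entire heading."""
    return closure(attrs, fds) == frozenset(heading)


# Assumed heading and FDs for relvar SCP (illustrative only).
SCP_HEADING = {"S#", "CITY", "P#", "QTY"}
SCP_FDS = [({"S#", "P#"}, {"QTY"}), ({"S#"}, {"CITY"})]

print(sorted(closure({"S#"}, SCP_FDS)))
print(is_superkey({"S#", "P#"}, SCP_HEADING, SCP_FDS))
```

Under these assumptions, { S# } on its own is not a superkey (its closure picks up only CITY), whereas { S#, P# } is, because its closure covers the whole heading.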
The sections also contain three inline exercises:
• Check that the FDs stated to hold in the relation in Fig.
11.1 do in fact hold. Answer: Here, of course, we're talking about FDs that happen to hold in a specific relation value, not ones that hold for all time. The exercise is trivial. No further answer provided.
• State the complete set of FDs satisfied by relvar SCP.
Answer: The most important ones are clearly:
{ S#, P# } → QTY
S# → CITY
There are 83 additional FDs (!) implied by these two (i.e., the closure consists of 85 FDs in total).
• Prove the algorithm given in Fig. 11.2 is correct. No answer provided.
Answers to Exercises
11.1 (a) An FD is basically a statement of the form A → B, where
A and B are each subsets of the set of attributes of R. Given that a set of n elements has 2^n possible subsets, it follows that
each of A and B has 2^n possible values, and hence an upper limit
on the number of possible FDs in R is 2^(2n). (b) Every tuple t of R has the same value (namely, the 0-tuple) for that subtuple of t that corresponds to the empty set of attributes. If B is empty, therefore, the FD A → B is trivially true for all possible sets A
of attributes of R; in fact, it's a trivial FD, in the sense of
that term as defined in Section 11.3, and it isn't very
interesting.* On the other hand, if A is empty, the FD A → B
means all tuples of R have the same value for B (since they
certainly all have the same value for A). And if B in turn is
"all of the attributes of R"──i.e., if R has an empty key──then R
is constrained to contain at most one tuple (for further
discussion, see the answer to Exercise 9.10).
──────────
* If A is empty as well, the FD degenerates to {} → {}, which
has some claim to being "the least momentous observation that can
be made in Relationland" [6.5].
──────────
11.2 The rules are sound in the sense that, given a set S of FDs, FDs not implied by S can't be derived from S using the rules. They're complete in the sense that all FDs implied by S can be so
derived.
11.3 The reflexivity rule states that if B is a subset of A, then
A → B. Proof: Let the relvar in question be R, and let t1 and t2 be any two tuples of R that agree on A. Then certainly t1 and t2 agree on B. Hence A → B.
The augmentation rule states that if A → B, then AC → BC. Proof: Again let the relvar in question be R, and let t1 and t2
be any two tuples of R that agree on AC. Then certainly t1 and t2 agree on C. They also agree on A, and therefore on B, because A
→ B. Hence they agree on BC. Hence AC → BC.
The transitivity rule states that if A → B and B → C, then A
→ C. Proof: Once again let the relvar in question be R, and let t1 and t2 be any two tuples of R that agree on A. Then t1 and t2 agree on B, because A → B. Hence they also agree on C, because B
→ C. Hence A → C.
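All three proofs rest on the same test: A → B holds if any two tuples that agree on A also agree on B. A minimal Python sketch of that test, applied to a single relation value, follows. The function name satisfies, the representation of tuples as dictionaries, and the sample data are all assumptions made for illustration; note too that the sketch checks only whether an FD happens to hold in one particular value, not whether it holds "for all time."

```python
def satisfies(relation, a, b):
    """Return True if every pair of tuples agreeing on `a` also agrees on `b`.

    `relation` is a list of dicts (one per tuple); `a` and `b` are sets of
    attribute names.
    """
    seen = {}
    for t in relation:
        a_val = tuple(t[x] for x in sorted(a))
        b_val = tuple(t[x] for x in sorted(b))
        # The first tuple with a given A value fixes the B value that every
        # later tuple with the same A value must match.
        if seen.setdefault(a_val, b_val) != b_val:
            return False
    return True


# Hypothetical sample value with attributes S#, P#, QTY.
SAMPLE = [
    {"S#": "S1", "P#": "P1", "QTY": 300},
    {"S#": "S1", "P#": "P2", "QTY": 200},
    {"S#": "S2", "P#": "P1", "QTY": 300},
]

print(satisfies(SAMPLE, {"S#", "P#"}, {"QTY"}))
print(satisfies(SAMPLE, {"S#"}, {"QTY"}))
```

In this sample value { S#, P# } → { QTY } holds, but { S# } → { QTY } does not, because the two S1 tuples disagree on QTY. An FD with an empty right side is satisfied by every relation value, as the reflexivity rule requires.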
11.4 The self-determination rule states that A → A. Proof:
Immediate, by reflexivity.
The decomposition rule states that if A → BC, then A → B and
A → C. Proof: A → BC (given) and BC → B by reflexivity. Hence
A → B by transitivity (and likewise for A → C).
The union rule states that if A → B and A → C, then A → BC. Proof: A → B (given), hence A → BA by augmentation; also, A → C (given), hence BA → BC by augmentation. Hence A → BC by
transitivity.
The composition rule states that if A → B and C → D, then AC
→ BD. Proof: A → B (given), hence AC → BC by augmentation; likewise, C → D (given), hence BC → BD by augmentation. Hence
AC → BD by transitivity.
11.5 This proof requires intersection and difference, as well as union, of sets of attributes; we therefore show all three
operators explicitly, union included, in the proof. (By contrast, previous proofs used simple concatenation of attributes to
represent union.)
1. A → B (given)
2. C → D (given)
3. A → B ∩ C (joint dependence, 1)
4. C - B → C - B (self-determination)
5. A ∪ ( C - B ) → ( B ∩ C ) ∪ ( C - B ) (composition, 3, 4)
6. A ∪ ( C - B ) → C (simplifying 5)
7. A ∪ ( C - B ) → D (transitivity, 6, 2)
8. A ∪ ( C - B ) → B ∪ D (composition, 1, 7)
This completes the proof.
The rules used in the proof are as indicated in the comments. The following rules are all special cases of Darwen's theorem: union, transitivity, composition, and augmentation. So too is the following useful rule:
• If A → B and AB → C, then A → C.
11.6 (a) The closure of a set of FDs is the set of all FDs that are implied by the given set. (b) The closure of a set of
attributes is the set of all attributes that are functionally
dependent on the given set.
11.7 The complete set of FDs──i.e., the closure──for relvar SP is
as follows:
{ S#, P#, QTY } → { S#, P#, QTY }
{ S#, P#, QTY } → { S#, P# }
{ S#, P#, QTY } → { P#, QTY }
{ S#, P#, QTY } → { S#, QTY }
{ S#, P#, QTY } → { S# }
{ S#, P#, QTY } → { P# }
{ S#, P#, QTY } → { QTY }
{ S#, P#, QTY } → { }
{ S#, P# } → { S#, P#, QTY }
{ S#, P# } → { S#, P# }
{ S#, P# } → { P#, QTY }
{ S#, P# } → { S#, QTY }
{ S#, P# } → { S# }
{ S#, P# } → { P# }
{ S#, P# } → { QTY }
{ S#, P# } → { }
{ P#, QTY } → { P#, QTY }
{ P#, QTY } → { P# }
{ P#, QTY } → { QTY }
{ P#, QTY } → { }
{ S#, QTY } → { S#, QTY }
{ S#, QTY } → { S# }
{ S#, QTY } → { QTY }
{ S#, QTY } → { }
{ S# } → { S# }
{ S# } → { }
{ P# } → { P# }
{ P# } → { }
{ QTY } → { QTY }
{ QTY } → { }
{ } → { }
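The listing above can be reproduced mechanically. The Python sketch below enumerates every FD X → Y such that Y is a subset of the closure of X. It assumes that { S#, P# } → { QTY } is the only FD given for relvar SP (which is consistent with the listing), and the names closure, subsets, and fd_closure are hypothetical helpers introduced for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds` ((left, right) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return frozenset(result)


def subsets(attrs):
    """All subsets of `attrs`, including the empty set, as frozensets."""
    items = sorted(attrs)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def fd_closure(heading, fds):
    """All FDs X -> Y over `heading` implied by `fds`, as (X, Y) pairs."""
    return {(x, y)
            for x in subsets(heading)
            for y in subsets(closure(x, fds))}


SP_HEADING = {"S#", "P#", "QTY"}
SP_FDS = [({"S#", "P#"}, {"QTY"})]  # assumed to be the only given FD

SP_CLOSURE = fd_closure(SP_HEADING, SP_FDS)
print(len(SP_CLOSURE))  # 31, one for each line of the listing
```

With three attributes, the upper limit from the answer to Exercise 11.1 is 2^(2*3) = 64 candidate FDs, so just under half of the candidates actually hold in SP.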
11.8 {A,C}+ = {A,B,C,D,E}. The answer to the second part of the
question is yes.
11.9 Two sets S1 and S2 of FDs are equivalent if and only if they
have the same closure.
11.10 A set of FDs is irreducible if and only if all three of the following properties hold:
• Every FD has a singleton right side.
• No FD can be discarded without changing the closure.
• No attribute can be discarded from the left side of any FD without changing the closure.
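The three properties translate fairly directly into a test based on attribute closure. The following Python sketch is one possible formulation, not an algorithm taken from the text: the names closure and is_irreducible are assumptions, and it relies on the fact that an FD (or a left-side attribute) can be discarded without changing the closure exactly when the FD (or the reduced FD) is already implied by the FDs that remain.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds` ((left, right) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return frozenset(result)


def is_irreducible(fds):
    """Check the three irreducibility properties for a list of FDs."""
    fds = [(frozenset(left), frozenset(right)) for left, right in fds]
    for i, (left, right) in enumerate(fds):
        # Property 1: singleton right side.
        if len(right) != 1:
            return False
        # Property 2: the FD must not be implied by the other FDs.
        rest = fds[:i] + fds[i + 1:]
        if right <= closure(left, rest):
            return False
        # Property 3: no left-side attribute may be redundant, i.e. the FD
        # with that attribute dropped must not already be implied.
        for attr in left:
            if right <= closure(left - {attr}, fds):
                return False
    return True


# Hypothetical examples over attributes A, B, C.
print(is_irreducible([({"A"}, {"B"}), ({"B"}, {"C"})]))
print(is_irreducible([({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"C"})]))
```

In the second example A → C is implied by the other two FDs (by transitivity), so the set is reducible.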