Chapter 20
Type Inheritance
Principal Sections
• Type hierarchies
• Polymorphism and substitutability
• Variables and assignments
• S by C
• Comparisons
• Operators, versions, and signatures
• Is a circle an ellipse?
• S by C revisited
• SQL facilities
General Remarks
Note the opening remarks:
This chapter relies heavily on material first discussed in Chapter 5. If you originally gave that chapter a "once over lightly" reading, therefore, you might want to go back and revisit it now before studying the present chapter in any
depth.
To be more specific, a clear understanding of the following is prerequisite:
• What a type is (reviewed in Section 20.1)
• The crucial distinction between values and variables (see
Section 5.2). Note: Object-based discussions typically fall
foul of this distinction, since they're often unclear as to whether an "object" is a value, or a variable, or both, or neither. This failure seems to be at the root of the famous (infamous?) debate as to whether, e.g., a circle is an
ellipse. See Section 20.8.
• The crucial distinction between read-only and update
operators (again, see Section 5.2). Note: The point is that read-only operators apply to values (possibly values that are
the current values of variables), while update operators apply to variables.
Copyright (c) 2003 C. J. Date
• Every type has all of the following (among other things):
■ An associated type constraint, which defines the set of
legal values of the type in question
■ At least one declared possible representation, together with a corresponding selector operator and a corresponding set of THE_ operators (or logical equivalents of same)
■ "=" and ":=" operators
■ Certain type testing operators, to be discussed in Section
20.6 (these operators might be unnecessary in the absence
of inheritance support); also TREAT DOWN, to be discussed
in Section 20.4
All of these bullet items except the last are also explained
in Chapter 5.
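To make the bullet items above concrete, here is a minimal Python sketch of a type in this sense. It is an illustration only: the type ELLIPSE, its possible representation (a, b, ctr), the constraint a >= b > 0, and all function names are assumptions chosen for the example, and Python is not the language the book uses. The selector is the class constructor, the THE_ operators are plain functions, "=" is Python's ==, and ":=" is ordinary Python assignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Hypothetical POINT type with possible representation (x, y)."""
    x: float
    y: float


@dataclass(frozen=True)
class Ellipse:
    """Hypothetical ELLIPSE type with possible representation (a, b, ctr)."""
    a: float    # major semiaxis
    b: float    # minor semiaxis
    ctr: Point  # center

    def __post_init__(self):
        # Type constraint (assumed for this example): checked on every
        # selector invocation, so no illegal ELLIPSE value can ever exist.
        if not (self.a >= self.b > 0):
            raise ValueError("type constraint violated: need a >= b > 0")


# THE_ operators: expose the components of the possible representation.
def the_a(e: Ellipse) -> float:
    return e.a


def the_b(e: Ellipse) -> float:
    return e.b


def the_ctr(e: Ellipse) -> Point:
    return e.ctr
```

Because the dataclass is frozen, values cannot be changed in place; a variable changes only when a different value is assigned to it, which mirrors the value vs. variable distinction stressed above.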
The following preliminaries from Section 20.1 are also
important:
• Values are typed (i.e., have actual "most specific" types).
• Variables are typed (i.e., have declared types).
• We consider single inheritance only in this chapter, for
simplicity, though our model in fact supports multiple
inheritance too.
• We consider scalar inheritance only in this chapter, for
simplicity, though our model in fact supports tuple and
relation inheritance too. Throughout the chapter, value,
variable, and so on, thus mean scalar value, scalar variable,
and so on.
• We're not talking about "subtables and supertables"!──we'll
do that in Chapter 26.
The chapter overall is somewhat forward-looking (most database products don't provide any inheritance support yet). In fact, at the time of writing, this book appears to be the only database textbook to include a serious discussion of type inheritance at all. (Of course, it's true that the topics are somewhat
orthogonal──data doesn't have to be in a database for the concept
of inheritance to apply to it──but we might say the same about the relational model, in a way.) Also, what discussions there are in other books (i.e., nondatabase books──typically books on object orientation) seem to confuse some very fundamental issues. In
this connection, note the remarks in the annotation to reference [20.2]! Note too the discussion in Chapter 26, Section 26.3,
subsection "Pointers and a Good Model of Inheritance Are
Incompatible," which claims, implicitly, that it's really objects
and a good model of inheritance that are incompatible (since, as we'll see in Chapter 25, pointers in the shape of object IDs are a
sine qua non of object orientation*). An odd state of affairs, in
a way, since most of the work on inheritance seems to have been
done in an object context specifically.
──────────
* I note in passing that this remark applies to SQL in
particular, again as we'll see in Chapter 26. But it doesn't
apply just to languages in which the pointers are explicit, as
they are in SQL──it also applies to languages like Java where
they're supposed to be completely implicit.
──────────
Be that as it may, the chapter──which can be skipped or
skimmed if desired──presents a new model for inheritance, based on the proposals of reference [3.3]. It's concerned primarily with inheritance as a semantic modeling tool rather than as a software engineering tool, though we (i.e., Hugh Darwen and myself) believe the model described can meet the usual software engineering
objectives──in particular, the code reuse objective──as well.
Note: We justify the emphasis on the first of these two
objectives by appealing to the fact that semantic modeling is more directly pertinent to the database world than software engineering
is.
Our model regards operators and constraints (i.e., type
constraints) as inheritable and structure as not inheritable.
This position is uncontroversial with respect to operators but
possibly controversial with respect to constraints and structure.*
We insist on inheriting constraints because if (e.g.) a given
circle violates the constraint for type ELLIPSE, then that circle isn't an ellipse! We insist on not inheriting structure because
in our model there isn't any structure to inherit (structure is
part of the implementation, not part of the model).
──────────
* Note in particular that SQL doesn't support type constraints at all, and therefore certainly doesn't support type constraint inheritance.
──────────
Some further points to note:
• This chapter is deliberately included in this part of the book instead of Part VI in order to stress the point that the topic
of inheritance, though much discussed in connection with
object orientation, doesn't necessarily have anything to do
with OO, and is in fact best discussed outside the OO context.
• Indeed, OO confuses the picture considerably, because (as
already noted) the distinction between values and variables is absolutely crucial in this context, and that's a distinction that some people, at least, in the object world seem unwilling
to make. Perhaps this fact explains why previous attempts at inheritance models haven't been very successful?
• What's more (I've already mentioned this point, but it's
worth repeating and emphasizing), it's our contention that if
"OO" is understood to include the notion of OIDs (see Chapter
25), then in fact it's incompatible with the notion of a
reasonable inheritance model (i.e., one that's "faithful to
reality"). In other words, OIDs and a good inheritance model can't possibly coexist, in our opinion. See the notes on
Section 20.8.
• To quote Section 20.1: "The subject of type inheritance
really has to do with data in general──it isn't limited to
just database data in particular. For simplicity, therefore,
most examples in the chapter are expressed in terms of local data (ordinary program variables, etc.) rather than database data."
20.2 Type Hierarchies
Type hierarchies are pictures──they're not really part of our
inheritance model as such (much as "tables" are pictures, not part
of the relational model as such). In other words, type
hierarchies are just a convenient way of depicting certain
relationships among types (supertype-subtype relationships, to be precise).
In case anyone asks: Type (e.g.) CIRCLE is not really "just circles," it's "circles at a certain position in the plane." This point notwithstanding, the book deliberately uses a rather
academic example in order that the semantics can be crystal clear
to everyone (?).
The subsection entitled "Terminology" is important, though
fortunately straightforward. Ditto "The Disjointness Assumption,"
and its corollary that every value has exactly one most specific type.
A slightly unfortunate fact: Although we're primarily
concerned with an inheritance model, there are certain
implementation issues that you do need to understand in order to understand the overall concept of inheritance properly. One
example: The fact that B is a subtype of A doesn't necessarily mean that the actual (hidden) representation of B values is the same as that of A values. Implication: Distinct implementations
("versions") of operators might be necessary under the covers. This point will become significant in the next section, among
others.
The section includes this text: "So long as (a) there's at
least one type and (b) there are no cycles──i.e., there's no
sequence of types T1, T2, T3, ..., Tn such that T1 is an immediate subtype of T2, T2 is an immediate subtype of T3, ..., and Tn is an immediate subtype of T1──then at least one type must be a root
type. Note: In fact, there can't be any cycles (why not?)."
Answer: Suppose types A and B were each a subtype of the other (a cycle of length two). Then the set of values constituting A would
be a subset of the set of values constituting B and vice versa;
hence, both types would consist of exactly the same set of values.
Likewise, the set of operators that applied to values of type A
would be a subset of the set of operators that applied to values
of type B and vice versa (and, of course, the set of constraints that applied to values of type A would be a subset of the set of constraints that applied to values of type B and vice versa). In other words, A and B would effectively be identical, except for
their names, so they might as well be collapsed into a single type (in fact, we would have a violation of the model on our hands if they weren't). And, of course, an analogous argument applies to cycles of any length.
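The root-type and no-cycles argument can be checked mechanically. The following Python sketch is illustrative only: it assumes single inheritance, represents a hierarchy as a hypothetical mapping from each type name to the name of its immediate supertype (None for a root), and uses an example hierarchy of plane figures that is an assumption of this sketch.

```python
from typing import Dict, List, Optional

# Hypothetical single-inheritance hierarchy: type -> immediate supertype.
HIERARCHY: Dict[str, Optional[str]] = {
    "PLANE_FIGURE": None,
    "ELLIPSE": "PLANE_FIGURE",
    "CIRCLE": "ELLIPSE",
    "POLYGON": "PLANE_FIGURE",
    "RECTANGLE": "POLYGON",
    "SQUARE": "RECTANGLE",
}


def root_types(h: Dict[str, Optional[str]]) -> List[str]:
    """Return, in sorted order, the types that have no immediate supertype."""
    return sorted(t for t, sup in h.items() if sup is None)


def has_cycle(h: Dict[str, Optional[str]]) -> bool:
    """Walk upward from every type; revisiting a type means a cycle."""
    for start in h:
        seen = set()
        t: Optional[str] = start
        while t is not None:
            if t in seen:
                return True
            seen.add(t)
            t = h.get(t)
    return False
```

With the example hierarchy, root_types returns the single root PLANE_FIGURE and has_cycle returns False. A two-type "hierarchy" in which A and B are each the immediate supertype of the other has a cycle and no root at all, which is the situation the answer above rules out.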
20.3 Polymorphism and Substitutability
Really the same thing. Note the need to be careful over the
distinction between arguments and parameters (logical
difference!). Distinguish between overloading and inclusion
polymorphism; in this chapter, "polymorphism" means the latter
unless otherwise stated. Caveat: Unfortunately, many writers use
the term "overloading" to mean, specifically, inclusion
polymorphism. No wonder this subject is so confusing.
Run-time binding: CASE statements and expressions move under
the covers. "Old code can invoke new code." Note: As a matter
of fact, an implementation that did all binding at compile time (on the basis, obviously, of declared types, not most specific
types) would almost conform to our model, because we require the
semantics of operators not to change as we travel down paths in the type hierarchy (see Section 20.7). The reason I say "almost" here, however, is that compile-time binding clearly won't work──in
fact, it's impossible──for dummy types. Dummy types aren't
discussed in detail in the book, however; see reference [3.3] for further details.
Substitutability──more precisely, value substitutability──is
the justification for inheritance!
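A small Python sketch may help when presenting inclusion polymorphism and run-time binding. It is only an approximation of the model: Python class inheritance also inherits structure, which the model described here does not, and the class and function names are hypothetical. The operator describe has a parameter of declared type Ellipse, yet it can be invoked with an argument of type Circle, and the appropriate version of area is chosen at run time without any CASE logic in the caller.

```python
import math


class Ellipse:
    def __init__(self, a: float, b: float):
        self.a, self.b = a, b

    def area(self) -> float:
        # Version of AREA for ellipses in general.
        return math.pi * self.a * self.b


class Circle(Ellipse):
    def __init__(self, r: float):
        super().__init__(r, r)

    def area(self) -> float:
        # A distinct version under the covers; the semantics are unchanged.
        return math.pi * self.a ** 2


def describe(e: Ellipse) -> str:
    # Parameter of declared type Ellipse; the argument may be of any subtype.
    # Binding to the right version of area happens at run time.
    return f"{type(e).__name__} with area {e.area():.2f}"
```

Note that the two versions of area agree wherever both apply, in line with the requirement that operator semantics not change down the type hierarchy.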
20.4 Variables and Assignments
Important message: Values retain their most specific type on
assignment to variables of less specific declared type (type
conversion does not occur on such assignment). Hence, a variable
of declared type T can have a value whose most specific type is
any subtype of T. So we also need to be careful over the
difference between the declared type of a given variable and the actual (most specific) type of the current value of that variable (another important logical difference). Formal model of a
variable, and more generally of an expression: DT, MST, v
components.
If operator Op is defined to have a result of declared type T,
then the actual result of an invocation of Op can be of any
subtype of type T. Note: We deliberately do not drag in the (in
our experience, rather confusing and unhelpful) terms and concepts
result covariance and argument contravariance. "Result
covariance" is just an obvious consequence of substitutability (what's more, the term doesn't seem to capture the essence of the phenomenon properly). And we don't believe in "argument
contravariance" at all, for reasons articulated in reference
[3.3].
TREAT DOWN (important); possibility of run-time type errors (in this context and nowhere else).
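The DT, MST, and v components, and the behavior of TREAT DOWN, can be sketched in Python as follows. This is a rough illustration under stated assumptions: the Variable class and the treat_down function are hypothetical names invented for this sketch, and Python's isinstance stands in for the model's notion of "is of type T."

```python
class Ellipse:
    def __init__(self, a: float, b: float):
        self.a, self.b = a, b


class Circle(Ellipse):
    def __init__(self, r: float):
        super().__init__(r, r)


class Variable:
    """Hypothetical model of a variable: declared type DT, current value v."""

    def __init__(self, declared_type: type, value):
        self.dt = declared_type
        self.assign(value)

    def assign(self, value) -> None:
        # The value must be of the declared type or some subtype of it.
        if not isinstance(value, self.dt):
            raise TypeError("value is not of the declared type")
        self.v = value  # no conversion: the value keeps its own type

    @property
    def mst(self) -> type:
        # Most specific type of the current value.
        return type(self.v)


def treat_down(target_type: type, value):
    """Return value unchanged if it is of target_type, else raise at run time."""
    if not isinstance(value, target_type):
        raise TypeError("run-time type error in TREAT DOWN")
    return value
```

For example, after E = Variable(Ellipse, Circle(2.0)), E.dt is Ellipse while E.mst is Circle, and treat_down(Circle, E.v) succeeds. If E is then assigned Ellipse(5.0, 3.0), the same treat_down invocation raises a run-time type error.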
20.5 S by C
Basic idea: If variable E of declared type ELLIPSE is updated in
such a way that now THE_A(E) = THE_B(E), then MST(E) is now
CIRCLE. After all, human beings know that an ellipse with equal semiaxes is really a circle, so the system ought to know the same
thing──otherwise the model can hardly be said to be "faithful to reality" or "a good model of reality."
Caveat: Most inheritance models do not support S by C; in
fact, some writers are on record as arguing that an inheritance
model should explicitly not support it (see, e.g., reference
[20.12]). By contrast, we believe an inheritance model is useful
as "a model of reality" only if it does support S by C (and we believe we know how to implement it efficiently, too).
Be warned that the term "S by C" (or something very close to
it, anyway) is used elsewhere in the literature with a very
different meaning; see, e.g., reference [20.14], where it's used
to refer to what would better be called just type constraint
enforcement. Here's the definition from that reference:
(Begin quote)
"Specialization via constraints happens whenever the following is
permitted:
B subtype_of A and T subtype_of S and
f ( ..., b:T, ... ) returns r:R in Ops(B) and
f ( ..., b:S, ... ) returns r:R in Ops(A)
That is, specialization via constraints occurs whenever the
operation redefinition on a subtype constrains one of the
arguments to be from a smaller value set than the corresponding operation on the supertype."
(End quote)
This definition lacks somewhat in clarity, it might be felt.
Anyway, S by C (in our sense) implies, very specifically, that
a selector invocation might have to return a value of more
specific type than the specified "target" type. In other words,
the implementation code for S by C is embedded in selector code. (That implementation code can probably be provided automatically, too.)
Explain G by C as well.
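The following Python sketch shows one way to picture S by C and G by C. It is an assumption-laden illustration, not the book's implementation: the selector is modeled as a hypothetical function ellipse that returns a Circle whenever the semiaxes are equal, and an update to the a component is modeled as a reinvocation of that selector by a hypothetical helper with_a. The center is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    a: float
    b: float


@dataclass(frozen=True)
class Circle(Ellipse):
    pass


def ellipse(a: float, b: float) -> Ellipse:
    """Selector for ELLIPSE with S by C embedded in it."""
    if not (a >= b > 0):
        raise ValueError("type constraint violated: need a >= b > 0")
    # S by C: equal semiaxes means the result is of the more specific type.
    return Circle(a, b) if a == b else Ellipse(a, b)


def with_a(e: Ellipse, new_a: float) -> Ellipse:
    """Value an ELLIPSE variable would hold after its a component is set."""
    return ellipse(new_a, e.b)
```

Usage: starting from ellipse(5.0, 3.0), with_a(..., 3.0) yields a value whose most specific type is Circle (S by C). Starting from ellipse(3.0, 3.0), which is a Circle, with_a(..., 6.0) yields a value whose most specific type is Ellipse (G by C).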
20.6 Comparisons
Self-explanatory──though the implications for join, etc., sometimes come as a bit of a surprise.
Explain IS_T and the new relational operator R:IS_T(A). Note:
Generalized versions of these operators are defined in reference [3.3].
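As a teaching aid, the two type-testing operators can be approximated in Python. The sketch rests on several assumptions: a relation is represented loosely as a list of dictionaries (a real relation is a set of tuples), the function names is_t and restrict_is_t are hypothetical, and Python has no declared attribute types, so the sketch shows only the filtering aspect of R:IS_T(A).

```python
from typing import Any, Dict, List


class Ellipse:
    def __init__(self, a: float, b: float):
        self.a, self.b = a, b


class Circle(Ellipse):
    def __init__(self, r: float):
        super().__init__(r, r)


def is_t(type_: type, value: Any) -> bool:
    """Analog of IS_T(value): true if value is of type_ (or a subtype)."""
    return isinstance(value, type_)


def restrict_is_t(relation: List[Dict[str, Any]], attr: str,
                  type_: type) -> List[Dict[str, Any]]:
    """Analog of R:IS_T(A): keep tuples whose attr value is of type_."""
    return [t for t in relation if is_t(type_, t[attr])]
```

For instance, applying restrict_is_t with type Circle to a relation whose FIG attribute holds a mixture of circles and other ellipses keeps just the tuples whose FIG value is a circle.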
20.7 Operators, Versions, and Signatures
Much confusion in the literature over different kinds of
signatures! Need to distinguish specification signature (just one
of these) vs. version signatures (many) vs. invocation signatures
(also many). More logical differences here, in fact.
Changing operator semantics as we travel down the type
hierarchy is, regrettably, possible but (we believe) nonsense. Arguments in favor are (we believe) based on a confusion between inclusion and overloading polymorphism and smack of "the
implementation tail wagging the model dog" [3.3]. Changing
semantics is illegal in our model.
Discuss union types briefly (or at least mention them). Note:
Some proposals──e.g., ODMG [25.11]──use union types as a way of
providing type generator functionality. E.g., RELATION might be a
union type in such a system (with generic operators JOIN, UNION, and so forth), and every specific relation type would then be a proper subtype of that union type. We don't care for this
approach ourselves, because we certainly don't want our support for type generators to rely on support for type inheritance.
What's more, the approach seems to imply that specific──i.e.,
explicitly specialized──implementation code must be provided for each specific join, each specific union, etc., etc.: surely not a very desirable state of affairs? How can it be justified?
The section shows an explicit implementation of the MOVE
operator (read-only version) that moves circles instead of
ellipses, and then remarks that "there's little point in defining such an explicit [implementation] in this particular example (why,
exactly?)." Answer: Because S by C will take care of the
problem!
20.8 Is a Circle an Ellipse?
IMPORTANT!──albeit self-explanatory, more or less.* But you
should be aware that this is another, and major, area where we
depart from "classical" inheritance models. To be specific, it's
here that the value vs. variable and read-only vs. update operator
distinctions come into play. Other approaches don't make these distinctions; they thus allow operators (update as well as read-only operators) to be inherited indiscriminately──with the result that they have to support "noncircular circles" and similar
nonsenses, and they can't support type constraints at all! (SQL
is very unfortunately a case in point here. See Section 20.10.)
──────────
* I don't much care for "advertisements for myself," but I do think you should take a look at reference [20.6] if you propose to teach the material of this section.
──────────
The section includes the following text: "[Let] type ELLIPSE
have another immediate subtype NONCIRCLE; let the constraint a > b
apply to noncircles; and consider an assignment to THE_A for a
noncircle that, if accepted, would set a equal to b. What would
be an appropriate semantic redefinition for that assignment?
Exactly what side effect would be appropriate?" No answer
provided!──the questions are rhetorical, as should be obvious.
20.9 S by C Revisited
This section begins by criticizing the common example of colored circles as a subtype of circles. Note that there can't be more
instances (meaning more values) of a subtype than of any supertype
of that subtype, yet there are clearly more colored circles than there are circles. And colored circles can't be obtained from circles via S by C, either. Note the remark to the effect that
"COLORED_CIRCLE is a subtype of CIRCLE to exactly the same extent that it is a subtype of COLOR (which is to say, not at all)." In
my experience, most students find this point telling.
Discussion of this example leads to the position that S by C
is the only conceptually valid means of defining subtypes──the exact opposite of the position articulated in reference [20.12] and subscribed to by much of the object world.
20.10 SQL Facilities
Extremely unorthogonal!──basically single inheritance only, for
"structured types" only.* (Multiple inheritance might be added in SQL:2003.)
──────────
* As the book says: SQL has no explicit inheritance support for generated types, no explicit support for multiple inheritance, and
no inheritance support at all for built-in types or DISTINCT
types. But it does have some very limited implicit support for
inheritance of generated types and for multiple inheritance.
──────────
Explain the SQL analog of circles and ellipses. Inheritance not of constraints and (read-only) operators but structure and (all) operators; explain implications! Functions, procedures, and methods. Observers, mutators, and constructors. No type
constraints; this omission is staggering but a necessary
consequence of SQL's inheritance model (?). Do not get into
details of reference types or subtables and supertables here
(we'll cover them in Chapter 26, after we've discussed OO in
Chapter 25).
Explain delegation──it's pragmatically important, but it's not
inheritance (in our opinion).
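When explaining delegation, a short Python sketch based on the colored circle example from Section 20.9 may be useful. The class names and the choice of area as the delegated operator are assumptions made for illustration. A colored circle is modeled as a value with a circle component and a color component; it hands the area operator on to its circle component, but it is not itself a circle.

```python
import math


class Circle:
    def __init__(self, r: float):
        self.r = r

    def area(self) -> float:
        return math.pi * self.r ** 2


class ColoredCircle:
    """Has a circle component and a color component; not a subtype of Circle."""

    def __init__(self, circle: Circle, color: str):
        self.circle = circle
        self.color = color

    def area(self) -> float:
        # Delegation: the operator is forwarded to the circle component.
        return self.circle.area()
```

Code reuse is achieved (the area computation is written once), yet substitutability does not hold: a ColoredCircle cannot be supplied wherever a Circle is expected, which is why delegation is not inheritance in the sense of this chapter.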
References and Bibliography
We repeat the opening paragraph from this section:
(Begin quote)
For interest, we state here without further elaboration the sole major changes required to [our single] inheritance model in order to support multiple inheritance. First, we relax the
disjointness assumption by requiring only that root types must be
disjoint. Second, we replace the definition of "most specific
type" by the following requirement: Every set of types T1, T2, ..., Tn (n ≥ 0) must have a common subtype T' such that a given value is of each of the types T1, T2, ..., Tn if and only if it is
of type T'. See reference [3.3] for a detailed discussion of
these points, also of the extensions required to support tuple and relation inheritance.
(End quote)
Reference [20.1] describes a commercial implementation of the
inheritance model as described in the body of the chapter.
Reference [20.10] is a good example of what happens if the value
vs. variable and read-only vs. update operator distinctions are
ignored; unfortunately, it very much reflects what SQL does (see Section 20.10). Reference [20.12] is interesting as an example of how the object world thinks about inheritance, though we caution