19.4 How are buffering and caching techniques used by the recovery subsystem?
19.5 What are the before image (BFIM) and after image (AFIM) of a data item? What is the difference between in-place updating and shadowing, with respect to their handling of BFIM and AFIM?
19.6 What are UNDO-type and REDO-type log entries?
19.7 Describe the write-ahead logging protocol.
19.8 Identify three typical lists of transactions that are maintained by the recovery subsystem.
19.9 What is meant by transaction rollback? What is meant by cascading rollback? Why do practical recovery methods use protocols that do not permit cascading rollback? Which recovery techniques do not require any rollback?
19.10 Discuss the UNDO and REDO operations and the recovery techniques that use each.
19.11 Discuss the deferred update technique of recovery. What are the advantages and disadvantages of this technique? Why is it called the NO-UNDO/REDO method?
19.12 How can recovery handle transaction operations that do not affect the database, such as the printing of reports by a transaction?
19.13 Discuss the immediate update recovery technique in both single-user and multiuser environments. What are the advantages and disadvantages of immediate update?
19.14 What is the difference between the UNDO/REDO and the UNDO/NO-REDO algorithms for recovery with immediate update? Develop the outline for an UNDO/NO-REDO algorithm.
19.15 Describe the shadow paging recovery technique. Under what circumstances does it not require a log?
19.16 Describe the three phases of the ARIES recovery method.
19.17 What are log sequence numbers (LSNs) in ARIES? How are they used? What information does the Dirty Page Table and Transaction Table contain? Describe how fuzzy checkpointing is used in ARIES.
19.18 What do the terms steal/no-steal and force/no-force mean with regard to buffer management for transaction processing?
19.19 Describe the two-phase commit protocol for multidatabase transactions.
19.20 Discuss how recovery from catastrophic failures is handled.
19.21 Suppose that the system crashes before the [read_item, T3, A] entry is written to the log in Figure 19.1b. Will that make any difference in the recovery process?
19.22 Suppose that the system crashes before the [write_item, T2, D, 25, 26] entry is written to the log in Figure 19.1b. Will that make any difference in the recovery process?
19.23 Figure 19.7 shows the log corresponding to a particular schedule at the point of a system crash for four transactions T1, T2, T3, and T4. Suppose that we use the immediate update protocol with checkpointing. Describe the recovery process from the system crash. Specify which transactions are rolled back, which operations in the log are redone and which (if any) are undone, and whether any cascading rollback takes place.
19.24 Suppose that we use the deferred update protocol for the example in Figure 19.7. Show how the log would be different in the case of deferred update by removing the unnecessary log entries; then describe the recovery process, using your modified log. Assume that only REDO operations are applied, and specify which operations in the log are redone and which are ignored.
19.25 How does checkpointing in ARIES differ from checkpointing as described in Section 19.1.4?
19.26 How are log sequence numbers used by ARIES to reduce the amount of REDO work needed for recovery? Illustrate with an example using the information shown in Figure 19.6. You can make your own assumptions as to when a page is written to disk.
[write_item, T2, D, 15, 25] ← system crash
FIGURE 19.7 An example schedule and its corresponding log.
19.27 What implications would a no-steal/force buffer management policy have on checkpointing and recovery?
Choose the correct answer for each of the following multiple-choice questions:
19.28 Incremental logging with deferred updates implies that the recovery system must necessarily
a. store the old value of the updated item in the log
b. store the new value of the updated item in the log
c. store both the old and new value of the updated item in the log
d. store only the Begin Transaction and Commit Transaction records in the log
19.29 The write-ahead logging (WAL) protocol simply means that
a. the writing of a data item should be done ahead of any logging operation
b. the log record for an operation should be written before the actual data is written
c. all log records should be written before a new transaction begins execution
d. the log never needs to be written to disk
19.30 In case of transaction failure under a deferred update incremental logging scheme, which of the following will be needed:
a. an undo operation
b. a redo operation
c. an undo and redo operation
d. none of the above
19.31 For incremental logging with immediate updates, a log record for a transaction would contain:
a. a transaction name, data item name, old value of item, new value of item
b. a transaction name, data item name, old value of item
c. a transaction name, data item name, new value of item
d. a transaction name and a data item name
19.32 For correct behavior during recovery, undo and redo operations must be
a. searching the entire log is time consuming
b. many redo's are unnecessary
c. both (a) and (b)
d. none of the above
19.34 When using a log-based recovery scheme, it might improve performance as well as providing a recovery mechanism by
a. writing the log records to disk when each transaction commits
b. writing the appropriate log records to disk during the transaction's execution
c. waiting to write the log records until multiple transactions commit and writing them as a batch
d. never writing the log records to disk
19.35 There is a possibility of a cascading rollback when
a. a transaction writes items that have been written only by a committed
19.36 To cope with media (disk) failures, it is necessary
a. for the DBMS to only execute transactions in a single user environment
b. to keep a redundant copy of the database
c. to never abort a transaction
d. all of the above
19.37 If the shadowing approach is used for flushing a data item back to disk, then
a. the item is written to disk only after the transaction commits
b. the item is written to a different location on disk
c. the item is written to disk before the transaction commits
d. the item is written to the same disk location from which it was read
Selected Bibliography
The books by Bernstein et al. (1987) and Papadimitriou (1986) are devoted to the theory and principles of concurrency control and recovery. The book by Gray and Reuter (1993) is an encyclopedic work on concurrency control, recovery, and other transaction-processing issues.
Verhofstad (1978) presents a tutorial and survey of recovery techniques in database systems. Categorizing algorithms based on their UNDO/REDO characteristics is discussed in Haerder and Reuter (1983) and in Bernstein et al. (1983). Gray (1978) discusses recovery, along with other system aspects of implementing operating systems for databases. The shadow paging technique is discussed in Lorie (1977), Verhofstad (1978), and Reuter (1980). Gray et al. (1981) discuss the recovery mechanism in SYSTEM R. Lockeman and Knutsen (1968), Davies (1972), and Bjork (1973) are early papers that discuss recovery. Chandy et al. (1975) discuss transaction rollback. Lilien and Bhargava (1985) discuss the concept of integrity block and its use to improve the efficiency of recovery.
Recovery using write-ahead logging is analyzed in Jhingran and Khedkar (1992) and is used in the ARIES system (Mohan et al. 1992a). More recent work on recovery includes compensating transactions (Korth et al. 1990) and main memory database recovery (Kumar 1991). The ARIES recovery algorithms (Mohan et al. 1992) have been quite successful in practice. Franklin et al. (1992) discuss recovery in the EXODUS system. Two recent books by Kumar and Hsu (1998) and Kumar and Son (1998) discuss recovery in detail and contain descriptions of recovery methods used in a number of existing relational database products.
OBJECT-RELATIONAL DATABASES
Object Databases
In this chapter and the next, we discuss object-oriented data models and database systems.¹ Traditional data models and systems, such as relational, network, and hierarchical, have been quite successful in developing the database technology required for many traditional business database applications. However, they have certain shortcomings when more complex database applications must be designed and implemented; for example, databases for engineering design and manufacturing (CAD/CAM and CIM²), scientific experiments, telecommunications, geographic information systems, and multimedia.³ These newer applications have requirements and characteristics that differ from those of traditional business applications, such as more complex structures for objects, longer-duration transactions, new data types for storing images or large textual items, and the need to define nonstandard application-specific operations. Object-oriented databases were proposed to meet the needs of these more complex applications. The object-oriented approach offers the flexibility to handle some of these requirements without being limited by the data types and query languages available in traditional database systems. A key feature of object-oriented databases is the power they give the designer to specify both the structure of complex objects and the operations that can be applied to these objects.
1. These databases are often referred to as Object Databases and the systems are referred to as Object Database Management Systems (ODBMS). However, because this chapter discusses many general object-oriented concepts, we will use the term object-oriented instead of just object.
2. Computer-Aided Design/Computer-Aided Manufacturing and Computer-Integrated Manufacturing.
3. Multimedia databases must store various types of multimedia objects, such as video, audio, images, graphics, and documents (see Chapter 24).
Another reason for the creation of object-oriented databases is the increasing use of object-oriented programming languages in developing software applications. Databases are now becoming fundamental components in many software systems, and traditional databases were difficult to use with object-oriented software applications that are developed in an object-oriented programming language such as C++, SMALLTALK, or JAVA. Object-oriented databases are designed so they can be directly (or seamlessly) integrated with software that is developed using object-oriented programming languages.
The need for additional data modeling features has also been recognized by relational DBMS vendors, and newer versions of relational systems are incorporating many of the features that were proposed for object-oriented databases. This has led to systems that are characterized as object-relational or extended relational DBMSs (see Chapter 22). The latest version of the SQL standard for relational DBMSs includes some of these features.
Although many experimental prototypes and commercial object-oriented database systems have been created, they have not found widespread use because of the popularity of relational and object-relational systems. The experimental prototypes included the ORION system developed at MCC,⁴ OPENOODB at Texas Instruments, the IRIS system at Hewlett-Packard laboratories, the ODE system at AT&T Bell Labs,⁵ and the ENCORE/ObServer project at Brown University. Commercially available systems included GEMSTONE/OPAL of GemStone Systems, ONTOS of Ontos, Objectivity of Objectivity Inc., Versant of Versant Object Technology, ObjectStore of Object Design, ARDENT of ARDENT Software,⁶ and POET of POET Software. These represent only a partial list of the experimental prototypes and commercial object-oriented database systems that were created.
As commercial object-oriented DBMSs became available, the need for a standard model and language was recognized. Because the formal procedure for approval of standards normally takes a number of years, a consortium of object-oriented DBMS vendors and users, called ODMG,⁷ proposed a standard that is known as the ODMG-93 standard, which has since been revised. We will describe some features of the ODMG standard in Chapter 21.
Object-oriented databases have adopted many of the concepts that were developed originally for object-oriented programming languages.⁸ In Section 20.1, we examine the origins of the object-oriented approach and discuss how it applies to database systems. Then, in Sections 20.2 through 20.6, we describe the key concepts utilized in many object-oriented database systems. Section 20.2 discusses object identity, object structure, and type constructors. Section 20.3 presents the concepts of encapsulation of operations and definition of methods as part of class declarations, and also discusses the mechanisms for storing objects in a database by making them persistent. Section 20.4 describes type and class hierarchies and inheritance in object-oriented databases, and Section 20.5 provides an overview of the issues that arise when complex objects need to be represented and stored. Section 20.6 discusses additional concepts, including polymorphism, operator overloading, dynamic binding, multiple and selective inheritance, and versioning and configuration of objects.
This chapter presents the general concepts of object-oriented databases, whereas Chapter 21 will present the ODMG standard. The reader may skip Sections 20.5 and 20.6 of this chapter if a less detailed introduction to the topic is desired.
4. Microelectronics and Computer Technology Corporation, Austin, Texas.
5. Now called Lucent Technologies.
6. Formerly O2 of O2 Technology.
7. Object Database Management Group.
8. Similar concepts were also developed in the fields of semantic data modeling and knowledge representation.
CONCEPTS
This section gives a quick overview of the history and main concepts of object-oriented databases, or OODBs for short. The OODB concepts are then explained in more detail in Sections 20.2 through 20.6. The term object-oriented (abbreviated OO or O-O) has its origins in OO programming languages, or OOPLs. Today OO concepts are applied in the areas of databases, software engineering, knowledge bases, artificial intelligence, and computer systems in general. OOPLs have their roots in the SIMULA language, which was proposed in the late 1960s. In SIMULA, the concept of a class groups together the internal data structure of an object in a class declaration. Subsequently, researchers proposed the concept of abstract data type, which hides the internal data structures and specifies all possible external operations that can be applied to an object, leading to the concept of encapsulation. The programming language SMALLTALK, developed at Xerox PARC⁹ in the 1970s, was one of the first languages to explicitly incorporate additional OO concepts, such as message passing and inheritance. It is known as a pure OO programming language, meaning that it was explicitly designed to be object-oriented. This contrasts with hybrid OO programming languages, which incorporate OO concepts into an already existing language. An example of the latter is C++, which incorporates OO concepts into the popular C programming language.
An object typically has two components: state (value) and behavior (operations). Hence, it is somewhat similar to a program variable in a programming language, except that it will typically have a complex data structure as well as specific operations defined by the programmer.¹⁰ Objects in an OOPL exist only during program execution and are hence called transient objects. An OO database can extend the existence of objects so that they are stored permanently, and hence the objects persist beyond program termination and can be retrieved later and shared by other programs. In other words, OO databases store persistent objects permanently on secondary storage, and allow the sharing of these objects among multiple programs and applications. This requires the incorporation of other well-known features of database management systems, such as indexing mechanisms, concurrency control, and recovery. An OO database system interfaces with one or more OO programming languages to provide persistent and shared object capabilities.
9. Palo Alto Research Center, Palo Alto, California.
10. Objects have many other characteristics, as we discuss in the rest of this chapter.
One goal of OO databases is to maintain a direct correspondence between real-world and database objects so that objects do not lose their integrity and identity and can easily be identified and operated upon. Hence, OO databases provide a unique system-generated object identifier (OID) for each object. We can compare this with the relational model, where each relation must have a primary key attribute whose value identifies each tuple uniquely. In the relational model, if the value of the primary key is changed, the tuple will have a new identity, even though it may still represent the same real-world object. Alternatively, a real-world object may have different names for key attributes in different relations, making it difficult to ascertain that the keys represent the same object (for example, the object identifier may be represented as EMP_ID in one relation and as SSN in another).
Another feature of OO databases is that objects may have an object structure of arbitrary complexity in order to contain all of the necessary information that describes the object. In contrast, in traditional database systems, information about a complex object is often scattered over many relations or records, leading to loss of direct correspondence between a real-world object and its database representation.
The internal structure of an object in OOPLs includes the specification of instance variables, which hold the values that define the internal state of the object. Hence, an instance variable is similar to the concept of an attribute in the relational model, except that instance variables may be encapsulated within the object and thus are not necessarily visible to external users. Instance variables may also be of arbitrarily complex data types. Object-oriented systems allow definition of the operations or functions (behavior) that can be applied to objects of a particular type. In fact, some OO models insist that all operations a user can apply to an object must be predefined. This forces a complete encapsulation of objects. This rigid approach has been relaxed in most OO data models for several reasons. First, the database user often needs to know the attribute names so they can specify selection conditions on the attributes to retrieve specific objects. Second, complete encapsulation implies that any simple retrieval requires a predefined operation, thus making ad hoc queries difficult to specify on the fly.
To encourage encapsulation, an operation is defined in two parts. The first part, called the signature or interface of the operation, specifies the operation name and arguments (or parameters). The second part, called the method or body, specifies the implementation of the operation. Operations can be invoked by passing a message to an object, which includes the operation name and the parameters. The object then executes the method for that operation. This encapsulation permits modification of the internal structure of an object, as well as the implementation of its operations, without the need to disturb the external programs that invoke these operations. Hence, encapsulation provides a form of data and operation independence (see Chapter 2).
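The split between signature and method can be sketched in Python. This is only an illustration, not a notation used in this chapter: the names `Account`, `LedgerAccount`, and `send` are hypothetical, and the `send` helper merely imitates message passing by looking up an operation by name.

```python
from abc import ABC, abstractmethod


class Account(ABC):
    """Signature (interface): operation names and parameters, no bodies."""

    @abstractmethod
    def deposit(self, amount):
        """Add amount to the account."""

    @abstractmethod
    def balance(self):
        """Return the current balance."""


class LedgerAccount(Account):
    """Method (body): one possible implementation, hidden from callers."""

    def __init__(self):
        self._entries = []  # internal structure; callers never touch it

    def deposit(self, amount):
        self._entries.append(amount)

    def balance(self):
        return sum(self._entries)


def send(obj, operation, *args):
    """Pass a 'message': an operation name plus its parameters."""
    return getattr(obj, operation)(*args)


acct = LedgerAccount()
send(acct, "deposit", 40)
send(acct, "deposit", 2)
total = send(acct, "balance")
```

Because callers depend only on the operation names, `LedgerAccount` could switch from a list of entries to a single running total without any change to the code that sends the messages.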
Another key concept in OO systems is that of type and class hierarchies and inheritance. This permits specification of new types or classes that inherit much of their structure and/or operations from previously defined types or classes. Hence, specification of object types can proceed systematically. This makes it easier to develop the data types of a system incrementally, and to reuse existing type definitions when creating new types of objects.
One problem in early OO database systems involved representing relationships among objects. The insistence on complete encapsulation in early OO data models led to the argument that relationships should not be explicitly represented, but should instead be described by defining appropriate methods that locate related objects. However, this approach does not work very well for complex databases with many relationships, because it is useful to identify these relationships and make them visible to users. The ODMG standard has recognized this need and it explicitly represents binary relationships via a pair of inverse references; that is, by placing the OIDs of related objects within the objects themselves, and maintaining referential integrity, as we shall describe in Chapter 21.
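A pair of inverse references can be sketched as follows. This is a minimal illustration under stated assumptions, not ODMG syntax: the classes `Department` and `Employee`, the dictionary `store` that maps OIDs to objects, and the helper `assign` are all hypothetical names.

```python
class Department:
    def __init__(self, oid, name):
        self.oid = oid
        self.name = name
        self.employees = set()  # OIDs of the related Employee objects


class Employee:
    def __init__(self, oid, name):
        self.oid = oid
        self.name = name
        self.works_for = None   # OID of the related Department, or None


def assign(store, emp_oid, dept_oid):
    """Update both directions of the relationship in one step."""
    emp = store[emp_oid]
    if emp.works_for is not None:
        # Remove the stale inverse reference from the old department.
        store[emp.works_for].employees.discard(emp_oid)
    emp.works_for = dept_oid
    store[dept_oid].employees.add(emp_oid)


store = {
    "d1": Department("d1", "Research"),
    "d2": Department("d2", "Administration"),
    "e1": Employee("e1", "Smith"),
}
assign(store, "e1", "d1")
assign(store, "e1", "d2")  # moving the employee keeps both sides consistent
```

Routing every change through one function is one simple way to keep the two references from disagreeing; a real system would enforce this inside the DBMS rather than in application code.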
Some OO systems provide capabilities for dealing with multiple versions of the same object, a feature that is essential in design and engineering applications. For example, an old version of an object that represents a tested and verified design should be retained until the new version is tested and verified. A new version of a complex object may include only a few new versions of its component objects, whereas other components remain unchanged. In addition to permitting versioning, OO databases should also allow for schema evolution, which occurs when type declarations are changed or when new types or relationships are created. These two features are not specific to OODBs and should ideally be included in all types of DBMSs.¹¹
Another OO concept is operator overloading, which refers to an operation's ability to be applied to different types of objects; in such a situation, an operation name may refer to several distinct implementations, depending on the type of objects it is applied to. This feature is also called operator polymorphism. For example, an operation to calculate the area of a geometric object may differ in its method (implementation), depending on whether the object is of type triangle, circle, or rectangle. This may require the use of late binding of the operation name to the appropriate method at run-time, when the type of object to which the operation is applied becomes known.
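The area example can be sketched in Python, where the binding of a method name to an implementation always happens at run-time. The class names and constructor parameters below are illustrative assumptions.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height


# The same operation name, area, is bound to a different method for each
# object; which one runs is decided only when the call is executed.
shapes = [Circle(1), Rectangle(2, 3), Triangle(4, 3)]
areas = [shape.area() for shape in shapes]
```

The loop never inspects the type of each shape; the run-time system selects the method, which is the late binding described above.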
This section provided an overview of the main concepts of OO databases. In Sections 20.2 through 20.6, we discuss these concepts in more detail.
20.2 OBJECT IDENTITY, OBJECT STRUCTURE, AND TYPE CONSTRUCTORS
In this section we first discuss the concept of object identity, and then we present the typical structuring operations for defining the structure of the state of an object. These structuring operations are often called type constructors. They define basic data-structuring operations that can be combined to form complex object structures.
11. Several schema evolution operations, such as ALTER TABLE, are already defined in the relational standard (see Section 8.3).
20.2.1 Object Identity
An OO database system provides a unique identity to each independent object stored in the database. This unique identity is typically implemented via a unique, system-generated object identifier, or OID. The value of an OID is not visible to the external user, but it is used internally by the system to identify each object uniquely and to create and manage inter-object references. The OID can be assigned to program variables of the appropriate type when needed.
The main property required of an OID is that it be immutable; that is, the OID value of a particular object should not change. This preserves the identity of the real-world object being represented. Hence, an OO database system must have some mechanism for generating OIDs and preserving the immutability property. It is also desirable that each OID be used only once; that is, even if an object is removed from the database, its OID should not be assigned to another object. These two properties imply that the OID should not depend on any attribute values of the object, since the value of an attribute may be changed or corrected. It is also generally considered inappropriate to base the OID on the physical address of the object in storage, since the physical address can change after a physical reorganization of the database. However, some systems do use the physical address as OID to increase the efficiency of object retrieval. If the physical address of the object changes, an indirect pointer can be placed at the former address, which gives the new physical location of the object. It is more common to use long integers as OIDs and then to use some form of hash table to map the OID value to the current physical address of the object in storage.
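The last approach, integer OIDs mapped through a hash table, can be sketched as follows. The class `ObjectTable`, its method names, and the string addresses are hypothetical; a Python dictionary stands in for the hash table.

```python
import itertools


class ObjectTable:
    """Maps immutable integer OIDs to the current physical address."""

    def __init__(self):
        # The counter is never reset, so an OID is never handed out twice.
        self._next_oid = itertools.count(1)
        self._address = {}  # hash table: OID -> current physical address

    def create(self, address):
        oid = next(self._next_oid)
        self._address[oid] = address
        return oid

    def relocate(self, oid, new_address):
        # Only the mapping changes; the OID itself stays the same.
        self._address[oid] = new_address

    def delete(self, oid):
        del self._address[oid]

    def locate(self, oid):
        return self._address[oid]


table = ObjectTable()
a = table.create("page 3, slot 1")
b = table.create("page 3, slot 2")
table.relocate(a, "page 9, slot 4")  # e.g., after a physical reorganization
table.delete(b)
c = table.create("page 3, slot 2")   # reuses the storage slot, not the OID
```

Here `c` receives a fresh OID even though it occupies the slot that `b` vacated, which illustrates why the OID is kept independent of the physical address.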
Some early OO data models required that everything, from a simple value to a complex object, be represented as an object; hence, every basic value, such as an integer, string, or Boolean value, has an OID. This allows two basic values to have different OIDs, which can be useful in some cases. For example, the integer value 50 can be used sometimes to mean a weight in kilograms and at other times to mean the age of a person. Then, two basic objects with distinct OIDs could be created, but both objects would represent the integer value 50. Although useful as a theoretical model, this is not very practical, since it may lead to the generation of too many OIDs. Hence, most OO database systems allow for the representation of both objects and values. Every object must have an immutable OID, whereas a value has no OID and just stands for itself. Hence, a value is typically stored within an object and cannot be referenced from other objects. In some systems, complex structured values can also be created without having a corresponding OID if needed.
20.2.2 Object Structure
In OO databases, the state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors. One formal way of representing such objects is to view each object as a triple (i, c, v), where i is a unique object identifier (the OID), c is a type constructor¹² (that is, an indication of how the object state is constructed), and v is the object state (or current value). The data model will typically include several type constructors. The three most basic constructors are atom, tuple, and set. Other commonly used constructors include list, bag, and array. The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character strings, Booleans, and any other basic data types that the system supports directly.
12. This is different from the constructor operation that is used in C++ and other OOPLs to create new objects.
The object state v of an object (i, c, v) is interpreted based on the constructor c. If c = atom, the state (value) v is an atomic value from the domain of basic values supported by the system. If c = set, the state v is a set of object identifiers {i1, i2, ..., i_n}, which are the OIDs for a set of objects that are typically of the same type. If c = tuple, the state v is a tuple of the form <a1:i1, a2:i2, ..., a_n:i_n>, where each a_j is an attribute name¹³ and each i_j is an OID. If c = list, the value v is an ordered list [i1, i2, ..., i_n] of OIDs of objects of the same type. A list is similar to a set except that the OIDs in a list are ordered, and hence we can refer to the first, second, or ith object in a list. For c = array, the state of the object is a single-dimensional array of object identifiers. The main difference between array and list is that a list can have an arbitrary number of elements whereas an array typically has a maximum size. The difference between set and bag¹⁴ is that all elements in a set must be distinct whereas a bag can have duplicate elements.
This model of objects allows arbitrary nesting of the set, list, tuple, and other constructors. The state of an object that is not of type atom will refer to other objects by their object identifiers. Hence, the only case where an actual value appears is in the state of an object of type atom.¹⁵
The type constructors set, list, array, and bag are called collection types (or bulk types), to distinguish them from basic types and tuple types. The main characteristic of a collection type is that the state of the object will be a collection of objects that may be unordered (such as a set or a bag) or ordered (such as a list or an array). The tuple type constructor is often called a structured type, since it corresponds to the struct construct in the C and C++ programming languages.
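The triple (i, c, v) and its type constructors can be sketched in Python. The class `Obj`, the method `referenced_oids`, the choice of Python containers for each constructor, and the OIDs i30 and i31 are illustrative assumptions, not part of the model itself.

```python
COLLECTION_TYPES = {"set", "list", "bag", "array"}
CONSTRUCTORS = COLLECTION_TYPES | {"atom", "tuple"}


class Obj:
    """An object as a triple (i, c, v): OID, type constructor, state."""

    def __init__(self, oid, constructor, state):
        if constructor not in CONSTRUCTORS:
            raise ValueError(f"unknown type constructor: {constructor}")
        self.oid = oid
        self.constructor = constructor
        self.state = state

    def referenced_oids(self):
        """Return the OIDs that this object's state refers to."""
        if self.constructor == "atom":
            return []                         # the state is the value itself
        if self.constructor == "tuple":
            return list(self.state.values())  # attribute name -> OID
        return list(self.state)               # a collection of OIDs


o1 = Obj("i1", "atom", "Houston")
o7 = Obj("i7", "set", {"i1", "i2", "i3"})                  # distinct, unordered
o8 = Obj("i8", "tuple", {"DNAME": "i5", "DNUMBER": "i4"})  # reduced attribute list
route = Obj("i30", "list", ["i1", "i3", "i2"])             # ordered
stock = Obj("i31", "bag", ["i1", "i1", "i2"])              # duplicates allowed
```

Only the atom object carries an actual value; every other state holds OIDs, which is what allows the constructors to be nested arbitrarily.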
EXAMPLE 1: A Complex Object
We now represent some objects from the relational database shown in Figure 5.6, using the preceding model, where an object is defined by a triple (OID, type constructor, state) and the available type constructors are atom, set, and tuple. We use i1, i2, i3, ... to stand for unique system-generated object identifiers. Consider the following objects:
o1 = (i1, atom, 'Houston')
o2 = (i2, atom, 'Bellaire')
o3 = (i3, atom, 'Sugarland')
o4 = (i4, atom, 5)
o5 = (i5, atom, 'Research')
o6 = (i6, atom, '1988-05-22')
o7 = (i7, set, {i1, i2, i3})
o8 = (i8, tuple, <DNAME:i5, DNUMBER:i4, MGR:i9, LOCATIONS:i7, EMPLOYEES:i10, PROJECTS:i11>)
o9 = (i9, tuple, <MANAGER:i12, MANAGER_START_DATE:i6>)
o10 = (i10, set, {i12, i13, i14})
o11 = (i11, set, {i15, i16, i17})
o12 = (i12, tuple, <FNAME:i18, MINIT:i19, LNAME:i20, SSN:i21, ..., SALARY:i26, SUPERVISOR:i27, DEPT:i8>)
13. Also called an instance variable name in OO terminology.
14. Also called a multiset.
15. As we noted earlier, it is not practical to generate a unique system identifier for every value, so real systems allow for both OIDs and structured values, which can be structured by using the same type constructors as objects, except that a value does not have an OID.
The first six objects (o1-o6) listed here represent atomic values. There will be many similar objects, one for each distinct constant atomic value in the database.¹⁶ Object o7 is a set-valued object that represents the set of locations for department 5; the set {i1, i2, i3} refers to the atomic objects with values {'Houston', 'Bellaire', 'Sugarland'}. Object o8 is a tuple-valued object that represents department 5 itself, and has the attributes DNAME, DNUMBER, MGR, LOCATIONS, and so on. The first two attributes DNAME and DNUMBER have atomic objects o5 and o4 as their values. The MGR attribute has a tuple object o9 as its value, which in turn has two attributes. The value of the MANAGER attribute is the object whose OID is i12, which represents the employee 'John B. Smith' who manages the department, whereas the value of MANAGER_START_DATE is another atomic object whose value is a date. The value of the EMPLOYEES attribute of o8 is a set object with OID = i10, whose value is the set of object identifiers for the employees who work for the DEPARTMENT (objects i12, plus i13 and i14, which are not shown). Similarly, the value of the PROJECTS attribute of o8 is a set object with OID = i11, whose value is the set of object identifiers for the projects that are controlled by department number 5 (objects i15, i16, and i17, which are not shown). The object whose OID = i12 represents the employee 'John B. Smith' with all its atomic attributes (FNAME, MINIT, LNAME, SSN, ..., SALARY, that are referencing the atomic objects i18, i19, i20, i21, ..., i26, respectively (not shown)) plus SUPERVISOR, which references the employee object with OID = i27 (this represents 'James E. Borg' who supervises 'John B. Smith' but is not shown), and DEPT, which references the department object with OID = i8 (this represents department number 5 where 'John B. Smith' works).
16 These atomic objects are the ones that may cause a problem, due to the use of too many object identifiers, if this model is implemented directly.
In this model, an object can be represented as a graph structure that can be constructed by recursively applying the type constructors. The graph representing an object o_i can be constructed by first creating a node for the object o_i itself. The node for o_i is labeled with the OID and the object constructor c. We also create a node in the graph for each basic atomic value. If an object o_i has an atomic value, we draw a directed arc from the node representing o_i to the node representing its basic value. If the object value is constructed, we draw directed arcs from the object node to a node that represents the constructed value. Figure 20.1 shows the graph for the example DEPARTMENT object o8 given earlier.
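The graph construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout (OID mapped to a constructor and a value), the function name `object_graph`, and the sample OIDs and values are all hypothetical, and the sketch draws arcs directly from a constructed object to its component objects instead of going through a separate node for the constructed value.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mini-database: OID -> (constructor, value).
# Tuple values map attribute names to OIDs; set values list member OIDs.
DB: Dict[str, Tuple[str, Any]] = {
    "i1": ("atom", "Houston"),
    "i2": ("atom", "Bellaire"),
    "i3": ("atom", "Sugarland"),
    "i4": ("atom", 5),
    "i7": ("set", ["i1", "i2", "i3"]),
    "i8": ("tuple", {"DNUMBER": "i4", "LOCATIONS": "i7"}),
}


def object_graph(root: str, db: Dict[str, Tuple[str, Any]]):
    """Return (nodes, arcs) for the object graph rooted at OID `root`."""
    nodes: Dict[Any, str] = {}               # node -> label
    arcs: List[Tuple[Any, Any, Any]] = []    # (from, to, attribute or None)
    pending = [root]
    while pending:
        oid = pending.pop()
        if oid in nodes:                     # already visited (shared object)
            continue
        constructor, value = db[oid]
        nodes[oid] = constructor             # object node labeled with its constructor
        if constructor == "atom":
            value_node = ("value", value)    # node for the basic atomic value
            nodes[value_node] = "basic value"
            arcs.append((oid, value_node, None))
        elif constructor == "tuple":
            for attr, ref in value.items():
                arcs.append((oid, ref, attr))
                pending.append(ref)
        else:                                # set or list of OIDs
            for ref in value:
                arcs.append((oid, ref, None))
                pending.append(ref)
    return nodes, arcs


nodes, arcs = object_graph("i8", DB)
```

Because visited OIDs are skipped, an object that is referenced from several places appears as a single node with several incoming arcs, which is what distinguishes a graph from a tree representation.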
The preceding model permits two types of definitions in a comparison of the states of two objects for equality. Two objects are said to have identical states (deep equality) if the graphs representing their states are identical in every respect, including the OIDs at every level. Another, weaker definition of equality is when two objects have equal states (shallow equality). In this case, the graph structures must be the same, and all the corresponding atomic values in the graphs should also be the same. However, some corresponding internal nodes in the two graphs may have objects with different OIDs.
EXAMPLE 2: Identical Versus Equal Objects
An example can illustrate the difference between the two definitions for comparing object states for equality. Consider the following objects o1, o2, o3, o4, o5, and o6:
o1 = (i1, tuple, <a1:i4, a2:i6>)
o2 = (i2, tuple, <a1:i5, a2:i6>)
o3 = (i3, tuple, <a1:i4, a2:i6>)
o4 = (i4, atom, 10)
o5 = (i5, atom, 10)
o6 = (i6, atom, 20)
The objects o1 and o2 have equal states, since their states at the atomic level are the same but the values are reached through distinct objects o4 and o5. However, the states of objects o1 and o3 are identical, even though the objects themselves are not, because they have distinct OIDs. Similarly, although the states of o4 and o5 are identical, the actual objects o4 and o5 are equal but not identical, because they have distinct OIDs.
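The two comparisons in Example 2 can be checked with a small Python sketch. The dictionary encoding and the function names `identical_states` and `equal_states` are assumptions chosen for illustration, and the recursive comparison assumes the object graph contains no cycles.

```python
from typing import Any, Dict, Tuple

# The six objects of Example 2: OID -> (constructor, value).
# Tuple values map attribute names to the OIDs they reference.
DB: Dict[str, Tuple[str, Any]] = {
    "i1": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i2": ("tuple", {"a1": "i5", "a2": "i6"}),
    "i3": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i4": ("atom", 10),
    "i5": ("atom", 10),
    "i6": ("atom", 20),
}


def identical_states(x: str, y: str, db=DB) -> bool:
    """States match exactly, including every OID they reference."""
    return db[x] == db[y]


def equal_states(x: str, y: str, db=DB) -> bool:
    """Same structure and same atomic values; internal OIDs may differ."""
    cx, vx = db[x]
    cy, vy = db[y]
    if cx != cy:
        return False
    if cx == "atom":
        return vx == vy
    if vx.keys() != vy.keys():
        return False
    # Follow the references and compare the component states recursively.
    return all(equal_states(vx[a], vy[a], db) for a in vx)


print(equal_states("i1", "i2"))      # True:  o1 and o2 have equal states
print(identical_states("i1", "i2"))  # False: they reach 10 through i4 vs. i5
print(identical_states("i1", "i3"))  # True:  o1 and o3 have identical states
```

Note that the OIDs of the two objects being compared never enter either test; only their states do, which is why o1 and o3 can have identical states while remaining distinct objects.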
20.2.3 Type Constructors
An object definition language (ODL)17 that incorporates the preceding type constructors can be used to define the object types for a particular database application. In Chapter 21, we shall describe the standard ODL of ODMG, but we first introduce the concepts gradually in this section using a simpler notation. The type constructors can be used to define the data structures for an OO database schema. In Section 20.3 we will see how to incorporate the definition of operations (or methods) into the OO schema. Figure 20.2 shows how we may declare Employee and Department types corresponding to the object instances shown in Figure 20.1. In Figure 20.2, the Date type is defined as a tuple rather than an atomic value as in Figure 20.1. We use the keywords tuple, set, and list for the type constructors, and the available standard data types (integer, string, float, and so on) for atomic types.
17 This would correspond to the DDL (Data Definition Language) of the database system (see Chapter 2).
FIGURE 20.1 Representation of a DEPARTMENT complex object as a graph.
FIGURE 20.2 Specifying the object types Employee, Date, and Department using type constructors.
Attributes that refer to other objects, such as dept of Employee or projects of Department, are basically references to other objects and hence serve to represent relationships among the object types. For example, the attribute dept of Employee is of type Department, and hence is used to refer to a specific Department object (where the Employee works). The value of such an attribute would be an OID for a specific Department object. A binary relationship can be represented in one direction, or it can have an inverse reference. The latter representation makes it easy to traverse the relationship in both directions. For example, the attribute employees of Department has as its value a set of references (that is, a set of OIDs) to objects of type Employee; these are the employees who work for the department. The inverse is the reference attribute dept of Employee. We will see in Chapter 21 how the ODMG standard allows inverses to be explicitly declared as relationship attributes to ensure that inverse references are consistent.
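To make the idea of inverse references concrete, here is a hedged Python sketch. It is not the notation of Figure 20.2: the classes keep only a few attributes, Python object identity stands in for OIDs, and the helper `assign` is a hypothetical function that keeps the dept and employees references consistent by hand, which is the job that declared inverses would hand over to the system.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass(eq=False)  # eq=False: compare and hash by identity, like an OID
class Department:
    dname: str
    dnumber: int
    employees: Set[Employee] = field(default_factory=set)  # set of references


@dataclass(eq=False)
class Employee:
    fname: str
    lname: str
    dept: Optional[Department] = None  # reference to a Department object


def assign(emp: Employee, dept: Department) -> None:
    """Set emp.dept and update both departments' employees sets together."""
    if emp.dept is not None:
        emp.dept.employees.discard(emp)
    emp.dept = dept
    dept.employees.add(emp)


research = Department("Research", 5)
admin = Department("Administration", 4)   # hypothetical second department
smith = Employee("John", "Smith")

assign(smith, research)
assign(smith, admin)   # moving the employee updates both directions
```

After the second call, the relationship can be traversed from either side: `smith.dept` is `admin`, `admin.employees` contains `smith`, and `research.employees` is empty again.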
METHODS, AND PERSISTENCE
The concept of encapsulation is one of the main characteristics of OO languages and systems. It is also related to the concepts of abstract data types and information hiding in programming languages. In traditional database models and systems, this concept was not applied, since it is customary to make the structure of database objects visible to users and external programs. In these traditional models, a number of standard database operations are applicable to objects of all types. For example, in the relational model, the operations for selecting, inserting, deleting, and modifying tuples are generic and may be applied to any relation in the database. The relation and its attributes are visible to users and to external programs that access the relation by using these operations.
20.3.1 Specifying Object Behavior via Class Operations
The concepts of information hiding and encapsulation can be applied to database objects. The main idea is to define the behavior of a type of object based on the operations that can be externally applied to objects of that type. The internal structure of the object is hidden, and the object is accessible only through a number of predefined operations. Some operations may be used to create (insert) or destroy (delete) objects; other operations may update the object state; and others may be used to retrieve parts of the object state or to apply some calculations. Still other operations may perform a combination of retrieval, calculation, and update. In general, the implementation of an operation can be specified in a general-purpose programming language that provides flexibility and power in defining the operations.
The external users of the object are only made aware of the interface of the object type, which defines the name and arguments (parameters) of each operation. The implementation is hidden from the external users; it includes the definition of the internal data structures of the object and the implementation of the operations that access these structures. In OO terminology, the interface part of each operation is called the signature, and the operation implementation is called a method. Typically, a method is invoked by sending a message to the object to execute the corresponding method. Notice that, as part of executing a method, a subsequent message to another object may be sent, and this mechanism may be used to return values from the objects to the external environment or to other objects.
For database applications, the requirement that all objects be completely encapsulated is too stringent. One way of relaxing this requirement is to divide the structure of an object into visible and hidden attributes (instance variables). Visible attributes may be directly accessed for reading by external operators, or by a high-level query language. The hidden attributes of an object are completely encapsulated and can be accessed only through predefined operations. Most OODBMSs employ high-level query languages for accessing visible attributes. In Chapter 21, we will describe the OQL query language that is proposed as a standard query language for OODBs.
In most cases, operations that update the state of an object are encapsulated. This is a way of defining the update semantics of the objects, given that in many OO data models, few integrity constraints are predefined in the schema. Each type of object has its integrity constraints programmed into the methods that create, delete, and update the objects by explicitly writing code to check for constraint violations and to handle exceptions. In such cases, all update operations are implemented by encapsulated operations. More recently, the ODL for the ODMG standard allows the specification of some common constraints such as keys and inverse relationships (referential integrity) so that the system can automatically enforce these constraints (see Chapter 21).
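The split between visible attributes, hidden attributes, and encapsulated updates can be illustrated with a short Python sketch. Everything here is an assumption made for the example: the attribute and method names, the sample data, and the non-negative salary constraint are hypothetical, and Python hides attributes only by naming convention (a leading underscore), not by enforcement.

```python
from datetime import date


class Employee:
    def __init__(self, name: str, birthdate: date, salary: float) -> None:
        self._name = name
        self._birthdate = birthdate   # hidden: reachable only through age()
        self._salary = 0.0
        self.set_salary(salary)       # creation goes through the same check

    @property
    def name(self) -> str:
        """Visible attribute: readable from outside, but not assignable."""
        return self._name

    @property
    def salary(self) -> float:
        """Visible for reading; updates must go through set_salary()."""
        return self._salary

    def set_salary(self, amount: float) -> None:
        """Encapsulated update that checks the integrity constraint itself."""
        if amount < 0:
            raise ValueError("salary must be non-negative")
        self._salary = float(amount)

    def age(self, today: date) -> int:
        """Retrieval operation computed from the hidden birthdate."""
        years = today.year - self._birthdate.year
        if (today.month, today.day) < (self._birthdate.month, self._birthdate.day):
            years -= 1
        return years


emp = Employee("John B. Smith", date(1965, 1, 9), 30000)
```

Since `salary` is a property without a setter, a direct assignment such as `emp.salary = -5` raises `AttributeError`, so the only way to change the value is the operation that enforces the constraint.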
The term class is often used to refer to an object type definition, along with the definitions of the operations for that type.18 Figure 20.3 shows how the type definitions of Figure 20.2 may be extended with operations to define classes. A number of operations are declared for each class, and the signature (interface) of each operation is included in the class definition. A method (implementation) for each operation must be defined elsewhere, using a programming language. Typical operations include the object constructor operation, which is used to create a new object, and the destructor operation, which is used to destroy an object. A number of object modifier operations can also be declared to modify the states (values) of various attributes of an object. Additional operations can retrieve information about the object.
18 This definition of class is similar to how it is used in the popular C++ programming language. The ODMG standard uses the word interface in addition to class (see Chapter 21). In the EER model, the term class was used to refer to an object type, along with the set of all objects of that type (see Chapter 4).
define class Employee:
    type tuple ( fname:
define class Department
    type tuple ( dname: string;
    assign_emp(e: Employee): boolean;
        (* adds an employee to the department *)
    remove_emp(e: Employee): boolean;
        (* removes an employee from the department *)
end Department;
FIGURE 20.3 Adding operations to the definitions of Employee and Department.
An operation is typically applied to an object by using the dot notation. For example, if d is a reference to a department object, we can invoke an operation such as no_of_emps by writing d.no_of_emps. Similarly, by writing d.destroy_dept, the object referenced by d is destroyed (deleted). The only exception is the constructor operation, which returns a reference to a new Department object. Hence, it is customary to have a default name for the constructor operation that is the name of the class itself, although this was not used in Figure 20.3.19 The dot notation is also used to refer to attributes of an object, for example, by writing d.dnumber or d.mgr.startdate.
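For readers more used to a programming language than to the notation of Figure 20.3, the following Python sketch shows the same ideas: operations invoked with dot notation, and a constructor named after the class. It is an illustration only. The method bodies are assumptions (the figure declares signatures, not implementations), the boolean results are assumed to mean "the operation changed something", and Python writes a call as `d.no_of_emps()` with parentheses.

```python
class Employee:
    def __init__(self, ssn: str) -> None:
        self.ssn = ssn


class Department:
    def __init__(self, dname: str, dnumber: int) -> None:
        # Constructor: invoked through the class name, Department(...).
        self.dname = dname
        self.dnumber = dnumber
        self._employees: set = set()

    def no_of_emps(self) -> int:
        """Retrieval operation: number of employees in the department."""
        return len(self._employees)

    def assign_emp(self, e: Employee) -> bool:
        """Adds an employee; returns False if already assigned."""
        if e in self._employees:
            return False
        self._employees.add(e)
        return True

    def remove_emp(self, e: Employee) -> bool:
        """Removes an employee; returns False if not assigned."""
        if e not in self._employees:
            return False
        self._employees.remove(e)
        return True


d = Department("Research", 5)   # the constructor returns a reference
e = Employee("123456789")       # hypothetical SSN
d.assign_emp(e)                 # operation applied with dot notation
print(d.dnumber, d.no_of_emps())
```

Python has no counterpart to an explicit `d.destroy_dept` call; an object is reclaimed once no references to it remain, so a persistent store built this way would need its own delete operation.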
20.3.2 Specifying Object Persistence via Naming
The naming mechanism, the first of the two mechanisms for making an object persistent, involves giving an object a unique persistent name through which it can be retrieved by this and other programs. This persistent object name can be given via a specific statement or operation in the program, as illustrated in Figure 20.4. All such names given to objects must be unique within a particular database. Hence, the named persistent objects are used as entry points to the database through which users and applications can start their database access. Obviously, it is not practical to give names to all objects in a large database that includes thousands of objects, so most objects are made persistent by using the second mechanism, called reachability. The reachability mechanism works by making the object reachable from some persistent object. An object B is said to be reachable from an object A if a sequence of references in the object graph leads from object A to object B. For example, all the objects in Figure 20.1 are reachable from object o8; hence, if o8 is made persistent, all the other objects in Figure 20.1 also become persistent.
If we first create a named persistent object N, whose state is a set or list of objects of some class C, we can make objects of C persistent by adding them to the set or list, and thus making them reachable from N. Hence, N defines a persistent collection of objects of class C. For example, we can define a class DepartmentSet (see Figure 20.4) whose objects are of type set(Department).20 Suppose that an object of type DepartmentSet is created, and suppose that it is named AllDepartments and thus made persistent, as illustrated in Figure 20.4. Any Department object that is added to the set of AllDepartments by using the add_dept operation becomes persistent by virtue of its being reachable from AllDepartments. The AllDepartments object is often called the extent of the class Department, as it will hold all persistent objects of type Department. As we shall see in Chapter 21, the ODMG ODL standard gives the schema designer the option of naming an extent as part of class definition.
19 Default names for the constructor and destructor operations exist in the C++ programming language. For example, for class Employee, the default constructor name is Employee and the default destructor name is ~Employee. It is also common to use the new operation to create new objects.
20 As we shall see in Chapter 21, the ODMG ODL syntax uses set<Department> instead of set(Department).
define class DepartmentSet:
    type set(Department);
    operations add_dept(d: Department): boolean;
            (* adds a department to the DepartmentSet object *)
        remove_dept(d: Department): boolean;
            (* removes a department from the DepartmentSet object *)
        create_dept_set: DepartmentSet;
        destroy_dept_set: boolean;
end DepartmentSet;
persistent name AllDepartments: DepartmentSet;
(* AllDepartments is a persistent named object of type DepartmentSet *)
d := create_dept;
(* create a new Department object in the variable d *)
b := AllDepartments.add_dept(d);
(* make d persistent by adding it to the persistent set AllDepartments *)
FIGURE 20.4 Creating persistent objects by naming and reachability.
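Persistence by reachability amounts to a graph traversal that starts at the named objects. The Python sketch below shows one way to compute the set of persistent objects. It is a simplified illustration, not how any particular OODBMS works: the function name `persistent_objects`, the adjacency-list encoding of references, and the object labels other than AllDepartments are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def persistent_objects(named_roots: Iterable[str],
                       refs: Dict[str, List[str]]) -> Set[str]:
    """Return every object reachable from the named persistent objects."""
    seen: Set[str] = set()
    queue = deque(named_roots)
    while queue:
        obj = queue.popleft()
        if obj in seen:          # inverse references create cycles
            continue
        seen.add(obj)
        queue.extend(refs.get(obj, []))
    return seen


# Each object is listed with the objects it references.
refs = {
    "AllDepartments": ["d1"],    # named entry point holding a set of departments
    "d1": ["e1", "e2"],
    "e1": ["d1"],                # inverse reference back to the department
    "e2": ["d1"],
    "d2": ["e3"],                # created, but never added to AllDepartments
    "e3": ["d2"],
}

before = persistent_objects(["AllDepartments"], refs)
refs["AllDepartments"].append("d2")    # the analogue of add_dept(d2)
after = persistent_objects(["AllDepartments"], refs)
```

In this sketch `before` contains AllDepartments, d1, e1, and e2, while d2 and e3 are transient. After d2 is added to the named set, `after` also contains d2 and e3, since e3 is reachable through d2.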
Notice the difference between traditional database models and OO databases in this respect. In traditional database models, such as the relational model or the EER model, all objects are assumed to be persistent. Hence, when an entity type or class such as EMPLOYEE is defined in the EER model, it represents both the type declaration for EMPLOYEE and a persistent set of all EMPLOYEE objects. In the OO approach, a class declaration of EMPLOYEE specifies only the type and operations for a class of objects. The user must separately define a persistent object of type set(EMPLOYEE) or list(EMPLOYEE) whose value is the collection of references to all persistent EMPLOYEE objects, if this is desired, as illustrated in Figure 20.4.21 This allows transient and persistent objects to follow the same type and class declarations of the ODL and the OOPL. In general, it is possible to define several persistent collections for the same class definition, if desired.
21 Some systems, such as automatically create the extent for a class