DATABASE SYSTEMS (Part 17)


19.4 How are buffering and caching techniques used by the recovery subsystem?

19.5 What are the before image (BFIM) and after image (AFIM) of a data item? What is the difference between in-place updating and shadowing, with respect to their handling of BFIM and AFIM?

19.6 What are UNDO-type and REDO-type log entries?

19.7 Describe the write-ahead logging protocol.

19.8 Identify three typical lists of transactions that are maintained by the recovery subsystem.

19.9 What is meant by transaction rollback? What is meant by cascading rollback? Why do practical recovery methods use protocols that do not permit cascading rollback? Which recovery techniques do not require any rollback?

19.10 Discuss the UNDO and REDO operations and the recovery techniques that use each.

19.11 Discuss the deferred update technique of recovery. What are the advantages and disadvantages of this technique? Why is it called the NO-UNDO/REDO method?

19.12 How can recovery handle transaction operations that do not affect the database, such as the printing of reports by a transaction?

19.13 Discuss the immediate update recovery technique in both single-user and multiuser environments. What are the advantages and disadvantages of immediate update?

19.14 What is the difference between the UNDO/REDO and the UNDO/NO-REDO algorithms for recovery with immediate update? Develop the outline for an UNDO/NO-REDO algorithm.

19.15 Describe the shadow paging recovery technique. Under what circumstances does it not require a log?

19.16 Describe the three phases of the ARIES recovery method.

19.17 What are log sequence numbers (LSNs) in ARIES? How are they used? What information do the Dirty Page Table and Transaction Table contain? Describe how fuzzy checkpointing is used in ARIES.

19.18 What do the terms steal/no-steal and force/no-force mean with regard to buffer management for transaction processing?

19.19 Describe the two-phase commit protocol for multidatabase transactions.

19.20 Discuss how recovery from catastrophic failures is handled.


19.21 Suppose that the system crashes before the [read_item, T3, A] entry is written to the log in Figure 19.1b. Will that make any difference in the recovery process?

19.22 Suppose that the system crashes before the [write_item, T2, D, 25, 26] entry is written to the log in Figure 19.1b. Will that make any difference in the recovery process?

19.23 Figure 19.7 shows the log corresponding to a particular schedule at the point of a system crash for four transactions T1, T2, T3, and T4. Suppose that we use the immediate update protocol with checkpointing. Describe the recovery process from the system crash. Specify which transactions are rolled back, which operations in the log are redone and which (if any) are undone, and whether any cascading rollback takes place.

19.24 Suppose that we use the deferred update protocol for the example in Figure 19.7. Show how the log would be different in the case of deferred update by removing the unnecessary log entries; then describe the recovery process, using your modified log. Assume that only REDO operations are applied, and specify which operations in the log are redone and which are ignored.

19.25 How does checkpointing in ARIES differ from checkpointing as described in Section 19.1.4?

19.26 How are log sequence numbers used by ARIES to reduce the amount of REDO work needed for recovery? Illustrate with an example using the information shown in Figure 19.6. You can make your own assumptions as to when a page is written to disk.

[write_item, T2, D, 15, 25] ← system crash

FIGURE 19.7 An example schedule and its corresponding log.


19.27 What implications would a no-steal/force buffer management policy have on checkpointing and recovery?

Choose the correct answer for each of the following multiple-choice questions:

19.28 Incremental logging with deferred updates implies that the recovery system must necessarily

a store the old value of the updated item in the log

b store the new value of the updated item in the log

c store both the old and new value of the updated item in the log

d store only the Begin Transaction and Commit Transaction records in the log.

19.29 The write ahead logging (WAL) protocol simply means that

a the writing of a data item should be done ahead of any logging operation

b the log record for an operation should be written before the actual data is written

c all log records should be written before a new transaction begins execution

d the log never needs to be written to disk

19.30 In case of transaction failure under a deferred update incremental logging scheme, which of the following will be needed:

a an undo operation

b a redo operation

c an undo and redo operation

d none of the above

19.31 For incremental logging with immediate updates, a log record for a transaction would contain:

a a transaction name, data item name, old value of item, new value of item

b a transaction name, data item name, old value of item

c a transaction name, data item name, new value of item

d a transaction name and a data item name

19.32 For correct behavior during recovery, undo and redo operations must be

a searching the entire log is time consuming

b many redo's are unnecessary

c both (a) and (b)

d none of the above

19.34 When using a log based recovery scheme, it might improve performance as well as providing a recovery mechanism by

a writing the log records to disk when each transaction commits

b writing the appropriate log records to disk during the transaction's execution

c waiting to write the log records until multiple transactions commit and writing them as a batch

d never writing the log records to disk


19.35 There is a possibility of a cascading rollback when

a a transaction writes items that have been written only by a committed

19.36 To cope with media (disk) failures, it is necessary

a for the DBMS to only execute transactions in a single user environment

b to keep a redundant copy of the database

c to never abort a transaction

d all of the above

19.37 If the shadowing approach is used for flushing a data item back to disk, then

a the item is written to disk only after the transaction commits

b the item is written to a different location on disk

c the item is written to disk before the transaction commits

d the item is written to the same disk location from which it was read

Selected Bibliography

The books by Bernstein et al. (1987) and Papadimitriou (1986) are devoted to the theory and principles of concurrency control and recovery. The book by Gray and Reuter (1993) is an encyclopedic work on concurrency control, recovery, and other transaction-processing issues.

Verhofstad (1978) presents a tutorial and survey of recovery techniques in database systems. Categorizing algorithms based on their UNDO/REDO characteristics is discussed in Haerder and Reuter (1983) and in Bernstein et al. (1983). Gray (1978) discusses recovery, along with other system aspects of implementing operating systems for databases. The shadow paging technique is discussed in Lorie (1977), Verhofstad (1978), and Reuter (1980). Gray et al. (1981) discuss the recovery mechanism in SYSTEM R. Lockeman and Knutsen (1968), Davies (1972), and Bjork (1973) are early papers that discuss recovery. Chandy et al. (1975) discuss transaction rollback. Lilien and Bhargava (1985) discuss the concept of integrity block and its use to improve the efficiency of recovery.

Recovery using write-ahead logging is analyzed in Jhingran and Khedkar (1992) and is used in the ARIES system (Mohan et al. 1992a). More recent work on recovery includes compensating transactions (Korth et al. 1990) and main memory database recovery (Kumar 1991). The ARIES recovery algorithms (Mohan et al. 1992) have been quite successful in practice. Franklin et al. (1992) discusses recovery in the EXODUS system. Two recent books by Kumar and Hsu (1998) and Kumar and Son (1998) discuss recovery in detail and contain descriptions of recovery methods used in a number of existing relational database products.


OBJECT-RELATIONAL DATABASES


Object Databases

In this chapter and the next, we discuss object-oriented data models and database systems.¹ Traditional data models and systems, such as relational, network, and hierarchical, have been quite successful in developing the database technology required for many traditional business database applications. However, they have certain shortcomings when more complex database applications must be designed and implemented, for example, databases for engineering design and manufacturing (CAD/CAM and CIM²), scientific experiments, telecommunications, geographic information systems, and multimedia.³ These newer applications have requirements and characteristics that differ from those of traditional business applications, such as more complex structures for objects, longer-duration transactions, new data types for storing images or large textual items, and the need to define nonstandard application-specific operations. Object-oriented databases were proposed to meet the needs of these more complex applications. The object-oriented approach offers the flexibility to handle some of these requirements without

1. These databases are often referred to as Object Databases and the systems are referred to as Object Database Management Systems (ODBMS). However, because this chapter discusses many general object-oriented concepts, we will use the term object-oriented instead of just object.

2. Computer-Aided Design/Computer-Aided Manufacturing and Computer-Integrated Manufacturing.

3. Multimedia databases must store various types of multimedia objects, such as video, audio, images, graphics, and documents (see Chapter 24).


being limited by the data types and query languages available in traditional database systems. A key feature of object-oriented databases is the power they give the designer to specify both the structure of complex objects and the operations that can be applied to these objects.

Another reason for the creation of object-oriented databases is the increasing use of object-oriented programming languages in developing software applications. Databases are now becoming fundamental components in many software systems, and traditional databases were difficult to use with object-oriented software applications that are developed in an object-oriented programming language such as C++, SMALLTALK, or JAVA. Object-oriented databases are designed so they can be directly (or seamlessly) integrated with software that is developed using object-oriented programming languages.

The need for additional data modeling features has also been recognized by relational DBMS vendors, and newer versions of relational systems are incorporating many of the features that were proposed for object-oriented databases. This has led to systems that are characterized as object-relational or extended relational DBMSs (see Chapter 22). The latest version of the SQL standard for relational DBMSs includes some of these features.

Although many experimental prototypes and commercial object-oriented database systems have been created, they have not found widespread use because of the popularity of relational and object-relational systems. The experimental prototypes included the ORION system developed at MCC,⁴ OPENOODB at Texas Instruments, the IRIS system at Hewlett-Packard laboratories, the ODE system at AT&T Bell Labs,⁵ and the ENCORE/ObServer project at Brown University. Commercially available systems included GEMSTONE/OPAL of GemStone Systems, ONTOS of Ontos, Objectivity of Objectivity Inc., Versant of Versant Object Technology, ObjectStore of Object Design, ARDENT of ARDENT Software,⁶ and POET of POET Software. These represent only a partial list of the experimental prototypes and commercial object-oriented database systems that were created.

As commercial object-oriented DBMSs became available, the need for a standard model and language was recognized. Because the formal procedure for approval of standards normally takes a number of years, a consortium of object-oriented DBMS vendors and users, called ODMG,⁷ proposed a standard that is known as the ODMG-93 standard, which has since been revised. We will describe some features of the ODMG standard in Chapter 21.

Object-oriented databases have adopted many of the concepts that were developed originally for object-oriented programming languages.⁸ In Section 20.1, we examine the origins of the object-oriented approach and discuss how it applies to database systems. Then, in Sections 20.2 through 20.6, we describe the key concepts utilized in many object-

4. Microelectronics and Computer Technology Corporation, Austin, Texas.

5. Now called Lucent Technologies.

6. Formerly O2 of O2 Technology.

7. Object Database Management Group.

8. Similar concepts were also developed in the fields of semantic data modeling and knowledge representation.


oriented database systems. Section 20.2 discusses object identity, object structure, and type constructors. Section 20.3 presents the concepts of encapsulation of operations and definition of methods as part of class declarations, and also discusses the mechanisms for storing objects in a database by making them persistent. Section 20.4 describes type and class hierarchies and inheritance in object-oriented databases, and Section 20.5 provides an overview of the issues that arise when complex objects need to be represented and stored. Section 20.6 discusses additional concepts, including polymorphism, operator overloading, dynamic binding, multiple and selective inheritance, and versioning and configuration of objects.

This chapter presents the general concepts of object-oriented databases, whereas Chapter 21 will present the ODMG standard. The reader may skip Sections 20.5 and 20.6 of this chapter if a less detailed introduction to the topic is desired.

CONCEPTS

This section gives a quick overview of the history and main concepts of object-oriented databases, or OODBs for short. The OODB concepts are then explained in more detail in Sections 20.2 through 20.6. The term object-oriented (abbreviated OO or O-O) has its origins in OO programming languages, or OOPLs. Today OO concepts are applied in the areas of databases, software engineering, knowledge bases, artificial intelligence, and computer systems in general. OOPLs have their roots in the SIMULA language, which was proposed in the late 1960s. In SIMULA, the concept of a class groups together the internal data structure of an object in a class declaration. Subsequently, researchers proposed the concept of abstract data type, which hides the internal data structures and specifies all possible external operations that can be applied to an object, leading to the concept of encapsulation. The programming language SMALLTALK, developed at Xerox PARC⁹ in the 1970s, was one of the first languages to explicitly incorporate additional OO concepts, such as message passing and inheritance. It is known as a pure OO programming language, meaning that it was explicitly designed to be object-oriented. This contrasts with hybrid OO programming languages, which incorporate OO concepts into an already existing language. An example of the latter is C++, which incorporates OO concepts into the popular C programming language.

An object typically has two components: state (value) and behavior (operations). Hence, it is somewhat similar to a program variable in a programming language, except that it will typically have a complex data structure as well as specific operations defined by the programmer.¹⁰ Objects in an OOPL exist only during program execution and are hence called transient objects. An OO database can extend the existence of objects so that they are stored permanently, and hence the objects persist beyond program termination and can be retrieved later and shared by other programs. In other words, OO databases store

9. Palo Alto Research Center, Palo Alto, California.

10. Objects have many other characteristics, as we discuss in the rest of this chapter.


persistent objects permanently on secondary storage, and allow the sharing of these objects among multiple programs and applications. This requires the incorporation of other well-known features of database management systems, such as indexing mechanisms, concurrency control, and recovery. An OO database system interfaces with one or more OO programming languages to provide persistent and shared object capabilities.

One goal of OO databases is to maintain a direct correspondence between real-world and database objects so that objects do not lose their integrity and identity and can easily be identified and operated upon. Hence, OO databases provide a unique system-generated object identifier (OID) for each object. We can compare this with the relational model, where each relation must have a primary key attribute whose value identifies each tuple uniquely. In the relational model, if the value of the primary key is changed, the tuple will have a new identity, even though it may still represent the same real-world object. Alternatively, a real-world object may have different names for key attributes in different relations, making it difficult to ascertain that the keys represent the same object (for example, the object identifier may be represented as EMP_ID in one relation and as SSN in another).

Another feature of OO databases is that objects may have an object structure of arbitrary complexity in order to contain all of the necessary information that describes the object. In contrast, in traditional database systems, information about a complex object is often scattered over many relations or records, leading to loss of direct correspondence between a real-world object and its database representation.

The internal structure of an object in OOPLs includes the specification of instance variables, which hold the values that define the internal state of the object. Hence, an instance variable is similar to the concept of an attribute in the relational model, except that instance variables may be encapsulated within the object and thus are not necessarily visible to external users. Instance variables may also be of arbitrarily complex data types. Object-oriented systems allow definition of the operations or functions (behavior) that can be applied to objects of a particular type. In fact, some OO models insist that all operations a user can apply to an object must be predefined. This forces a complete encapsulation of objects. This rigid approach has been relaxed in most OO data models for several reasons. First, the database user often needs to know the attribute names so they can specify selection conditions on the attributes to retrieve specific objects. Second, complete encapsulation implies that any simple retrieval requires a predefined operation, thus making ad hoc queries difficult to specify on the fly.

To encourage encapsulation, an operation is defined in two parts. The first part, called the signature or interface of the operation, specifies the operation name and arguments (or parameters). The second part, called the method or body, specifies the implementation of the operation. Operations can be invoked by passing a message to an object, which includes the operation name and the parameters. The object then executes the method for that operation. This encapsulation permits modification of the internal structure of an object, as well as the implementation of its operations, without the need to disturb the external programs that invoke these operations. Hence, encapsulation provides a form of data and operation independence (see Chapter 2).
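The split between signature and method can be illustrated with a minimal Python sketch. The names EmployeeInterface, Employee, and send are hypothetical and chosen only for illustration; Python's abstract base classes stand in for the signature part, and a small helper imitates message passing by looking up the operation name on the receiving object.

```python
from abc import ABC, abstractmethod


class EmployeeInterface(ABC):
    """Signature (interface): operation names and parameters only."""

    @abstractmethod
    def raise_salary(self, percent: float) -> None: ...

    @abstractmethod
    def annual_salary(self) -> float: ...


class Employee(EmployeeInterface):
    """Method (body): one possible implementation behind the signature."""

    def __init__(self, name: str, monthly_salary: float) -> None:
        # Instance variables; the leading underscore marks them as internal.
        self._name = name
        self._monthly = monthly_salary

    def raise_salary(self, percent: float) -> None:
        self._monthly *= 1 + percent / 100

    def annual_salary(self) -> float:
        return 12 * self._monthly


def send(obj, operation: str, *args):
    """Imitate message passing: an operation name plus its parameters."""
    return getattr(obj, operation)(*args)


e = Employee("Smith", 2500.0)
send(e, "raise_salary", 10)
total = send(e, "annual_salary")
```

If Employee later stored a yearly figure instead of a monthly one, callers that use only send and the two operation names would not need to change, which is the independence the paragraph describes.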

Another key concept in OO systems is that of type and class hierarchies and inheritance. This permits specification of new types or classes that inherit much of their structure and/or operations from previously defined types or classes. Hence, specification of object types can


proceed systematically. This makes it easier to develop the data types of a system incrementally, and to reuse existing type definitions when creating new types of objects.

One problem in early OO database systems involved representing relationships among objects. The insistence on complete encapsulation in early OO data models led to the argument that relationships should not be explicitly represented, but should instead be described by defining appropriate methods that locate related objects. However, this approach does not work very well for complex databases with many relationships, because it is useful to identify these relationships and make them visible to users. The ODMG standard has recognized this need and it explicitly represents binary relationships via a pair of inverse references, that is, by placing the OIDs of related objects within the objects themselves, and maintaining referential integrity, as we shall describe in Chapter 21.
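A pair of inverse references can be sketched as follows. This is an illustrative Python sketch, not ODMG syntax: the class and attribute names are assumptions, and ordinary Python object references stand in for OIDs. The setter updates both directions together so the two references never disagree.

```python
class Department:
    def __init__(self, name):
        self.name = name
        self.employees = set()      # inverse of Employee.works_for


class Employee:
    def __init__(self, name):
        self.name = name
        self._works_for = None      # inverse of Department.employees

    @property
    def works_for(self):
        return self._works_for

    @works_for.setter
    def works_for(self, dept):
        # Remove the old back-reference before installing the new one,
        # so both sides of the relationship stay consistent.
        if self._works_for is not None:
            self._works_for.employees.discard(self)
        self._works_for = dept
        if dept is not None:
            dept.employees.add(self)


research = Department("Research")
administration = Department("Administration")
smith = Employee("Smith")
smith.works_for = research
smith.works_for = administration    # moving Smith updates both departments
```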

Some OO systems provide capabilities for dealing with multiple versions of the same object, a feature that is essential in design and engineering applications. For example, an old version of an object that represents a tested and verified design should be retained until the new version is tested and verified. A new version of a complex object may include only a few new versions of its component objects, whereas other components remain unchanged. In addition to permitting versioning, OO databases should also allow for schema evolution, which occurs when type declarations are changed or when new types or relationships are created. These two features are not specific to OODBs and should ideally be included in all types of DBMSs.¹¹

Another OO concept is operator overloading, which refers to an operation's ability to be applied to different types of objects; in such a situation, an operation name may refer to several distinct implementations, depending on the type of objects it is applied to. This feature is also called operator polymorphism. For example, an operation to calculate the area of a geometric object may differ in its method (implementation), depending on whether the object is of type triangle, circle, or rectangle. This may require the use of late binding of the operation name to the appropriate method at run-time, when the type of object to which the operation is applied becomes known.
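The area example can be written out as a short Python sketch. The class names and constructor parameters are assumptions made for illustration. The single operation name area is bound to a different method for each type, and the binding happens at run time inside the loop, when the type of each object is known.

```python
import math


class GeometricObject:
    def area(self) -> float:
        raise NotImplementedError


class Triangle(GeometricObject):
    def __init__(self, base: float, height: float) -> None:
        self.base, self.height = base, height

    def area(self) -> float:
        return 0.5 * self.base * self.height


class Circle(GeometricObject):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


class Rectangle(GeometricObject):
    def __init__(self, width: float, height: float) -> None:
        self.width, self.height = width, height

    def area(self) -> float:
        return self.width * self.height


shapes = [Triangle(6, 2), Circle(1), Rectangle(3, 4)]
# The same operation name selects a different implementation per object.
areas = [shape.area() for shape in shapes]
```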

This section provided an overview of the main concepts of OO databases. In Sections 20.2 through 20.6, we discuss these concepts in more detail.

20.2 OBJECT IDENTITY, OBJECT STRUCTURE, AND TYPE CONSTRUCTORS

In this section we first discuss the concept of object identity, and then we present the typical structuring operations for defining the structure of the state of an object. These structuring operations are often called type constructors. They define basic data-structuring operations that can be combined to form complex object structures.

11. Several schema evolution operations, such as ALTER TABLE, are already defined in the relational standard (see Section 8.3).


20.2.1 Object Identity

An OO database system provides a unique identity to each independent object stored in the database. This unique identity is typically implemented via a unique, system-generated object identifier, or OID. The value of an OID is not visible to the external user, but it is used internally by the system to identify each object uniquely and to create and manage inter-object references. The OID can be assigned to program variables of the appropriate type when needed.

The main property required of an OID is that it be immutable; that is, the OID value of a particular object should not change. This preserves the identity of the real-world object being represented. Hence, an OO database system must have some mechanism for generating OIDs and preserving the immutability property. It is also desirable that each OID be used only once; that is, even if an object is removed from the database, its OID should not be assigned to another object. These two properties imply that the OID should not depend on any attribute values of the object, since the value of an attribute may be changed or corrected. It is also generally considered inappropriate to base the OID on the physical address of the object in storage, since the physical address can change after a physical reorganization of the database. However, some systems do use the physical address as OID to increase the efficiency of object retrieval. If the physical address of the object changes, an indirect pointer can be placed at the former address, which gives the new physical location of the object. It is more common to use long integers as OIDs and then to use some form of hash table to map the OID value to the current physical address of the object in storage.
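The last approach, integer OIDs plus a hash table, can be sketched in a few lines of Python. ObjectTable and its method names are hypothetical, and the "physical address" is just a string here. The counter only moves forward, so an OID is never changed and never handed out twice, while the table entry can be updated freely when an object moves.

```python
import itertools


class ObjectTable:
    """Maps immutable integer OIDs to the current physical address."""

    def __init__(self):
        self._next_oid = itertools.count(1)  # monotonically increasing: no reuse
        self._address_of = {}                # hash table: OID -> current address

    def create(self, address):
        oid = next(self._next_oid)
        self._address_of[oid] = address
        return oid

    def relocate(self, oid, new_address):
        # Physical reorganization changes the address, never the OID.
        self._address_of[oid] = new_address

    def delete(self, oid):
        # The OID is retired; the counter guarantees it is not assigned again.
        del self._address_of[oid]

    def locate(self, oid):
        return self._address_of[oid]


table = ObjectTable()
a = table.create("page 7, slot 2")
b = table.create("page 7, slot 3")
table.relocate(a, "page 9, slot 0")
moved_to = table.locate(a)
table.delete(a)
c = table.create("page 7, slot 2")   # reuses the storage slot, not the OID
```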

Some early OO data models required that everything, from a simple value to a complex object, be represented as an object; hence, every basic value, such as an integer, string, or Boolean value, has an OID. This allows two basic values to have different OIDs, which can be useful in some cases. For example, the integer value 50 can be used sometimes to mean a weight in kilograms and at other times to mean the age of a person. Then, two basic objects with distinct OIDs could be created, but both objects would represent the integer value 50. Although useful as a theoretical model, this is not very practical, since it may lead to the generation of too many OIDs. Hence, most OO database systems allow for the representation of both objects and values. Every object must have an immutable OID, whereas a value has no OID and just stands for itself. Hence, a value is typically stored within an object and cannot be referenced from other objects. In some systems, complex structured values can also be created without having a corresponding OID if needed.

20.2.2 Object Structure

In OO databases, the state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors. One formal way of representing such objects is to view each object as a triple (i, c, v), where i is a unique object identifier (the OID), c is a type constructor¹² (that is, an indication of how the object state is

12. This is different from the constructor operation that is used in C++ and other OOPLs to create new objects.


constructed), and v is the object state (or current value). The data model will typically include several type constructors. The three most basic constructors are atom, tuple, and set. Other commonly used constructors include list, bag, and array. The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character strings, Booleans, and any other basic data types that the system supports directly.

The object state v of an object (i, c, v) is interpreted based on the constructor c. If c = atom, the state (value) v is an atomic value from the domain of basic values supported by the system. If c = set, the state v is a set of object identifiers {i₁, i₂, ..., iₙ}, which are the OIDs for a set of objects that are typically of the same type. If c = tuple, the state v is a tuple of the form <a₁:i₁, a₂:i₂, ..., aₙ:iₙ>, where each aⱼ is an attribute name¹³ and each iⱼ is an OID. If c = list, the value v is an ordered list [i₁, i₂, ..., iₙ] of OIDs of objects of the same type. A list is similar to a set except that the OIDs in a list are ordered, and hence we can refer to the first, second, or jth object in a list. For c = array, the state of the object is a single-dimensional array of object identifiers. The main difference between array and list is that a list can have an arbitrary number of elements whereas an array typically has a maximum size. The difference between set and bag¹⁴ is that all elements in a set must be distinct whereas a bag can have duplicate elements.

This model of objects allows arbitrary nesting of the set, list, tuple, and other constructors. The state of an object that is not of type atom will refer to other objects by their object identifiers. Hence, the only case where an actual value appears is in the state of an object of type atom.¹⁵

The type constructors set, list, array, and bag are called collection types (or bulk types), to distinguish them from basic types and tuple types. The main characteristic of a collection type is that the state of the object will be a collection of objects that may be unordered (such as a set or a bag) or ordered (such as a list or an array). The tuple type constructor is often called a structured type, since it corresponds to the struct construct in the C and C++ programming languages.

EXAMPLE 1: A Complex Object

We now represent some objects from the relational database shown in Figure 5.6, using the preceding model, where an object is defined by a triple (OID, type constructor, state) and the available type constructors are atom, set, and tuple. We use i₁, i₂, i₃, ... to stand for unique system-generated object identifiers. Consider the following objects:

o₁ = (i₁, atom, 'Houston')

o₂ = (i₂, atom, 'Bellaire')

o₃ = (i₃, atom, 'Sugarland')

13. Also called an instance variable name in OO terminology.

14. Also called a multiset.

15. As we noted earlier, it is not practical to generate a unique system identifier for every value, so real systems allow for both OIDs and structured values, which can be structured by using the same type constructors as objects, except that a value does not have an OID.


o₄ = (i₄, atom, 5)

o₅ = (i₅, atom, 'Research')

o₆ = (i₆, atom, '1988-05-22')

o₇ = (i₇, set, {i₁, i₂, i₃})

o₈ = (i₈, tuple, <DNAME:i₅, DNUMBER:i₄, MGR:i₉, LOCATIONS:i₇, EMPLOYEES:i₁₀, PROJECTS:i₁₁>)

o₉ = (i₉, tuple, <MANAGER:i₁₂, MANAGER_START_DATE:i₆>)

o₁₀ = (i₁₀, set, {i₁₂, i₁₃, i₁₄})

o₁₁ = (i₁₁, set, {i₁₅, i₁₆, i₁₇})

o₁₂ = (i₁₂, tuple, <FNAME:i₁₈, MINIT:i₁₉, LNAME:i₂₀, SSN:i₂₁, ..., SALARY:i₂₆, SUPERVISOR:i₂₇, DEPT:i₈>)

The first six objects (o₁ to o₆) listed here represent atomic values. There will be many similar objects, one for each distinct constant atomic value in the database.¹⁶ Object o₇ is a set-valued object that represents the set of locations for department 5; the set {i₁, i₂, i₃} refers to the atomic objects with values {'Houston', 'Bellaire', 'Sugarland'}. Object o₈ is a tuple-valued object that represents department 5 itself, and has the attributes DNAME, DNUMBER, MGR, LOCATIONS, and so on. The first two attributes DNAME and DNUMBER have atomic objects o₅ and o₄ as their values. The MGR attribute has a tuple object o₉ as its value, which in turn has two attributes. The value of the MANAGER attribute is the object whose OID is i₁₂, which represents the employee 'John B. Smith' who manages the department, whereas the value of MANAGER_START_DATE is another atomic object whose value is a date. The value of the EMPLOYEES attribute of o₈ is a set object with OID = i₁₀, whose value is the set of object identifiers for the employees who work for the DEPARTMENT (objects i₁₂, plus i₁₃ and i₁₄, which are not shown). Similarly, the value of the PROJECTS attribute of o₈ is a set object with OID = i₁₁, whose value is the set of object identifiers for the projects that are controlled by department number 5 (objects i₁₅, i₁₆, and i₁₇, which are not shown). The object whose OID = i₁₂ represents the employee 'John B. Smith' with all its atomic attributes (FNAME, MINIT, LNAME, SSN, ..., SALARY, which reference the atomic objects i₁₈, i₁₉, i₂₀, i₂₁, ..., i₂₆, respectively; not shown), plus SUPERVISOR, which references the employee object with OID = i₂₇ (this represents 'James E. Borg', who supervises 'John B. Smith', but is not shown), and DEPT, which references the department object with OID = i₈ (this represents department number 5, where 'John B. Smith' works).
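A few of the objects in Example 1 can be written down directly as (OID, constructor, state) triples. The following Python sketch is only an illustration under stated assumptions: the names Obj, db, new, and value_of are hypothetical, OIDs are plain strings, and only the atom, set, and tuple constructors are handled. The helper value_of follows OIDs recursively until it reaches atomic values.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Obj:
    oid: str          # i: unique object identifier
    constructor: str  # c: "atom", "set", or "tuple"
    state: Any        # v: an atomic value, or the OIDs of other objects


db = {}  # the whole "database": OID -> object


def new(oid, constructor, state):
    db[oid] = Obj(oid, constructor, state)


new("i1", "atom", "Houston")
new("i2", "atom", "Bellaire")
new("i3", "atom", "Sugarland")
new("i4", "atom", 5)
new("i5", "atom", "Research")
new("i7", "set", frozenset({"i1", "i2", "i3"}))
# Only three of the attributes of the department object are included here.
new("i8", "tuple", {"DNAME": "i5", "DNUMBER": "i4", "LOCATIONS": "i7"})


def value_of(oid):
    """Resolve an OID to plain values by following references recursively."""
    obj = db[oid]
    if obj.constructor == "atom":
        return obj.state
    if obj.constructor == "set":
        # A set is unordered; members are expanded in OID order only to
        # make the result deterministic.
        return [value_of(i) for i in sorted(obj.state)]
    if obj.constructor == "tuple":
        return {attr: value_of(i) for attr, i in obj.state.items()}
    raise ValueError(f"unsupported constructor: {obj.constructor}")


department = value_of("i8")
```

Only the atom objects hold actual values; the set and tuple objects hold nothing but OIDs, which is why value_of has to recurse.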

In this model, an object can be represented as a graph structure that can be constructed by recursively applying the type constructors. The graph representing an object oᵢ can be constructed by first creating a node for the object oᵢ itself. The node for oᵢ is labeled with the OID and the object constructor c. We also create a node in the graph for each basic atomic

16 These atomic objects are the ones that may cause a problem, due to the use of too many objectidentifiers, if this model is implemented directly

Trang 14

value If an object0ihas an atomic value, we draw a directed arc from the node representing

0i to the node representing its basic value If the object value is constructed, we draw

directed arcs from the object node to a node that represents the constructed value Figure

20.1 shows the graph for the exampleDEPARTMENTobjectOsgiven earlier
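As a rough illustration of this construction, the following Python sketch builds the nodes and arcs for a cut-down version of the DEPARTMENT object. The dictionary layout, the name build_graph, and the atomic value chosen for DNAME are assumptions made for this example only. As a simplification, component objects are linked directly to their parent instead of through a separate node for the constructed value.

```python
# Each object is stored as (constructor, value), keyed by its OID.
#   atom:  value is a literal
#   set:   value is a collection of OIDs
#   tuple: value maps attribute names to OIDs
DB = {
    "i1": ("atom", "Houston"),
    "i2": ("atom", "Bellaire"),
    "i3": ("atom", "Sugarland"),
    "i4": ("atom", 5),
    "i5": ("atom", "Research"),  # assumed DNAME value, for illustration
    "i7": ("set", {"i1", "i2", "i3"}),
    "i8": ("tuple", {"DNAME": "i5", "DNUMBER": "i4", "LOCATIONS": "i7"}),
}


def build_graph(db, root_oid):
    """Return (nodes, arcs) for the object graph rooted at root_oid."""
    nodes = {}  # node key -> label
    arcs = []   # (source node, target node, arc label)
    pending = [root_oid]
    while pending:
        oid = pending.pop()
        if oid in nodes:  # already visited; also guards against cycles
            continue
        constructor, value = db[oid]
        nodes[oid] = f"{oid}:{constructor}"  # label: OID and constructor
        if constructor == "atom":
            value_node = ("value", value)    # node for the basic value
            nodes[value_node] = repr(value)
            arcs.append((oid, value_node, ""))
        elif constructor == "tuple":
            for attr, child in value.items():
                arcs.append((oid, child, attr))
                pending.append(child)
        else:  # set (a list would be handled the same way)
            for child in sorted(value):
                arcs.append((oid, child, ""))
                pending.append(child)
    return nodes, arcs


nodes, arcs = build_graph(DB, "i8")
for source, target, label in arcs:
    print(source, "->", target, label)
```

Because visited OIDs are skipped, the traversal also terminates on object graphs that contain cycles, such as an employee who references a department that in turn references the employee.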

The preceding model permits two types of definitions in a comparison of the states of two objects for equality. Two objects are said to have identical states (deep equality) if the graphs representing their states are identical in every respect, including the OIDs at every level. Another, weaker definition of equality is when two objects have equal states (shallow equality). In this case, the graph structures must be the same, and all the corresponding atomic values in the graphs should also be the same. However, some corresponding internal nodes in the two graphs may have objects with different OIDs.

EXAMPLE 2: Identical Versus Equal Objects

An example can illustrate the difference between the two definitions for comparing object states for equality. Consider the following objects o1, o2, o3, o4, o5, and o6:

o1 = (i1, tuple, <a1:i4, a2:i6>)
o2 = (i2, tuple, <a1:i5, a2:i6>)
o3 = (i3, tuple, <a1:i4, a2:i6>)
o4 = (i4, atom, 10)
o5 = (i5, atom, 10)
o6 = (i6, atom, 20)

The objects o1 and o2 have equal states, since their states at the atomic level are the same but the values are reached through distinct objects o4 and o5. However, the states of objects o1 and o3 are identical, even though the objects themselves are not because they have distinct OIDs. Similarly, although the states of o4 and o5 are identical, the actual objects o4 and o5 are equal but not identical, because they have distinct OIDs.
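The two comparisons can be sketched in Python using the six objects of this example. The class Obj, the dictionary DB, and the function names are hypothetical helpers introduced here, not part of any OODBMS interface. The sketch handles only the atom and tuple constructors and assumes the object graph has no cycles.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Obj:
    oid: str
    constructor: str  # "atom" or "tuple" in this sketch
    value: Any        # atom: a literal; tuple: dict of attribute -> OID


DB = {}  # OID -> object


def define(oid, constructor, value):
    DB[oid] = Obj(oid, constructor, value)
    return DB[oid]


o1 = define("i1", "tuple", {"a1": "i4", "a2": "i6"})
o2 = define("i2", "tuple", {"a1": "i5", "a2": "i6"})
o3 = define("i3", "tuple", {"a1": "i4", "a2": "i6"})
o4 = define("i4", "atom", 10)
o5 = define("i5", "atom", 10)
o6 = define("i6", "atom", 20)


def identical_states(x, y):
    """States match in every respect, including the OIDs they reference."""
    return x.constructor == y.constructor and x.value == y.value


def equal_states(x, y):
    """Same structure and same atomic values; internal OIDs may differ."""
    if x.constructor != y.constructor:
        return False
    if x.constructor == "atom":
        return x.value == y.value
    if x.constructor == "tuple":
        return x.value.keys() == y.value.keys() and all(
            equal_states(DB[x.value[attr]], DB[y.value[attr]])
            for attr in x.value
        )
    raise NotImplementedError("only atom and tuple are covered here")


print(equal_states(o1, o2), identical_states(o1, o2))
print(identical_states(o1, o3), o1.oid == o3.oid)
print(identical_states(o4, o5), o4.oid == o5.oid)
```

Comparing the stored values directly is enough for identical states, because two values that reference the same OIDs necessarily lead to the same objects at every lower level.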

20.2.3 Type Constructors

An object definition language (ODL)17 that incorporates the preceding type constructors can be used to define the object types for a particular database application. In Chapter 21, we shall describe the standard ODL of ODMG, but we first introduce the concepts gradually in this section using a simpler notation. The type constructors can be used to define the data structures for an OO database schema. In Section 20.3 we will see how to incorporate the definition of operations (or methods) into the OO schema. Figure 20.2 shows how we may declare Employee and Department types corresponding to the object instances shown in Figure 20.1. In Figure 20.2, the Date type is defined as a tuple rather than an atomic value as in Figure 20.1. We use the keywords tuple, set, and list for the type constructors, and the available standard data types (integer, string, float, and so on) for atomic types.

17 This would correspond to the DDL (Data Definition Language) of the database system (see Chapter 2).

FIGURE 20.1 Representation of a DEPARTMENT complex object as a graph

FIGURE 20.2 Specifying the object types Employee, Date, and Department using type constructors

Attributes that refer to other objects, such as dept of Employee or projects of Department, are basically references to other objects and hence serve to represent relationships among the object types. For example, the attribute dept of Employee is of type Department, and hence is used to refer to a specific Department object (where the Employee works). The value of such an attribute would be an OID for a specific Department object. A binary relationship can be represented in one direction, or it can have an inverse reference. The latter representation makes it easy to traverse the relationship in both directions. For example, the attribute employees of Department has as its value a set of references (that is, a set of OIDs) to objects of type Employee; these are the employees who work for the department. The inverse is the reference attribute dept of Employee. We will see in Chapter 21 how the ODMG standard allows inverses to be explicitly declared as relationship attributes to ensure that inverse references are consistent.
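A minimal Python sketch of such a pair of inverse references follows. The attribute subset and the helper assign_dept are assumptions chosen for illustration; they do not reproduce Figure 20.2. Python object references stand in for OIDs, and eq=False keeps equality identity-based, so two distinct objects with the same attribute values remain different objects.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass(eq=False)  # identity-based equality and hashing
class Department:
    dname: str
    dnumber: int
    locations: Set[str] = field(default_factory=set)
    # Set of references to Employee objects (inverse of Employee.dept).
    employees: Set["Employee"] = field(default_factory=set)


@dataclass(eq=False)
class Employee:
    fname: str
    lname: str
    # Reference to the Department where the employee works.
    dept: Optional[Department] = None


def assign_dept(emp: Employee, dept: Department) -> None:
    """Update both directions so the inverse references stay consistent."""
    if emp.dept is not None:
        emp.dept.employees.discard(emp)
    emp.dept = dept
    dept.employees.add(emp)
```

Without a helper of this kind, a program could update emp.dept and forget dept.employees, which is the inconsistency that explicitly declared inverses are meant to rule out.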

20.3 ENCAPSULATION OF OPERATIONS, METHODS, AND PERSISTENCE

The concept of encapsulation is one of the main characteristics of OO languages and systems. It is also related to the concepts of abstract data types and information hiding in programming languages. In traditional database models and systems, this concept was not applied, since it is customary to make the structure of database objects visible to users and external programs. In these traditional models, a number of standard database operations are applicable to objects of all types. For example, in the relational model, the operations for selecting, inserting, deleting, and modifying tuples are generic and may be applied to any relation in the database. The relation and its attributes are visible to users and to external programs that access the relation by using these operations.

20.3.1 Specifying Object Behavior via Class Operations

The concepts of information hiding and encapsulation can be applied to database objects. The main idea is to define the behavior of a type of object based on the operations that can be externally applied to objects of that type. The internal structure of the object is hidden, and the object is accessible only through a number of predefined operations. Some operations may be used to create (insert) or destroy (delete) objects; other operations may update the object state; and others may be used to retrieve parts of the object state or to apply some calculations. Still other operations may perform a combination of retrieval, calculation, and update. In general, the implementation of an operation can be specified in a general-purpose programming language that provides flexibility and power in defining the operations.

The external users of the object are only made aware of the interface of the object type, which defines the name and arguments (parameters) of each operation. The implementation is hidden from the external users; it includes the definition of the internal data structures of the object and the implementation of the operations that access these structures. In OO terminology, the interface part of each operation is called the signature, and the operation implementation is called a method. Typically, a method is invoked by sending a message to the object to execute the corresponding method. Notice that, as part of executing a method, a subsequent message to another object may be sent, and this mechanism may be used to return values from the objects to the external environment or to other objects.

For database applications, the requirement that all objects be completely encapsulated is too stringent. One way of relaxing this requirement is to divide the structure of an object into visible and hidden attributes (instance variables). Visible attributes may be directly accessed for reading by external operators, or by a high-level query language. The hidden attributes of an object are completely encapsulated and can be accessed only through predefined operations. Most OODBMSs employ high-level query languages for accessing visible attributes. In Chapter 21, we will describe the OQL query language that is proposed as a standard query language for OODBs.
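The split between visible and hidden attributes can be approximated in Python as follows. This is only a sketch: the choice of salary as the hidden attribute and the operations raise_salary and earns_more_than are assumptions for illustration, and Python hides the underscore-prefixed fields by convention rather than by enforcement.

```python
class Employee:
    def __init__(self, fname, lname, salary):
        self._fname = fname
        self._lname = lname
        self._salary = salary  # hidden attribute (assumed for this example)

    # Visible attributes: readable from outside, but not assignable.
    @property
    def fname(self):
        return self._fname

    @property
    def lname(self):
        return self._lname

    # Predefined operations are the only intended access path
    # to the hidden attribute.
    def raise_salary(self, percent):
        self._salary *= 1 + percent / 100

    def earns_more_than(self, amount):
        return self._salary > amount
```

A query layer could read fname and lname directly, while any question about the salary has to go through an operation such as earns_more_than.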

In most cases, operations that update the state of an object are encapsulated. This is a way of defining the update semantics of the objects, given that in many OO data models, few integrity constraints are predefined in the schema. Each type of object has its integrity constraints programmed into the methods that create, delete, and update the objects by explicitly writing code to check for constraint violations and to handle exceptions. In such cases, all update operations are implemented by encapsulated operations. More recently, the ODL for the ODMG standard allows the specification of some common constraints such as keys and inverse relationships (referential integrity) so that the system can automatically enforce these constraints (see Chapter 21).
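The following Python sketch shows what programming a constraint into the update methods might look like. The operation names assign_emp, remove_emp, and no_of_emps follow the text, but the constraints themselves (a positive department number, and at most one department per employee) are assumptions made up for this example.

```python
class Employee:
    def __init__(self, ssn):
        self.ssn = ssn
        self.dept = None  # reference to the employee's department, if any


class Department:
    def __init__(self, dname, dnumber):
        # Constraint checked at creation time (assumed rule).
        if dnumber <= 0:
            raise ValueError("dnumber must be a positive integer")
        self.dname = dname
        self.dnumber = dnumber
        self._employees = set()

    def assign_emp(self, e):
        """Add e to the department; return False on a constraint violation."""
        if e.dept is not None:  # assumed rule: one department per employee
            return False
        self._employees.add(e)
        e.dept = self
        return True

    def remove_emp(self, e):
        """Remove e from the department; return False if e is not a member."""
        if e not in self._employees:
            return False
        self._employees.remove(e)
        e.dept = None
        return True

    def no_of_emps(self):
        return len(self._employees)
```

Since the set of employees is reachable only through these operations, every update passes through the checks, which is how the update semantics end up being defined by the methods instead of by the schema.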

The term class is often used to refer to an object type definition, along with the definitions of the operations for that type.18 Figure 20.3 shows how the type definitions of Figure 20.2 may be extended with operations to define classes. A number of operations are declared for each class, and the signature (interface) of each operation is included in the class definition. A method (implementation) for each operation must be defined elsewhere, using a programming language. Typical operations include the object constructor operation, which is used to create a new object, and the destructor operation, which is used to destroy an object. A number of object modifier operations can also be declared to modify the states (values) of various attributes of an object. Additional operations can retrieve information about the object.

18 This definition of class is similar to how it is used in the popular C++ programming language. The ODMG standard uses the word interface in addition to class (see Chapter 21). In the EER model, the term class was used to refer to an object type, along with the set of all objects of that type (see Chapter 4).

define class Employee:
    type tuple( fname:
define class Department
    type tuple( dname: string;
    assign_emp(e: Employee): boolean;
        (* adds an employee to the department *)
    remove_emp(e: Employee): boolean;
        (* removes an employee from the department *)
end Department;

FIGURE 20.3 Adding operations to the definitions of Employee and Department

An operation is typically applied to an object by using the dot notation. For example, if d is a reference to a department object, we can invoke an operation such as no_of_emps by writing d.no_of_emps. Similarly, by writing d.destroy_dept, the object referenced by d is destroyed (deleted). The only exception is the constructor operation, which returns a reference to a new Department object. Hence, it is customary to have a default name for the constructor operation that is the name of the class itself, although this was not used in Figure 20.3.19 The dot notation is also used to refer to attributes of an object, for example, by writing d.dnumber or d.mgr.startdate.
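One way to picture the separation of signature and method, together with the dot notation, is the Python sketch below. The abstract class DepartmentInterface and the registry _live are hypothetical devices for this example. Python has no explicit destructor call in the sense used here, so destroy_dept is modeled as removing the object from the registry.

```python
from abc import ABC, abstractmethod


class DepartmentInterface(ABC):
    """Signatures only: the names, arguments, and result types."""

    @abstractmethod
    def no_of_emps(self) -> int: ...

    @abstractmethod
    def destroy_dept(self) -> bool: ...


class Department(DepartmentInterface):
    """Methods: the implementations, defined separately."""

    _live = {}  # hypothetical registry of existing departments by number

    def __init__(self, dname, dnumber):  # constructor named after the class
        self.dname = dname
        self.dnumber = dnumber
        self.employees = set()
        Department._live[dnumber] = self

    def no_of_emps(self) -> int:
        return len(self.employees)

    def destroy_dept(self) -> bool:
        # True if the department existed and has now been removed.
        return Department._live.pop(self.dnumber, None) is not None


d = Department("Research", 5)       # constructor returns a new reference
print(d.dnumber, d.no_of_emps())    # dot notation: attribute and operation
```

Note that Python requires parentheses to invoke an operation (d.no_of_emps()), whereas the notation in the text writes d.no_of_emps.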

20.3.2 Specifying Object Persistence via Naming

The naming mechanism involves giving an object a unique persistent name through which it can be retrieved by this and other programs. This persistent object name can be given via a specific statement or operation in the program, as illustrated in Figure 20.4. All such names given to objects must be unique within a particular database. Hence, the named persistent objects are used as entry points to the database through which users and applications can start their database access. Obviously, it is not practical to give names to all objects in a large database that includes thousands of objects, so most objects are made persistent by using the second mechanism, called reachability. The reachability mechanism works by making the object reachable from some persistent object. An object B is said to be reachable from an object A if a sequence of references in the object graph leads from object A to object B. For example, all the objects in Figure 20.1 are reachable from object o8; hence, if o8 is made persistent, all the other objects in Figure 20.1 also become persistent.

If we first create a named persistent object N, whose state is a set or list of objects of some class C, we can make objects of C persistent by adding them to the set or list, and thus making them reachable from N. Hence, N defines a persistent collection of objects of class C. For example, we can define a class DepartmentSet (see Figure 20.4) whose objects are of type set(Department).20 Suppose that an object of type DepartmentSet is created, and suppose that it is named AllDepartments and thus made persistent, as illustrated in Figure 20.4. Any Department object that is added to the set of AllDepartments by using the add_dept operation becomes persistent by virtue of its being reachable from AllDepartments. The AllDepartments object is often called the extent of the class Department, as it will hold all persistent objects of type Department. As we shall see in Chapter 21, the ODMG ODL standard gives the schema designer the option of naming an extent as part of class definition.

19 Default names for the constructor and destructor operations exist in the C++ programming language. For example, for class Employee, the default constructor name is Employee and the default destructor name is ~Employee. It is also common to use the new operation to create new objects.

20 As we shall see in Chapter 21, the ODMG ODL syntax uses set<Department> instead of set(Department).

define class DepartmentSet:
    type set(Department);
    operations add_dept(d: Department): boolean;
            (* adds a department to the DepartmentSet object *)
        remove_dept(d: Department): boolean;
            (* removes a department from the DepartmentSet object *)
        create_dept_set: DepartmentSet;
        destroy_dept_set: boolean;
end DepartmentSet;
persistent name AllDepartments: DepartmentSet;
    (* AllDepartments is a persistent named object of type DepartmentSet *)
d := create_dept;
    (* create a new Department object in the variable d *)
b := AllDepartments.add_dept(d);
    (* make d persistent by adding it to the persistent set AllDepartments *)

FIGURE 20.4 Creating persistent objects by naming and reachability
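The interplay of naming and reachability can be mimicked in a few lines of Python. The sketch is an in-memory analogy only: NAMED_ROOTS, name_object, is_persistent, and the references method are hypothetical names, and nothing is actually written to storage. It shows which objects a system would treat as persistent under the two mechanisms.

```python
class Department:
    def __init__(self, dname, dnumber):
        self.dname = dname
        self.dnumber = dnumber

    def references(self):
        return []  # this simplified Department refers to no other objects


class DepartmentSet:
    def __init__(self):
        self._members = set()

    def add_dept(self, d):
        if d in self._members:
            return False
        self._members.add(d)
        return True

    def remove_dept(self, d):
        if d not in self._members:
            return False
        self._members.remove(d)
        return True

    def references(self):
        return list(self._members)


NAMED_ROOTS = {}  # persistent name -> object; names must be unique


def name_object(name, obj):
    if name in NAMED_ROOTS:
        raise ValueError(f"persistent name already in use: {name}")
    NAMED_ROOTS[name] = obj


def is_persistent(obj):
    """True if obj is a named root or reachable from one."""
    seen = set()
    pending = list(NAMED_ROOTS.values())
    while pending:
        current = pending.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        if current is obj:
            return True
        pending.extend(current.references())
    return False


all_departments = DepartmentSet()
name_object("AllDepartments", all_departments)  # persistent by naming
d = Department("Research", 5)
b = all_departments.add_dept(d)                 # persistent by reachability
print(b, is_persistent(d))
```

A Department that is created but never added to a named collection stays transient in this model, which is the distinction drawn in the next paragraph.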

Notice the difference between traditional database models and OO databases in this respect. In traditional database models, such as the relational model or the EER model, all objects are assumed to be persistent. Hence, when an entity type or class such as EMPLOYEE is defined in the EER model, it represents both the type declaration for EMPLOYEE and a persistent set of all EMPLOYEE objects. In the OO approach, a class declaration of EMPLOYEE specifies only the type and operations for a class of objects. The user must separately define a persistent object of type set(EMPLOYEE) or list(EMPLOYEE) whose value is the collection of references to all persistent EMPLOYEE objects, if this is desired, as illustrated in Figure 20.4.21 This allows transient and persistent objects to follow the same type and class declarations of the ODL and the OOPL. In general, it is possible to define several persistent collections for the same class definition, if desired.

21 Some systems, such as automatically create the extent for a class
