FUNDAMENTALS OF
Access the latest information about Addison-Wesley titles from our World Wide Web site:
http://www.aw.com/cs
Figure 12.14 is a logical data model diagram definition in Rational Rose®. Figure 12.15 is a graphical data model diagram in Rational Rose®. Figure 12.17 is the company database class diagram drawn in Rational Rose®. IBM® has acquired Rational Rose®.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.
The programs and applications presented in this book have been included for their instructional value. They have been tested with care, but are not guaranteed for any particular purpose. The publisher does not offer any warranties or representations, nor does it accept any liabilities with respect to the programs or applications.
Library of Congress Cataloging-in-Publication Data
For information on obtaining permission for the use of material from this work, please submit a written request to Pearson Education, Inc., Rights and Contracts Department, 75 Arlington St., Suite 300, Boston, MA 02116 or fax your request to 617-848-7047.
Copyright © 2004 by Pearson Education, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.
1 2 3 4 5 6 7 8 9 10-HT-06050403
To Amalia with love
R.E.
To my mother Vijaya and wife Aruna for their love and support
S.B.N.
Preface
This book introduces the fundamental concepts necessary for designing, using, and
implementing database systems and applications. Our presentation stresses the fundamentals
of database modeling and design, the languages and facilities provided by the database
management systems, and system implementation techniques. The book is meant to be
used as a textbook for a one- or two-semester course in database systems at the junior,
senior, or graduate level, and as a reference book. We assume that the readers are familiar
with elementary programming and data-structuring concepts and that they have had
some exposure to the basic computer organization.
We start in Part 1 with an introduction and a presentation of the basic concepts and
terminology, and database conceptual modeling principles. We conclude the book in
Parts 7 and 8 with an introduction to emerging technologies, such as data mining, XML,
security, and Web databases. Along the way, in Parts 2 through 6, we provide an
in-depth treatment of the most important aspects of database fundamentals.
The following key features are included in the fourth edition:
• The entire book follows a self-contained, flexible organization that can be tailored to
individual needs.
• Coverage of data modeling now includes both the ER model and UML.
• A new advanced SQL chapter with material on SQL programming techniques, such as
JDBC and SQL/CLI.
• A new chapter on XML and Internet databases.
• A new chapter on data mining.
• A significant revision of the supplements to include a robust set of materials for instructors and students, and an online case study.
Main Differences from the Third Edition
There are several organizational changes in the fourth edition, as well as some important new chapters. The main changes are as follows:
• The chapters on file organizations and indexing (Chapters 5 and 6 in the third edition) have been moved to Part 4, and are now Chapters 13 and 14. Part 4 also includes Chapters 15 and 16 on query processing and optimization, and physical database design and tuning (this corresponds to Chapter 18 and sections 16.3-16.4 of the third edition).
• The relational model coverage has been reorganized and updated in Part 2. Chapter 5 covers relational model concepts and constraints. The material on relational algebra and calculus is now together in Chapter 6. Relational database design using ER-to-relational and EER-to-relational mapping is in Chapter 7. SQL is covered in Chapters 8 and 9, with the new material in SQL programming techniques in sections 9.3 through 9.6.
• Part 3 covers database design theory and methodology. Chapters 10 and 11 on normalization theory correspond to Chapters 14 and 15 of the third edition. Chapter 12 on practical database design has been updated to include more UML coverage.
• The chapters on transactions, concurrency control, and recovery (19, 20, 21 in the third edition) are now Chapters 17, 18, and 19 in Part 5.
• The chapters on object-oriented concepts, ODMG object model, and object-relational systems (11, 12, 13 in the third edition) are now 20, 21, and 22 in Part 6. Chapter 22 has been reorganized and updated.
• Chapters 10 and 17 of the third edition have been dropped. The material on client-server architectures has been merged into Chapters 2 and 25.
• The chapters on security, enhanced models (active, temporal, spatial, multimedia), and distributed databases (Chapters 22, 23, 24 in the third edition) are now 23, 24, and 25 in Part 7. The security chapter has been updated. Chapter 25 of the third edition on deductive databases has been merged into Chapter 24, and is now section 24.4.
• Chapter 26 is a new chapter on XML (eXtensible Markup Language), and how it is
related to accessing relational databases over the Internet.
• The material on data mining and data warehousing (Chapter 26 of the third edition)
has been separated into two chapters. Chapter 27 on data mining has been expanded
and updated.
Contents of This Edition
Part 1 describes the basic concepts necessary for a good understanding of database design
and implementation, as well as the conceptual modeling techniques used in database
systems. Chapters 1 and 2 introduce databases, their typical users, and DBMS concepts,
terminology, and architecture. In Chapter 3, the concepts of the Entity-Relationship (ER)
model and ER diagrams are presented and used to illustrate conceptual database design.
Chapter 4 focuses on data abstraction and semantic data modeling concepts and extends
the ER model to incorporate these ideas, leading to the enhanced-ER (EER) data model
and EER diagrams. The concepts presented include subclasses, specialization,
generalization, and union types (categories). The notation for the class diagrams of UML is also
introduced in Chapters 3 and 4.
Part 2 describes the relational data model and relational DBMSs. Chapter 5 describes
the basic relational model, its integrity constraints, and update operations. Chapter 6
describes the operations of the relational algebra and introduces the relational calculus.
Chapter 7 discusses relational database design using ER- and EER-to-relational mapping.
Chapter 8 gives a detailed overview of the SQL language, covering the SQL standard,
which is implemented in most relational systems. Chapter 9 covers SQL programming
topics such as SQLJ, JDBC, and SQL/CLI.
Part 3 covers several topics related to database design. Chapters 10 and 11 cover the
formalisms, theories, and algorithms developed for relational database design by
normalization. This material includes functional and other types of dependencies and normal
forms of relations. Step-by-step intuitive normalization is presented in Chapter 10, and
relational design algorithms are given in Chapter 11, which also defines other types of
dependencies, such as multivalued and join dependencies. Chapter 12 presents an
overview of the different phases of the database design process for medium-sized and large
applications, using UML.
Part 4 starts with a description of the physical file structures and access methods used
in database systems. Chapter 13 describes primary methods of organizing files of records
on disk, including static and dynamic hashing. Chapter 14 describes indexing techniques
for files, including B-tree and B+-tree data structures and grid files. Chapter 15 introduces
the basics of query processing and optimization, and Chapter 16 discusses physical
database design and tuning.
Part 5 discusses transaction processing, concurrency control, and recovery
techniques, including discussions of how these concepts are realized in SQL.
Part 6 gives a comprehensive introduction to object databases and object-relational systems. Chapter 20 introduces object-oriented concepts. Chapter 21 gives a detailed overview of the ODMG object model and its associated ODL and OQL languages. Chapter 22 describes how relational databases are being extended to include object-oriented concepts and presents the features of object-relational systems, as well as giving an overview of some of the features of the SQL3 standard, and the nested relational data model.
Parts 7 and 8 cover a number of advanced topics. Chapter 23 gives an overview of database security and authorization, including the SQL commands to GRANT and REVOKE privileges, and expanded coverage on security concepts such as encryption, roles, and flow control. Chapter 24 introduces several enhanced database models for advanced applications. These include active databases and triggers, temporal, spatial, multimedia, and deductive databases. Chapter 25 gives an introduction to distributed databases and the three-tier client-server architecture. Chapter 26 is a new chapter on XML (eXtensible Markup Language). It first discusses the differences between structured, semistructured, and unstructured models, then presents XML concepts, and finally compares the XML model to traditional database models. Chapter 27 on data mining has been expanded and updated. Chapter 28 introduces data warehousing concepts. Finally, Chapter 29 gives introductions to the topics of mobile databases, multimedia databases, GIS (Geographic Information Systems), and Genome data management in bioinformatics.
Appendix A gives a number of alternative diagrammatic notations for displaying a conceptual ER or EER schema. These may be substituted for the notation we use, if the instructor so wishes. Appendix C gives some important physical parameters of disks. Appendixes B, E, and F are on the web site. Appendix B is a new case study that follows the design and implementation of a bookstore's database. Appendixes E and F cover legacy database systems, based on the network and hierarchical database models. These have been used for over thirty years as a basis for many existing commercial database applications and transaction-processing systems and will take decades to replace completely. We consider it important to expose students of database management to these long-standing approaches. Full chapters from the third edition can be found on the web site for this edition.
Guidelines for Using This Book
There are many different ways to teach a database course. The chapters in Parts 1 through 5 can be used in an introductory course on database systems in the order that they are given or in the preferred order of each individual instructor. Selected chapters and sections may be left out, and the instructor can add other chapters from the rest of the book, depending on the emphasis of the course. At the end of each chapter's opening section, we list sections that are candidates for being left out whenever a less detailed discussion of the topic in a particular chapter is desired. We suggest covering up to Chapter 14 in an introductory database course and including selected parts of other chapters, depending on the background of the students and the desired coverage. For an emphasis on system implementation techniques, chapters from Parts 4 and 5 can be included.
Chapters 3 and 4, which cover conceptual modeling using the ER and EER models, are important for a good conceptual understanding of databases. However, they may be partially covered, covered later in a course, or even left out if the emphasis is on DBMS implementation. Chapters 13 and 14 on file organizations and indexing may also be covered early on, later, or even left out if the emphasis is on database models and languages. For students who have already taken a course on file organization, parts of these chapters could be assigned as reading material or some exercises may be assigned to review the concepts.
A total life-cycle database design and implementation project covers conceptual
design (Chapters 3 and 4), data model mapping (Chapter 7), normalization (Chapter
10), and implementation in SQL (Chapter 9). Additional documentation on the specific
RDBMS would be required.
The book has been written so that it is possible to cover topics in a variety of orders.
The chart included here shows the major dependencies between chapters. As the diagram
illustrates, it is possible to start with several different topics following the first two
introductory chapters. Although the chart may seem complex, it is important to note that if
the chapters are covered in order, the dependencies are not lost. The chart can be
consulted by instructors wishing to use an alternative order of presentation.
For a single-semester course based on this book, some chapters can be assigned as reading material. Parts 4, 7, and 8 can be considered for such an assignment. The book can also be used for a two-semester sequence. The first course, "Introduction to Database Design/Systems," at the sophomore, junior, or senior level, could cover most of Chapters 1 to 14. The second course, "Database Design and Implementation Techniques," at the senior or first-year graduate level, can cover Chapters 15 to 28. Chapters from Parts 7 and 8 can be used selectively in either semester, and material describing the DBMS available to the students at the local institution can be covered in addition to the material in the book.
Supplemental Materials
The supplements to this book have been significantly revised. With Addison-Wesley's Database Place there is a robust set of interactive reference materials to help students with their study of modeling, normalization, and SQL. Each tutorial asks students to solve problems (such as writing an SQL query, drawing an ER diagram, or normalizing a relation), and then provides useful feedback based on the student's solution. Addison-Wesley's Database Place helps students master the key concepts of all database courses. For more information visit aw.com/databaseplace.
In addition the following supplements are available to all readers of this book at www.aw.com/cssupport.
• Additional content: This includes a new Case Study on the design and implementation of a bookstore's database as well as chapters from previous editions that are not included in the fourth edition.
• A set of PowerPoint lecture notes.
A solutions manual is also available to qualified instructors. Please contact your local Addison-Wesley sales representative, or send e-mail to aw.cse@aw.com, for information on how to access it.
Acknowledgements
It is a great pleasure for us to acknowledge the assistance and contributions of a large number of individuals to this effort. First, we would like to thank our editors, Maite Suarez-Rivas, Katherine Harutunian, Daniel Rausch, and Juliet Silveri. In particular we would like to acknowledge the efforts and help of Katherine Harutunian, our primary contact for the fourth edition. We would like to acknowledge also those persons who have contributed to the fourth edition. We appreciated the contributions of the following reviewers: Phil Bernhard, Florida Tech; Zhengxin Chen, University of Nebraska at Omaha; Jan Chomicki, Univer-
Ramez Elmasri would like to thank his students Hyoil Han, Babak Hojabri, Jack Fu, Charley Li, Ande Swathi, and Steven Wu, who contributed to the material in Chapter 26. He would also like to acknowledge the support provided by the University of Texas at
Arlington.
Sham Navathe would like to acknowledge Dan Forsythe and the following students
at Georgia Tech: Weimin Feng, Angshuman Guin, Abrar Ul-Haque, Bin Liu, Ying Liu,
Wanxia Xie, and Waigen Yee.
We would like to repeat our thanks to those who have reviewed and contributed to
previous editions of Fundamentals of Database Systems. For the first edition these
individuals include Alan Apt (editor), Don Batory, Scott Downing, Dennis Heimbinger, Julia
Hodges, Yannis Ioannidis, Jim Larson, Dennis McLeod, Per-Ake Larson, Rahul Patel,
Nicholas Roussopoulos, David Stemple, Michael Stonebraker, Frank Tompa, and
Kyu-Young Whang; for the second edition they include Dan Joraanstad (editor), Rafi Ahmed,
Antonio Albano, David Beech, Jose Blakeley, Panos Chrysanthis, Suzanne Dietrich, Vic
Ghorpadey, Goetz Graefe, Eric Hanson, Junguk L. Kim, Roger King, Vram Kouramajian,
Vijay Kumar, John Lowther, Sanjay Manchanda, Toshimi Minoura, Inderpal Mumick, Ed
Omiecinski, Girish Pathak, Raghu Ramakrishnan, Ed Robertson, Eugene Sheng, David
Stotts, Marianne Winslett, and Stan Zdonick. For the third edition they include Suzanne
Dietrich, Ed Omiecinski, Rafi Ahmed, Francois Bancilhon, Jose Blakeley, Rick Cattell,
Ann Chervenak, David W. Embley, Henry A. Edinger, Leonidas Fegaras, Dan Forsyth,
Farshad Fotouhi, Michael Franklin, Sreejith Gopinath, Goetz Graefe, Richard Hull,
Sushil Jajodia, Ramesh K. Karne, Harish Kotbagi, Vijay Kumar, Tarcisio Lima, Ramon A.
Mata-Toledo, Jack McCaw, Dennis McLeod, Rokia Missaoui, Magdi Morsi, M.
Narayanaswamy, Carlos Ordonez, Joan Peckham, Betty Salzberg, Ming-Chien Shan, Junping
Sun, Rajshekhar Sunderraman, Aravindan Veerasamy, and Emilia E. Villareal.
Last but not least, we gratefully acknowledge the support, encouragement, and
patience of our families.
R.E.
S.B.N.
Contents
PART 1 INTRODUCTION AND CONCEPTUAL MODELING
CHAPTER 1 Databases and Database Users
1.1 Introduction
1.2 An Example
1.3 Characteristics of the Database Approach
1.4 Actors on the Scene
1.5 Workers behind the Scene
1.6 Advantages of Using the DBMS Approach
1.7 A Brief History of Database Applications
1.8 When Not to Use a DBMS
2.4 The Database System Environment
2.5 Centralized and Client/Server Architectures for DBMSs
2.6 Classification of Database Management Systems
2.7 Summary
Review Questions
Exercises
3.5 Weak Entity Types
3.6 Refining the ER Design for the COMPANY Database
3.7 ER Diagrams, Naming Conventions, and Design Issues
3.8 Notation for UML Class Diagrams
3.9 Summary
Review Questions
Exercises
4.6 Representing Specialization/Generalization and Inheritance in UML Class Diagrams
4.7 Relationship Types of Degree Higher Than Two
4.8 Data Abstraction, Knowledge Representation, and Ontology
PART 2 RELATIONAL MODEL: CONCEPTS, CONSTRAINTS, LANGUAGES, DESIGN, AND PROGRAMMING
CHAPTER 5 The Relational Data Model and
5.1 Relational Model Concepts
5.2 Relational Model Constraints and Relational Database
CHAPTER 6 The Relational Algebra and Relational
6.1 Unary Relational Operations: SELECT and PROJECT
6.2 Relational Algebra Operations from Set Theory
6.3 Binary Relational Operations: JOIN and DIVISION
6.4 Additional Relational Operations
6.5 Examples of Queries in Relational Algebra
6.6 The Tuple Relational Calculus
6.7 The Domain Relational Calculus
6.8 Summary
Review Questions
Exercises
Selected Bibliography
CHAPTER 7 Relational Database Design by
7.1 Relational Database Design Using ER-to-Relational Mapping
7.2 Mapping EER Model Constructs to Relations
7.3 Summary
Review Questions
Exercises
Selected Bibliography
CHAPTER 8 SQL-99: Schema Definition, Basic Constraints, and Queries
8.1 SQL Data Definition and Data Types
8.2 Specifying Basic Constraints in SQL
8.3 Schema Change Statements in SQL
8.4 Basic Queries in SQL
8.5 More Complex SQL Queries
8.6 Insert, Delete, and Update Statements in SQL
8.7 Additional Features of SQL
8.8 Summary
Review Questions
Exercises
9.6 Database Stored Procedures and SQL/PSM
9.7 Summary
Review Questions
Exercises
Selected Bibliography
PART 3 DATABASE DESIGN THEORY AND METHODOLOGY
CHAPTER 10 Functional Dependencies and Normalization for Relational Databases
10.1 Informal Design Guidelines for Relation Schemas
10.2 Functional Dependencies
10.3 Normal Forms Based on Primary Keys
10.4 General Definitions of Second and Third Normal Forms
CHAPTER 11 Relational Database Design Algorithms and Further Dependencies
11.1 Properties of Relational Decompositions
11.2 Algorithms for Relational Database Schema Design
11.3 Multivalued Dependencies and Fourth Normal Form
11.4 Join Dependencies and Fifth Normal Form
CHAPTER 12 Practical Database Design Methodology and Use of UML Diagrams
12.1 The Role of Information Systems in Organizations
12.2 The Database Design and Implementation Process
12.3 Use of UML Diagrams as an Aid to Database Design Specification
12.4 Rational Rose, A UML Based Design Tool
12.5 Automated Database Design Tools
Selected Bibliography
CHAPTER 13 Disk Storage, Basic File Structures, and Hashing
13.1 Introduction
13.2 Secondary Storage Devices
13.3 Buffering of Blocks
13.4 Placing File Records on Disk
13.5 Operations on Files
13.6 Files of Unordered Records (Heap Files)
13.7 Files of Ordered Records (Sorted Files)
13.9 Other Primary File Organizations
13.10 Parallelizing Disk Access Using RAID Technology
13.11 Storage Area Networks
Review Questions
Selected Bibliography
CHAPTER 14 Indexing Structures for Files
14.1 Types of Single-Level Ordered Indexes
14.2 Multilevel Indexes
14.3 Dynamic Multilevel Indexes Using B-Trees and B+-Trees
14.4 Indexes on Multiple Keys
14.5 Other Types of Indexes
15.5 Implementing Aggregate Operations and Outer Joins
15.6 Combining Operations Using Pipelining
15.7 Using Heuristics in Query Optimization
15.8 Using Selectivity and Cost Estimates in Query Optimization
15.9 Overview of Query Optimization in ORACLE
15.10 Semantic Query Optimization
15.11 Summary
Review Questions
Exercises
Selected Bibliography
CHAPTER 16 Practical Database Design and Tuning
16.1 Physical Database Design in Relational Databases
16.2 An Overview of Database Tuning in Relational Systems
CHAPTER 17 Introduction to Transaction Processing Concepts and Theory
17.1 Introduction to Transaction Processing
17.2 Transaction and System Concepts
17.3 Desirable Properties of Transactions
17.4 Characterizing Schedules Based on Recoverability
17.5 Characterizing Schedules Based on Serializability
CHAPTER 18 Concurrency Control Techniques
18.1 Two-Phase Locking Techniques for Concurrency Control
18.2 Concurrency Control Based on Timestamp Ordering
18.3 Multiversion Concurrency Control Techniques
18.4 Validation (Optimistic) Concurrency Control Techniques
Selected Bibliography
CHAPTER 19 Database Recovery Techniques
19.1 Recovery Concepts
19.2 Recovery Techniques Based on Deferred Update
19.3 Recovery Techniques Based on Immediate Update
19.4 Shadow Paging
19.5 The ARIES Recovery Algorithm
19.6 Recovery in Multidatabase Systems
19.7 Database Backup and Recovery from Catastrophic Failures
19.8 Summary
Review Questions
Exercises
Selected Bibliography
PART 6 OBJECT AND OBJECT-RELATIONAL DATABASES
CHAPTER 20 Concepts for Object Databases
20.1 Overview of Object-Oriented Concepts
20.2 Object Identity, Object Structure, and Type Constructors
20.3 Encapsulation of Operations, Methods, and Persistence
20.4 Type and Class Hierarchies and Inheritance
20.5 Complex Objects
20.6 Other Object-Oriented Concepts
20.7 Summary
Review Questions
Exercises
Selected Bibliography
CHAPTER 21 Object Database Standards, Languages, and Design
21.1 Overview of the Object Model of ODMG
21.2 The Object Definition Language ODL
21.3 The Object Query Language OQL
21.4 Overview of the C++ Language Binding
21.5 Object Database Conceptual Design
CHAPTER 22 Object-Relational and Extended-Relational Systems
22.1 Overview of SQL and Its Object-Relational Features
22.2 Evolution and Current Trends of Database Technology
22.3 The Informix Universal Server
22.4 Object-Relational Features of Oracle 8
22.5 Implementation and Related Issues for Extended Type Systems
22.6 The Nested Relational Model
22.7 Summary
Selected Bibliography
PART 7 FURTHER TOPICS
CHAPTER 23 Database Security and Authorization
23.1 Introduction to Database Security Issues
23.2 Discretionary Access Control Based on Granting and Revoking Privileges
23.3 Mandatory Access Control and Role-Based Access Control for Multilevel Security
23.4 Introduction to Statistical Database Security
23.5 Introduction to Flow Control
23.6 Encryption and Public Key Infrastructures
23.7 Summary
Review Questions
Exercises
Selected Bibliography