This is a special edition of an established title widely used by colleges and universities throughout the world. Pearson published this exclusive edition for the benefit of students outside the United States and Canada. If you purchased this book within the United States or Canada, you should be aware that it has been imported without the approval of the Publisher or author.
Pearson Global Edition
Global Edition
For these Global Editions, the editorial team at Pearson has collaborated with educators across the world to address a wide range of subjects and requirements, equipping students with the best possible learning tools. This Global Edition preserves the cutting-edge approach and pedagogy of the original, but also features alterations, customization and adaptation from the North American version.
Thank you for purchasing a new copy of Database Systems, Sixth Edition. Your textbook includes one year of prepaid access to the book’s Companion Website. This prepaid subscription provides you with full access to the following student support areas:
• online appendices
• tutorials on selected chapters
• DreamHome web implementation
Database Systems
A Practical Approach to Design, Implementation, and Management
Sixth Edition, Global Edition
University of the West of Scotland
Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montréal Toronto Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo
Acquisition, Global Editions: Laura Dent
Acquisitions Editor: Matt Goldstein
Acquisitions Editor, Global Editions: Subhasree Patra
Program Manager: Kayla Smith-Tarbox
Director of Marketing: Christy Lesko
Marketing Manager: Yezan Alayan
Marketing Assistant: Jon Bryant
Director of Production: Erin Gregg
Senior Managing Editor: Scott Disanno
Senior Project Manager: Marilyn Lloyd
Media Producer, Global Editions: M. Vikram Kumar
Project Editor, Global Editions: K.K. Neelakantan
Manufacturing Buyer: Linda Sager
Art Director: Jayne Conte
Cover Designer: Lumina Datamatics
Text Designer: Susan Raymond
Manager, Text Permissions: Tim Nicholls
Text Permission Project Manager: Jenell Forschler
Cover Image: © Africa Studio/Shutterstock
Media Project Manager: Renata Butera
Full-Service Project Management: Vasundhara Sawhney/Cenveo® Publisher Services
Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook appear
on the Credits page at the end of the book.
Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England
and Associated Companies throughout the world
Visit us on the World Wide Web at:
www.pearsonglobaleditions.com
© Pearson Education Limited 2015
The rights of Thomas Connolly and Carolyn Begg to be identified as the authors of this work have been asserted by
them in accordance with the Copyright, Designs and Patents Act 1988.
Authorized adaptation from the United States edition, entitled Database Systems: A Practical Approach to Design, Implementation, and
Management, 6th edition, ISBN 978-0-13-294326-0, by Thomas Connolly and Carolyn Begg, published by Pearson Education © 2015.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written
permission of the publisher or a license permitting restricted copying in the United Kingdom issued by the Copyright
Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.
All trademarks used herein are the property of their respective owners. The use of any trademark in this text does not
vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such trademarks
imply any affiliation with or endorsement of this book by such owners.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
10 9 8 7 6 5 4 3 2 1
15 14 13 12 11
ISBN 10: 1-292-06118-9
ISBN 13: 978-1-292-06118-4
Typeset by Cenveo Publishing Services.
Printed and bound by Courier Westford in The United States of America.
To Sheena, for her patience, understanding, and love.
To our beautiful children Kathryn, Michael and Stephen with all our love.
And to my brother, Francis, who died during the writing of this book.
Thomas M. Connolly
To my past, present, and future students at UWS.
Carolyn E. Begg
Brief Contents
Preface
Chapter 3 Database Architectures and the Web
Chapter 5 Relational Algebra and Relational Calculus
Chapter 10 Database System Development Lifecycle
Chapter 11 Database Analysis and the DreamHome Case Study
Chapter 13 Enhanced Entity–Relationship Modeling
Chapter 16 Methodology: Conceptual Database Design
Chapter 17 Methodology: Logical Database Design
Chapter 18 Methodology: Physical Database Design
Chapter 19 Methodology: Monitoring and Tuning
Chapter 21 Professional, Legal, and Ethical Issues in Data
Chapter 24 Distributed DBMSs: Concepts and Design
Chapter 25 Distributed DBMSs: Advanced Concepts
Chapter 26 Replication and Mobile Databases
Chapter 27 Object-Oriented DBMSs: Concepts and Design
Chapter 28 Object-Oriented DBMSs: Standards and Systems
A Users’ Requirements Specification for DreamHome Case Study
D Summary of the Database Design Methodology
E Introduction to Pyrrho: A Lightweight RDBMS
F File Organizations and Indexes (Online)
H Commercial DBMSs: Access® and Oracle® (Online)
J Estimating Disk Space Requirements (Online)
K Introduction to Object-Oriented Concepts (Online)
References
Index
Contents
1.4.1 Data and Database Administrators
2.1 The Three-Level ANSI-SPARC Architecture
2.2.2 The Data Manipulation Language (DML)
2.2.3 Fourth-Generation Languages (4GLs)
2.3 Data Models and Conceptual Modeling
Part 2 The Relational Model and Languages
4.1 Brief History of the Relational Model
6.3 Data Manipulation
6.3.2 Sorting Results (ORDER BY Clause)
6.3.3 Using the SQL Aggregate Functions
6.3.4 Grouping Results (GROUP BY Clause)
6.3.9 Combining Result Tables (UNION, INTERSECT,
7.3.5 Creating an Index (CREATE INDEX)
7.3.6 Removing an Index (DROP INDEX)
7.4.7 Advantages and Disadvantages of Views
7.5.1 Immediate and Deferred Integrity Constraints
7.6.1 Granting Privileges to Other Users (GRANT)
7.6.2 Revoking Privileges from Users (REVOKE)
9.3 Storing Objects in a Relational Database
9.3.2 Accessing Objects in the Relational Database
9.4 Introduction to Object-Relational Database Systems
9.6 Object-Oriented Extensions in Oracle
Part 3 Database Analysis and Design
10.2 The Database System Development Lifecycle
11.1 When Are Fact-Finding Techniques Used?
11.4 Using Fact-Finding Techniques: A Worked Example
11.4.1 The DreamHome Case Study: An Overview of the
11.4.2 The DreamHome Case Study: Database Planning
11.4.3 The DreamHome Case Study: System Definition
11.4.4 The DreamHome Case Study: Requirements
11.4.5 The DreamHome Case Study: Database Design
12.6 Structural Constraints
12.6.1 One-to-One (1:1) Relationships
12.6.2 One-to-Many (1:*) Relationships
12.6.3 Many-to-Many (*:*) Relationships
12.6.4 Multiplicity for Complex Relationships
12.6.5 Cardinality and Participation Constraints
Generalization to Model the Branch View of the
14.2 How Normalization Supports Database Design
14.3 Data Redundancy and Update Anomalies
14.6 First Normal Form (1NF)
14.9 General Definitions of 2NF and 3NF
15.1.1 Inference Rules for Functional Dependencies
15.1.2 Minimal Sets of Functional Dependencies
15.3 Review of Normalization Up to BCNF
15.4.2 Definition of Fourth Normal Form
16.1 Introduction to the Database Design Methodology
16.1.1 What Is a Design Methodology?
16.1.2 Conceptual, Logical, and Physical Database Design
16.1.3 Critical Success Factors in Database Design
16.2 Overview of the Database Design Methodology
16.3 Conceptual Database Design Methodology
Step 1: Build Conceptual Data Model
Chapter 17 Methodology: Logical Database Design
17.1 Logical Database Design Methodology for
Chapter 18 Methodology: Physical Database Design
18.1 Comparison of Logical and Physical Database Design
18.2 Overview of the Physical Database Design Methodology
18.3 The Physical Database Design Methodology for
Step 3: Translate Logical Data Model for Target DBMS
Step 4: Design File Organizations and Indexes
Chapter 19 Methodology: Monitoring and Tuning
19.1 Denormalizing and Introducing Controlled Redundancy
Step 7: Consider the Introduction of Controlled
Part 5 Selected Database Issues
21.2.6 Access to Information Laws
21.2.7 International Banking: Basel II Accords
21.3 Establishing a Culture of Legal and Ethical
21.3.1 Developing an Organization-Wide Policy for Legal
21.3.2 Professional Organizations and Codes of Ethics
21.3.3 Developing an Organization-Wide Policy for Legal
22.3.5 Recovery in a Distributed DBMS
22.5 Concurrency Control and Recovery in Oracle
22.5.2 Multiversion Read Consistency
(T = R ∪ S, T = R ∩ S, T = R – S)
23.5 Enumeration of Alternative Execution Strategies
23.5.3 Physical Operators and Execution Strategies
23.5.7 Alternative Approaches to Query Optimization
23.5.8 Distributed Query Optimization
23.7.1 Rule-Based and Cost-Based Optimization
Part 6 Distributed DBMSs and Replication
25.1 Distributed Transaction Management
25.4.1 Failures in a Distributed Environment
25.5 The X/Open Distributed Transaction Processing Model
26.3.5 Update Anywhere with Uniform
26.3.6 SI and Uniform Total Order Broadcast Replication
27.2 Introduction to OODBMSs
27.2.1 Definition of Object-Oriented DBMSs
27.2.3 Persistent Programming Languages
27.2.4 Alternative Strategies for Developing an OODBMS
27.7.1 Comparison of Object-Oriented Data Modeling
27.7.2 Relationships and Referential Integrity
27.8 Object-Oriented Analysis and Design with UML
27.8.2 Usage of UML in the Methodology
28.1.2 The Common Object Request Broker Architecture
28.2.3 The Object Definition Language
28.2.5 Other Parts of the ODMG Standard
28.2.6 Mapping the Conceptual Design to a Logical
Part 8 The Web and DBMSs
29.1 Introduction to the Internet and the Web
29.2.4 Static and Dynamic Web Pages
29.4.1 Passing Information to a CGI Script
29.4.2 Advantages and Disadvantages of CGI
29.7.3 Comparison of JDBC and SQLJ
29.7.4 Container-Managed Persistence (CMP)
29.8.2 Active Server Pages and ActiveX Data Objects
30.3.5 XPointer (XML Pointer Language)
30.3.6 XLink (XML Linking Language)
30.3.8 Simple Object Access Protocol (SOAP)
30.3.9 Web Services Description Language (WSDL)
30.3.10 Universal Discovery, Description, and
30.5.1 Extending Lore and Lorel to Handle XML
30.5.3 XQuery: A Query Language for XML
30.5.5 XQuery 1.0 and XPath 2.0 Data Model (XDM)
Part 9 Business Intelligence
31.1.1 The Evolution of Data Warehousing
31.1.3 Benefits of Data Warehousing
31.1.4 Comparison of OLTP Systems
31.1.5 Problems of Data Warehousing
31.3 Data Warehousing Tools and Technologies
31.3.1 Extraction, Transformation, and Loading (ETL)
31.3.4 Administration and Management Tools
31.4.1 Reasons for Creating a Data Mart
31.5 Data Warehousing and Temporal Databases
31.5.1 Temporal Extensions to the SQL Standard
31.6.1 Warehouse Features in Oracle 11g
31.6.2 Oracle Support for Temporal Data
32.1 Designing a Data Warehouse Database
32.2 Data Warehouse Development Methodologies
32.3 Kimball’s Business Dimensional Lifecycle
32.4.1 Comparison of DM and ER Models
32.5 The Dimensional Modeling Stage of Kimball’s
32.5.1 Create a High-Level Dimensional Model
32.5.2 Identify All Dimension Attributes for the
32.6 Data Warehouse Development Issues
32.7 Data Warehousing Design Using Oracle
32.7.1 Oracle Warehouse Builder Components
32.7.2 Using Oracle Warehouse Builder
32.7.3 Warehouse Builder Features in Oracle 11g
33.2 OLAP Applications
33.3.1 Alternative Multidimensional Data Representations
33.3.2 Dimensional Hierarchy
33.3.3 Multidimensional Operations
33.4.2 OLAP Server: Implementation Issues
33.5 OLAP Extensions to the SQL Standard
33.5.1 Extended Grouping Capabilities
33.6.2 Platform for Business Intelligence
34.6.2 Enabling Data Mining Applications
34.6.4 Oracle Data Mining Environment
34.6.5 Data Mining Features in Oracle 11g
A Users’ Requirements Specification
A.1.2 Transaction Requirements (Sample)
A.2.2 Transaction Requirements (Sample)
B.1 The University Accommodation Office Case Study
B.3.2 Transaction Requirements (Sample)
C.1 ER Modeling Using the Chen Notation
C.2 ER Modeling Using the Crow’s Feet Notation
D Summary of the Database Design Methodology
Step 3: Translate Logical Data Model for Target DBMS
Step 4: Design File Organizations and Indexes
Step 5: Design User Views
Step 7: Consider the Introduction of Controlled
Step 8: Monitor and Tune the Operational System
H Commercial DBMSs: Access and Oracle
J Estimating Disk Space Requirements
K Introduction to Object-Oriented Concepts
References
Index
Preface
The history of database research over the past 30 years is one of exceptional productivity that has led to the database system becoming arguably the most important development in the field of software engineering. The database is now the underlying framework of the information system and has fundamentally changed the way many organizations operate. In particular, the developments in this technology over the last few years have produced systems that are more powerful and more intuitive to use. This development has resulted in increasing availability of database systems for a wider variety of users. Unfortunately, the apparent simplicity of these systems has led to users creating databases and applications without the necessary knowledge to produce an effective and efficient system. And so the “software crisis” or, as it is sometimes referred to, the “software depression” continues.
The original stimulus for this book came from the authors’ work in industry, providing consultancy on database design for new software systems or, as often as not, resolving inadequacies with existing systems. In addition, the authors’ move to academia brought similar problems from different users: students. The objectives of this book, therefore, are to provide a textbook that introduces the theory behind databases as clearly as possible and, in particular, to provide a methodology for database design that can be used by both technical and nontechnical readers.
The methodology presented in this book for relational Database Management Systems (DBMSs), the predominant system for business applications at present, has been tried and tested over the years in both industrial and academic environments. It consists of three main phases: conceptual, logical, and physical database design. The first phase starts with the production of a conceptual data model that is independent of all physical considerations. This model is then refined in the second phase into a logical data model by removing constructs that cannot be represented in relational systems. In the third phase, the logical data model is translated into a physical design for the target DBMS. The physical design phase considers the storage structures and access methods required for efficient and secure access to the database on secondary storage.
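As a rough illustration of how the second and third phases differ, the sketch below uses Python's standard sqlite3 module. It assumes a conceptual model in which a client may view many properties and a property may be viewed by many clients. A many-to-many relationship of this kind is one common example of a construct that relational systems cannot represent directly, so the logical design replaces it with a linking relation. The index stands in for a physical design decision about access methods. The table names, column names, and helper functions are hypothetical and are not taken from the book's methodology or case study.

```python
import sqlite3

# Hypothetical tables for illustration only; the names are assumptions,
# not the book's schema.
SCHEMA = """
-- Logical design: the many-to-many relationship "client views property"
-- is replaced by a linking relation that holds two foreign keys.
CREATE TABLE client (
    client_no TEXT PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE property (
    property_no TEXT PRIMARY KEY,
    city        TEXT NOT NULL
);
CREATE TABLE viewing (
    client_no   TEXT NOT NULL REFERENCES client(client_no),
    property_no TEXT NOT NULL REFERENCES property(property_no),
    view_date   TEXT NOT NULL,
    PRIMARY KEY (client_no, property_no, view_date)
);
-- Physical design: an index chosen to support lookups of properties by city.
CREATE INDEX idx_property_city ON property(city);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database from the sketch schema above."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is requested.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def cities_viewed_by(conn: sqlite3.Connection, client_no: str) -> list:
    """Return the distinct cities of the properties a client has viewed."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.city
        FROM viewing AS v
        JOIN property AS p ON p.property_no = v.property_no
        WHERE v.client_no = ?
        ORDER BY p.city
        """,
        (client_no,),
    ).fetchall()
    return [city for (city,) in rows]
```

The logical schema (the three tables) would stay the same for any relational DBMS, whereas the choice of index is the kind of decision that depends on the target system and the expected queries.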
The methodology in each phase is presented as a series of steps. For the inexperienced designer, it is expected that the steps will be followed in the order described, and guidelines are provided throughout to help with this process. For the experienced designer, the methodology can be less prescriptive, acting more as a framework or checklist. To help the reader use the methodology and understand the important issues, the methodology has been described using a realistic worked example, based on an integrated case study, DreamHome. Three additional case studies are also provided in Appendix B to allow readers to try out the methodology for themselves.
UML (Unified Modeling Language)
Increasingly, companies are standardizing the way in which they model data by selecting a particular approach to data modeling and using it throughout their database development projects. A popular high-level data model used in conceptual/logical database design, and the one we use in this book, is based on the concepts of the Entity–Relationship (ER) model. Currently there is no standard notation for an ER model. Most books that cover database design for relational DBMSs tend to use one of two conventional notations:
• Chen’s notation, consisting of rectangles representing entities and diamonds representing relationships, with lines linking the rectangles and diamonds; or
• Crow’s Feet notation, again consisting of rectangles representing entities and lines between entities representing relationships, with a crow’s foot at one end of a line representing a one-to-many relationship.
Both notations are well supported by current Computer-Aided Software Engineering (CASE) tools. However, they can be quite cumbersome to use and a bit difficult to explain. In previous editions, we used Chen’s notation. However, following an extensive questionnaire carried out by Pearson Education, there was a general consensus that the notation should be changed to the latest object-oriented modeling language, called UML (Unified Modeling Language). UML is a notation that combines elements from the three major strands of object-oriented design: Rumbaugh’s OMT modeling, Booch’s Object-Oriented Analysis and Design, and Jacobson’s Objectory.
There are three primary reasons for adopting a different notation: (1) UML is becoming an industry standard; for example, the Object Management Group (OMG) has adopted UML as the standard notation for object methods; (2) UML is arguably clearer and easier to use; and (3) UML is now being adopted within academia for teaching object-oriented analysis and design, and using UML in database modules provides more synergy. Therefore, in this edition we have adopted the class diagram notation from UML. We believe that you will find this notation easier to understand and use.
What’s New in the Sixth Edition
• Extended chapter on database architectures and the Web, covering cloud computing.
• Updated chapter on professional, legal, and ethical issues in IT and databases.
• New section on data warehousing and temporal databases.
• New review questions and exercises at the end of chapters.
• Updated treatment to cover the latest version of the SQL standard, which was released in late 2011 (SQL:2011).
• Revised chapter on replication and mobile databases.
• Updated chapters on Web-DBMS integration and XML.
• Coverage updated to Oracle 11g.
Intended Audience
This book is intended as a textbook for a one- or two-semester course in database management or database design in an introductory undergraduate, graduate, or advanced undergraduate course. Such courses are usually required in an information systems, business IT, or computer science curriculum.
The book is also intended as a reference book for IT professionals, such as systems analysts or designers, application programmers, systems programmers, database practitioners, and for independent self-teachers. Owing to the widespread use of database systems nowadays, these professionals could come from any type of company that requires a database.
It would be helpful for students to have a good background in the file organization and data structures concepts covered in Appendix F before covering the material in Chapter 18 on physical database design and Chapter 23 on query processing. This background ideally will have been obtained from a prior course. If this is not possible, then the material in Appendix F can be presented near the beginning of the database course, immediately following
(2) An easy-to-use, step-by-step methodology for physical database design, covering the mapping of the logical design to a physical implementation, the selection of file organizations and indexes appropriate for the applications, and when to introduce controlled redundancy. Again, there is an integrated case study showing how to use the methodology.
(3) Separate chapters showing how database design fits into the overall database systems development lifecycle, how fact-finding techniques can be used to identify the system requirements, and how UML fits into the methodology.
(4) A clear and easy-to-understand presentation, with definitions clearly highlighted, chapter objectives clearly stated, and chapters summarized. Numerous examples and diagrams are provided throughout each chapter to illustrate the concepts. There is a realistic case study integrated throughout the book and additional case studies that can be used as student projects.
(5) Extensive treatment of the latest formal and de facto standards: Structured Query Language (SQL), Query-By-Example (QBE), and the Object Data Management Group (ODMG) standard for object-oriented databases.
(6) Three tutorial-style chapters on the SQL standard, covering both interactive and embedded SQL.
(7) A chapter on legal, professional and ethical issues related to IT and databases.
(8) Comprehensive coverage of the concepts and issues relating to distributed DBMSs and replication servers.
(9) Comprehensive introduction to the concepts and issues relating to object-based DBMSs including a review of the ODMG standard and a tutorial on the object management facilities within the latest release of the SQL standard, SQL:2011.
(10) Extensive treatment of the Web as a platform for database applications with many code samples of accessing databases on the Web. In particular, we cover persistence through Container-Managed Persistence (CMP), Java Data Objects (JDO), Java Persistence API (JPA), JDBC, SQLJ, ActiveX Data Objects (ADO), ADO.NET, and Oracle PL/SQL Pages (PSP).
(11) An introduction to semistructured data and its relationship to XML and extensive coverage of XML and its related technologies. In particular, we cover XML Schema, XQuery, and the XQuery Data Model and Formal Semantics. We also cover the integration of XML into databases and examine the extensions added to SQL:2008 and SQL:2011 to enable the publication of XML.
(12) Comprehensive introduction to data warehousing, Online Analytical Processing (OLAP), and data mining.
(13) Comprehensive introduction to dimensionality modeling for designing a data warehouse database. An integrated case study is used to demonstrate a methodology for data warehouse database design.
(14) Coverage of DBMS system implementation concepts, including concurrency and recovery control, security, and query processing and query optimization.
Pedagogy
Before starting to write any material for this book, one of the objectives was to produce a textbook that would be easy for the readers to follow and understand,