Physical Database Design
The Database Professional's Guide to Exploiting Indexes, Views, Storage, and More
Sam Lightstone, Toby Teorey, and Tom Nadeau
Project Manager: Marilyn E. Rash
Assistant Editor: Asma Palmeiro
Cover Image: Nordic Photos
Composition: Multiscience Press, Inc.
Interior Printer: Sheridan Books
Cover Printer: Phoenix Color Corp.
Morgan Kaufmann Publishers is an imprint of Elsevier.
500 Sansome Street, Suite 400, San Francisco, CA 94111
This book is printed on acid-free paper.
Copyright © 2007 by Elsevier Inc. All rights reserved.
Designations used by companies to distinguish their products are often claimed as trademarks or registered trademarks. In all instances in which Morgan Kaufmann Publishers is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means—electronic, mechanical, photocopying, scanning, or otherwise—without prior written permission of the publisher.
Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830; fax: (+44) 1865 853333; e-mail: permissions@elsevier.com. You may also complete your request on-line via the Elsevier homepage (http://elsevier.com) by selecting "Support & Contact," then "Copyright and Permission," and then "Obtaining Permissions."
Page 197: "Make a new plan, Stan, and get yourself free." —Paul Simon, copyright © Sony BMG/Columbia Records. All rights reserved. Used with permission.
Library of Congress Cataloging-in-Publication Data
Lightstone, Sam.
Physical database design : the database professional's guide to exploiting indexes, views, storage, and more / Sam Lightstone, Toby Teorey, and Tom Nadeau.
p. cm. (The Morgan Kaufmann series in database management systems)
Includes bibliographical references and index.
ISBN-13: 978-0-12-369389-1 (alk. paper)
ISBN-10: 0-12-369389-6 (alk. paper)
1. Database design. I. Teorey, Toby J. II. Nadeau, Tom, 1958– . III. Title.
Contents

Literature Summaries and Bibliography
Acknowledgments
1.1 Motivation—The Growth of Data and Increasing Relevance of Physical Database Design
1.3 Elements of Physical Design: Indexing, Partitioning, and Clustering
2.2.1 Composite Index Approach
3.1 Query Processing and Optimization
3.2 Useful Optimization Features in Database Systems
3.2.1 Query Transformation or Rewrite
3.2.2 Query Execution Plan Viewing
4.1 Indexing Concepts and Terminology
4.1.2 Access Methods for Indexes
5.3 Exploiting Grouping and Generalization
5.5 Examples: The Good, the Bad, and the Ugly
6.4 Pros and Cons of Shared Nothing
6.6 Design Challenges: Skew and Join Collocation
6.8.2 Logical Nodes versus Physical Nodes
… Partitioning with Multidimensional Clustering
8.3 Not Just Query Performance: Designing …
8.4 Examples of Queries Benefiting from MDC
9.1 Strong and Weak Dependency Analysis
9.3 Impact-first Waterfall Strategy
9.4 Greedy Algorithm for Change Management
10.1.4 Counting for Shared-nothing …
10.2.1 The Benefits of Sampling with SQL
10.2.2 Sampling for Database Design
11.1 Getting from Query Text to Result Set
11.2 What Do Query Execution Plans Look Like?
11.4 Exploring Query Execution Plans to Improve …
11.5 Query Execution Plan Indicators for Improved …
11.6 Exploring without Changing the Database
11.7 Forcing the Issue When the Query Optimizer …
11.7.1 Three Essential Strategies
11.7.2 Introduction to Query Hints
11.7.3 Query Hints When the SQL Is Not Available …
12 Automated Physical Database Design
12.1 What-if Analysis, Indexes, and Beyond
12.2 Automated Design Features from Oracle, …
12.2.2 Microsoft SQL Server Database …
12.2.3 Oracle SQL Access Advisor
12.3 Data Sampling for Improved Statistics during Analysis
12.4 Scalability and Workload Compression
12.5 Design Exploration between Test …
12.6 Experimental Results from Published Literature
12.9 Multidimensional Clustering Selection
12.10 Shared-nothing Partitioning
13.1 What You Need to Know about CPU Architecture …
13.3 Symmetric Multiprocessors and NUMA
13.3.1 Symmetric Multiprocessors and NUMA
13.3.2 Cache Coherence and False Sharing
13.7.10 Which RAID Is Right for Your …
13.8 Balancing Resources in a Database Server
13.9 Strategies for Availability and Recovery
13.10 Main Memory and Database Tuning
13.10.1 Memory Tuning by Mere Mortals
13.10.2 Automated Memory Tuning
13.10.3 Cutting Edge: The Latest Strategy in Self-tuning Memory Management
14.6 DSS, Warehousing, and OLAP Design Considerations
14.7 Usage Syntax and Examples for Major Database Servers
14.7.1 Oracle
14.7.2 Microsoft's Analysis Services
15.4.1 Requirements Specification
15.4.2 Logical Design
15.4.3 Schema Refinement Using Denormalization
A.1 I/O Time Cost—Individual Block Access
A.2 I/O Time Cost—Table Scans and Sorts
… with Oracle Data Guard for Database …
To my wife and children, Elisheva, Hodaya, and Avishai
Preface
Since the development of the relational model by E. F. Codd at IBM in 1970, relational databases have become the de facto standard for managing and querying structured data. The rise of the Internet, online transaction processing, online banking, and the ability to connect heterogeneous systems have all contributed to the massive growth in data volumes over the past 15 years. Terabyte-sized databases have become commonplace. Concurrent with this data growth have come dramatic increases in CPU performance spurred by Moore's Law, and improvements in disk technology that have brought about a dramatic increase in data density for disk storage. Modern databases frequently need to support thousands if not tens of thousands of concurrent users. The performance and maintainability of database systems depend dramatically on their physical design.

A wealth of technologies has been developed by leading database vendors allowing for a fabulous range of physical design features and capabilities. Modern databases can now be sliced, diced, shuffled, and spun in a magnificent set of ways, both in memory and on disk. Until now, however, not much has been written on the topic of physical database design. While it is true that there have been white papers and articles about individual features and even individual products, relatively little has been written on the subject as a whole. Even less has been written to commiserate with database designers over the practical difficulties that the complexity of "creeping featurism" has imposed on the industry. This is all the more reason why a text on physical database design is urgently needed.

We've designed this new book with a broad audience in mind, with both students of database systems and industrial database professionals clearly within its scope. In it we introduce the major concepts in physical database design, including indexes (B+tree, hash, bitmap), materialized views (deferred and immediate), range partitioning, hash partitioning, shared-nothing design, multidimensional clustering, server topologies, data distribution, underlying physical subsystems (NUMA, SMP, MPP, SAN, NAS, RAID devices), and much more. In keeping with our goal of writing a book that has appeal to students and database professionals alike, we have tried to concentrate the focus on practical issues and real-world solutions.
In every market segment and in every usage of relational database systems, the problems of physical database design are a critical concern: from online transaction processing (OLTP), to data mining (DM), to multidimensional online analytical processing (MOLAP), to enterprise resource planning (ERP), to management resource planning (MRP), and both in in-house enterprise systems designed and managed by teams of database administrators (DBAs) and in deployed independent software vendor applications (ISVAs). We hope that the focus on physical database design, usage examples, product-specific syntax, and best practice will make this book a very useful addition to the database literature.
Organization
An overview of physical database design and where it fits into the database life cycle appears in Chapter 1. Chapter 2 presents the fundamentals of B+tree indexing, the most popular indexing method used in the database industry today. Both simple indexing and composite indexing variations are described, and simple performance measures are used to help compare the different approaches. Chapter 3 is devoted to the basics of query optimization and query execution plan selection from the viewpoint of what a database professional needs to know as background for database design.

Chapters 4 through 8 discuss the individual important design decisions needed for physical database design. Chapter 4 goes into the details about how index selection is done, and what alternative indexing strategies one has to choose from for both selection and join operations. Chapter 5 describes how one goes about choosing materialized views for individual relational databases as well as setting up star schemas for collections of databases in data warehouses. The tradeoffs involved in materialized view selection are illustrated with numerical examples. Chapter 6 explains how to do shared-nothing partitioning to divide and conquer large and computationally complex database problems. The relationship between shared-nothing partitioning, materialized view replication, and indexing is presented.
Chapter 7 is devoted to range partitioning, dividing a large table into multiple smaller tables that hold a specific range of data, and the special indexing problems that need to be addressed. Chapter 8 discusses the benefits of clustering data in general, and how powerful this technique can be when extended to multidimensional data. This allows a system to cluster along multiple dimensions at the same time without duplicating data.

Chapter 9 discusses the problem of integrating the many physical design decisions by exploring how each decision affects the others, and leads the designer into ways to optimize the design over these many components. Chapter 10 looks carefully at methods of counting and sampling data that help improve the individual techniques of index design, materialized view selection, clustering, and partitioning. Chapter 11 goes more thoroughly into query execution plan selection by discussing tools that allow users to look at the query execution plans and observe whether database decisions on design choices, such as index selection and materialized views, are likely to be useful.
Chapter 12 contains a detailed description of how many of the important physical design decisions are automated by the major relational databases—DB2, SQL Server, and Oracle. It discusses how to use these tools to design efficient databases more quickly. Chapter 13 brings the database designer in touch with the many system issues they need to understand: multiprocessor servers, disk systems, network topologies, disaster recovery techniques, and memory management.
Chapter 14 discusses how physical design is needed to support data warehouses and the OLAP techniques for efficient retrieval of information from them. Chapter 15 defines what is meant by denormalization and illustrates the tradeoffs between degree of normalization and database performance. Finally, Chapter 16 looks at the basics of distributed data allocation strategies, including the tradeoffs between the fast query response times due to data replication and the time cost of updates of multiple copies of data.
Appendix A briefly describes a simple computational performance model used to evaluate and compare different physical design strategies on individual databases. The model is used to clarify the tradeoff analysis and design decisions used in physical design methods in several chapters. Appendix B includes a comparison of two commercially available disaster-recovery technologies—IBM's High Availability Disaster Recovery and Oracle's Data Guard.
Each chapter has a tips and insights section for the database professional that gives the reader a useful summary of the design highlights of the chapter. This is followed by a literature summary for further investigation of selected topics on physical design.
Usage Examples
One of the major differences between logical and physical design is that with physical design the underlying features and physical attributes of the database server (its software and its hardware) begin to matter much more. While logical design can be performed in the abstract, somewhat independent of the products and components that will be used to materialize the design, the same cannot be said for physical design. For this reason we have made a deliberate effort to include examples of physical database design from the major database server products throughout this book. In this set we include DB2 for zOS v8.1, DB2 9 (Linux, Unix, and Windows), Oracle 10g, SQL Server 2005, Informix Dataserver, and NCR Teradata. We believe that this covers the vast majority of industrial databases in use today. Some popular databases are conspicuously absent, such as MySQL and Sybase, which were excluded simply to constrain the authoring effort.
Literature Summaries and Bibliography
Following the style of our earlier text on logical database design, Database Modeling and Design: Logical Design, Fourth Edition, each chapter concludes with a literature summary. These summaries include the major papers and references for the material covered in the chapter, specifically in two forms:
• Seminal papers that represent the original breakthrough thinking for the physical database design concepts discussed in the chapter
• Major papers on the latest research and breakthrough thinking
In addition to the chapter-centric literature summaries, a larger, more comprehensive bibliography is included at the back of this book.
Feedback and Errata
If you have comments, we would like to hear from you. In particular, it's very valuable for us to get feedback both on changes that would improve the book and on errors in the current content. To make this possible we've created an e-mail address for dialogue with our readers: please write to us at db-design@rogers.com.
Has everyone noticed that all the letters of the word database are typed with the left hand? Now the layout of the QWERTY typewriter keyboard was designed, among other things, to facilitate the even use of both hands. It follows, therefore, that among other things, writing about databases is not only unnatural, but a lot harder than it appears.
—Anonymous
While this quip may appeal to the authors, who had to personally suffer through left-hand-only typing of the word database several hundred times in the authoring of this book,¹ if you substitute the words "writing about databases" with "designing databases" …

¹ Confession of a bad typist: I use my right hand for the t and b. This is an unorthodox but necessary variation for people who need to type the word "database" dozens of times per day.
Acknowledgments

We also would like to thank the reviewers of this book, who provided a number of extremely valuable insights. Their in-depth reviews and new directions helped us produce a much better text. Thank you to Mike Blaha, Philippe Bonnet, Philipe Carino, and Patrick O'Neil. Thank you as well to the concept reviewers Bob Muller, Dorian Pyle, James Bean, Jim Gray, and Michael Blaha.
We would like to thank our wives and children for their support and for allowing us the time to work on this project, often into the wee hours of the morning.
To the community of students and database designers worldwide, we salute you. Your job is far more challenging and complex than most people realize. Each of the possible design attributes in a modern relational database system is very complex in its own right. Tackling all of them, as real database designers must, is a remarkable challenge that by all accounts ought to be impossible for mortal human beings. In fact, optimal database design can be shown mathematically to truly be impossible for any moderately involved system. In one analysis we found that the possible design choices for an average database far exceeded the current estimates of the number of atoms in the universe (10^81) by several orders of magnitude! And yet, despite the massive complexity and sophistication of modern database systems, you have managed to study them, master them, and continue to design them. The world's data is literally in your hands. We hope this book will be a valuable tool for you. By helping you, the students and designers of database systems, we hope this book will also lead in a small, incremental, but important way to improvements in the world's data management infrastructure.
Engineering is a great profession. There is the satisfaction of watching a figment of the imagination emerge through the aid of science to a plan on paper. Then it moves to realization in stone or metal or energy. Then it brings homes to men or women. Then it elevates the standard of living and adds to the comforts of life. This is the engineer's high privilege.
—Herbert Hoover (1874–1964)
The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents.
—Nathaniel Borenstein (1957– )
1 Introduction to Physical Database Design

There was a great debate at the annual ACM SIGFIDET (now SIGMOD) meeting in Ann Arbor, Michigan, in 1974 between Ted Codd, the creator of the relational database model, and Charlie Bachman, the technical creative mind behind the network database model and the subsequent CODASYL report. The debate centered on which logical model was the best database model, and it continued in the academic journals and trade magazines for almost 30 more years, until Codd's death in 2003. Since that original debate, many database systems have been built to support each of these models, and although the relational model eventually dominated the database industry, the underlying physical database structures used by both types of systems were actually evolving in sync. Originally the main decision for physical design was the type of indexing the system was able to do, with B+tree indexing eventually dominating the scene for almost all systems. Later, other concepts like clustering and partitioning became important, but these methods were becoming less and less related to the logical structures being debated in the 1970s.
Logical database design, that is, the design of basic data relationships and their definition in a particular database system, is largely the domain of application designers and programmers. The work of these designers can effectively be done with tools, such as ERwin Data Modeller or Rational Rose with UML, as well as with a purely manual approach. Physical database design, the creation of efficient data storage and retrieval mechanisms on the computing platform you are using, is typically the domain of the database administrator (DBA), who has a variety of vendor-supplied tools available today to help design the most efficient databases. This book is devoted to the physical design methodologies and tools most popular for relational databases today. We use examples from the most common systems—Oracle, DB2 (IBM), and SQL Server (Microsoft)—to illustrate the basic concepts.
1.1 Motivation—The Growth of Data and Increasing Relevance of Physical Database Design
Does physical database design really matter? Absolutely. Some computing professionals currently run their own consulting businesses doing little else than helping customers improve their table indexing design. Impressive as this is, what is equally astounding are claims about improving the performance of problem queries by as much as 50 times. Physical database design is really motivated by data volume. After all, a database with a few rows of data really has no issues with physical database design, and the performance of applications that access a tiny database cannot be deeply affected by the physical design of the underlying system. In practical terms, index selection really does not matter much for a database with 20 rows of data. However, as data volumes rise, the physical structures that underlie its access patterns become increasingly critical.
A number of factors are spurring the dramatic growth of data in all three of its forms: structured (relational tuples), semistructured (e.g., XML), and unstructured data (e.g., audio/video). Much of the growth can be attributed to the rapid expansion and ubiquitous use of networked computers and terminals in every home, business, and store in the industrialized world. The data volumes are now taking a further leap forward with the rapid adoption of personal communication devices like cell phones and PDAs, which are also networked and used to share data. Databases measured in the tens of terabytes have now become commonplace in enterprise systems. Following the mapping of the human genome's three billion chemical base pairs, pharmaceutical companies are now exploring genetic engineering research based on the networks of proteins that overlay the human genomes, resulting in data analysis on databases several petabytes in size (a petabyte is one thousand terabytes, or one million gigabytes). Table 1.1 shows data from a 1999 survey performed by the University of California at Berkeley. You can see in this study that the data stored on magnetic disk is growing at a rate of 100% per year for departmental and enterprise servers. In fact, nobody is sure exactly where the growth patterns will end, or if they ever will.

There's something else special that has happened that's driving up the data volumes. It happened so quietly that seemingly nobody bothered to mention it, but the change is quantitative and profound. Around the year 2000 the price of storage dropped to a point where it became cheaper to store data on computer disks than on paper (Figure 1.1). In fact, this probably was a great turning point in the history of the development of western civilization. For over 2,000 years civilization has stored data in written text—on parchment, papyrus, or paper. Suddenly and quietly that paradigm has begun to sunset. Now the digitization of text is not only of interest for sharing and analysis, but it is also more economical.
Table 1.1 Worldwide Production of Original Content, Stored Digitally, in Terabytes. (Source: University of California at Berkeley study, 1999.)

The dramatic growth patterns change the amount of data that relational database systems must access and manipulate, but they do not change the speed at which operations must complete. In fact, to a large degree, the execution goals for data processing systems are defined more by human qualities than by computers: the time a person is willing to wait for a transaction to complete while standing at an automated banking machine, or the number of available off-peak hours between closing time of a business in the evening and the resumption of business in the morning. These are constraints that are defined largely by what humans expect, and they are quite independent of the data volumes being operated on. While data volumes and analytic complexity are growing rapidly, our expectations as humans are changing at a much slower rate. Some relief is found in the increasing power of modern data servers, because as the data volumes grow, the computing power behind them is increasing as well. However, the phenomenon of increasing processing power is mitigated by the need to consolidate server technology to reduce IT expenses; as a result, as servers grow in processing power they are often used for an increasing number of purposes rather than being used to perform a single purpose faster.

Although CPU power has been improving following Moore's Law, doubling roughly every 18 months since the mid-1970s, disk speeds have been increasing at a more modest pace (see Chapter 13 for a more in-depth discussion of Moore's Law). Finally, data is increasingly being used to detect "information," not just process "data," and the rise of online analytical processing (OLAP), data mining, and other forms of business intelligence computing has led to a dramatic increase in the complexity of the processing that is required.

Figure 1.1 Storage price. (Source: IBM Research.)
These factors motivate the need for complex and sophisticated approaches to physical database design. Why? By exploiting design techniques a practitioner can reduce the processing time for operations, in some cases by several orders of magnitude. Improving computational efficiency by a thousand times is real, and valuable; and when you're waiting at the bank machine to get your money, or waiting for an analysis of business trading that will influence a multimillion-dollar investment decision, it's downright necessary.
1.2 Database Life Cycle
The database life cycle incorporates the basic steps involved in designing a logical database from conceptual modeling of user requirements through database management system (DBMS)-specific table definitions, and a physical database that is indexed, partitioned, clustered, and selectively materialized to maximize real performance. For a distributed database, physical design also involves allocating data across a computer network. Once the design is completed, the life cycle continues with database implementation and maintenance. The database life cycle is shown in Figure 1.2. Physical database design (step 3 below) is defined in the context of the entire database life cycle to show its relationship to the other design steps.

Figure 1.2 Database life cycle.

1. Requirements analysis. The database requirements are determined by interviewing both the producers and users of data and producing a formal requirements specification. That specification includes the data required for processing, the natural data relationships, and the software platform for the database implementation.

2. Logical database design. Logical database design develops a conceptual model of the database from a set of user requirements and refines that model into normalized SQL tables. The goal of logical design is to capture the reality of the user's world in terms of data elements and their relationships so that queries and updates to that data can be programmed easily. The global schema, a conceptual data model diagram that shows all the data and their relationships, is developed using techniques such as entity-relationship (ER) modeling or the Unified Modeling Language (UML). The data model constructs must ultimately be integrated into a single global schema and then transformed into normalized SQL tables. Normalized tables (particularly third normal form, or 3NF) are tables that are decomposed or split into smaller tables to eliminate loss of data integrity due to certain delete commands.

We note here that some database tool vendors use the term logical model to refer to the conceptual data model, and they use the term physical model to refer to the DBMS-specific implementation model (e.g., SQL tables). We also note that many conceptual data models are obtained not from scratch, but from the process of reverse engineering from an existing DBMS-specific schema [Silberschatz 2006]. Our definition of the physical model is given below.

3. Physical database design. The physical database design step involves the selection of indexes, partitioning, clustering, and selective materialization of data. Physical database design (as treated in this book) begins after the SQL tables have been defined and normalized. It focuses on the methods of storing and accessing those tables on disk that enable the database to operate with high efficiency. The goal of physical design is to maximize the performance of the database across the entire spectrum of applications written on it. The physical resources that involve time delays in executing database applications include the CPU, I/O (e.g., disks), and computer networks. Performance is measured by the time delays to answer a query or complete an update for an individual application, and also by the throughput (in transactions per second) for the entire database system over the full set of applications in a specified unit of time.

4. Database implementation, monitoring, and modification. Once the logical and physical design is completed, the database can be created through implementation of the formal schema using the data definition language (DDL) of a DBMS. Then the data manipulation language (DML) can be used to query and update the database, as well as to set up indexes and establish constraints such as referential integrity. The language SQL contains both DDL and DML constructs; for example, the "create table" command represents DDL, and the "select" command represents DML, as sketched below.

As the database begins operation, monitoring indicates whether performance requirements are being met. If they are not being satisfied, modifications should be made to improve performance. Other modifications may be necessary when requirements change or end-user expectations increase with good performance. Thus, the life cycle continues with monitoring, redesign, and modifications.
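To make the DDL/DML split concrete, here is a minimal sketch in generic SQL. The supplier table and its columns are illustrative only, and exact data types vary slightly by product.

    /* DDL: define schema objects (a table and an index) */
    CREATE TABLE supplier
       (snum   INTEGER NOT NULL PRIMARY KEY,
        sname  VARCHAR(30),
        city   VARCHAR(30));

    CREATE INDEX supplierCity ON supplier (city);

    /* DML: query and change the data itself */
    SELECT sname FROM supplier WHERE city = 'Toronto';
    UPDATE supplier SET city = 'Boston' WHERE snum = 101;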
1.3 Elements of Physical Design: Indexing, Partitioning, and Clustering
The physical design of a database starts with the schema definition of logical records produced by the logical design phase. A logical record (or record) is a named collection of data items or attributes treated as a unit by an application program. In storage, a record includes the pointers and record overhead needed for identification and processing by the database management system. A file is typically a set of similarly constructed records of one type, and relational tables are typically stored as files. A physical database is a collection of interrelated records of different types, possibly including a collection of interrelated files. Query and update transactions to a database are made efficient by the implementation of certain search methods as part of the database management system.

1.3.1 Indexes
An index is a data organization set up to speed up the retrieval (query) of data from tables. In database management systems, indexes can be specified by database application programmers using SQL commands such as the following:

    CREATE UNIQUE INDEX supplierNum ON supplier(snum);  /* unique index on a key */
A unique index is a data structure (table) whose entries (records) consist of (attribute value, pointer) pairs, such that each pointer contains the block address of an actual database record that has the associated attribute value as an index key value. This is known as an ordered index because the attribute (key) values in the index are ordered as ASCII values. If all the key values are letters, then the ordering is strictly alphabetical. Ordered indexes are typically stored as B+trees so that the search for the matching key value is fast. Once the key value and corresponding data block pointer are found, there is one more step to access the block containing the record you want, and a quick search in memory of that block to find the record.
Sometimes data is better accessed by an attribute other than a key, an attribute that typically has the same value appear in many records. In a unique index based on a key, the key has a unique value in every record. For a nonunique attribute, an index must have multiple (attribute value, pointer) pairs for the same attribute value, and each pointer has the block address of a record that has one of those attribute values. In the B+tree index, the leaf nodes contain these (attribute value, pointer) pairs, which must be searched to find the records that match the attribute value. The SQL command for this kind of index, also called a secondary index, is:

    CREATE INDEX shippingDate ON shipment (shipdate);  /* secondary index on a nonkey */
In a variation of the secondary or nonunique index, it is possible to set up a collection of attribute values that you want to use to query a table. Each entry in the index consists of a set of attribute values and a block pointer to the record that contains exact matches for all those attribute values in the set. An example of an SQL command to set up this kind of index is:

    CREATE INDEX shipPart ON shipment (pnum, shipdate);  /* secondary concatenated index */
This kind of index is extremely efficient for queries involving both a part number (pnum) and a shipping date (shipdate). For queries involving just one of these attributes, it is less efficient because of its greater size and therefore longer search time.
When we want to improve the query time for a table of data, say for instance the table we access via the nonunique index on ship dates, we could organize the database so that equivalent ship dates are stored near each other (on disk), and ship dates that are close to each other in value are stored near each other. This type of index is called a clustered index; otherwise the index is known as a nonclustered index. There can only be one clustered index per table, because the physical organization of the table must be fixed.
When the physical database table is unordered, it can be organized for efficient access using a hash table index, often simply known as a hash index. This type of index is most frequently based on a key that has unique values in the data records. The attribute (key) value is passed through a function that maps it to a starting block address, known as a bucket address. The table must be set up by inserting all the records according to the hash function, and then using the same hash function to query the records later.
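Products expose hashing in different ways. Oracle, for example, provides hash clusters; the following is only a sketch, and the SIZE and HASHKEYS settings are placeholder values that would be tuned to the expected data volume.

    CREATE CLUSTER supplierCluster (snum INTEGER)
       SIZE 512 HASHKEYS 10000;        /* illustrative sizing only */

    CREATE TABLE supplier
       (snum  INTEGER PRIMARY KEY,
        sname VARCHAR2(30))
       CLUSTER supplierCluster (snum); /* rows placed by hash of snum */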
Another variation of indexing, a bitmap index, is commonly used for secondary indexing with multiple attribute values, and for very large databases in data warehouses. A bitmap index consists of a collection of bit vectors, with each bit vector corresponding to a particular attribute value; for each record in the table, the bit is "1" if that record has the designated value, and "0" if it does not. This is particularly useful if an attribute is sparse, that is, it has very few possible values, like gender or course grade. It would not work well for attributes like last name, job title, age, and so on. Bit vectors can be stored and accessed very efficiently, especially if they are small enough to be located in memory.
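In Oracle syntax, for instance, a bitmap index on a sparse attribute of a hypothetical employee table would look like this:

    CREATE BITMAP INDEX empGender ON employee (gender);
    /* one bit vector per distinct gender value */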
The analysis and design of indexes are discussed in detail in Chapters 2 and 4.
1.3.2 Materialized Views

… be taken into account in their design and usage. This is discussed in more detail in Chapter 5.
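As a flavor of the feature, here is a sketch in Oracle syntax; the sales table, its columns, and the refresh options chosen are illustrative only.

    CREATE MATERIALIZED VIEW salesByRegion
       BUILD IMMEDIATE               /* populate now */
       REFRESH COMPLETE ON DEMAND    /* recompute when asked */
       ENABLE QUERY REWRITE          /* let the optimizer substitute it */
    AS SELECT region, SUM(amount) AS total
       FROM sales
       GROUP BY region;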
1.3.3 Partitioning and Multidimensional Clustering
Partitioning in physical database design is a method for reducing the workload on any one hardware component, like an individual disk, by partitioning (dividing) the data over several disks. This has the effect of balancing the workload across the system and preventing bottlenecks. In range partitioning, the data attribute values are sorted and ranges of values are selected so that each range has roughly the same number of records. Records in a given range are allocated to a specific disk so they can be processed independently of other ranges. The details of partitioning across disks are discussed in Chapter 7.
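As a sketch of how this is declared, Oracle expresses range partitioning in the table definition itself; the shipment columns and partition boundaries below are illustrative:

    CREATE TABLE shipment
       (shipnum  INTEGER,
        pnum     INTEGER,
        shipdate DATE)
    PARTITION BY RANGE (shipdate)
       (PARTITION ship2005 VALUES LESS THAN (TO_DATE('01-JAN-2006','DD-MON-YYYY')),
        PARTITION ship2006 VALUES LESS THAN (TO_DATE('01-JAN-2007','DD-MON-YYYY')),
        PARTITION shipmax  VALUES LESS THAN (MAXVALUE));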
Multidimensional clustering (MDC) is a technique by which data can be clustered by dimensions, such as location, timeframe, or product type. In particular, MDC allows data to be clustered by many dimensions at the same time, such as ice skates sold in Wisconsin during the month of December. The clusters are meant to take advantage of known and anticipated workloads on this data. MDC is developed in detail in Chapter 8.
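In DB2, an MDC table is declared with an ORGANIZE BY DIMENSIONS clause; the sales table below is hypothetical:

    CREATE TABLE sales
       (saledate    DATE,
        region      CHAR(2),
        producttype CHAR(8),
        amount      DECIMAL(10,2))
    ORGANIZE BY DIMENSIONS (region, producttype);
    /* rows are physically grouped into blocks, one per distinct
       (region, producttype) combination */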
1.3.4 Other Methods for Physical Database Design
There are many other ways to make data access more efficient in a database For
instance, data compression is a technique that allows more data to fit into a fixed amount
of space (on disk) and therefore accessed faster if data needs to be scanned a lot Theoverhead for compression is in the algorithm to transform the original data into thecompressed form for storage, and then to transform the compressed form back to theoriginal for display purposes
Data striping, or just striping, is a technique for distributing data that needs to be
accessed together across multiple disks to achieve a greater degree of parallelism andload balancing, both of which makes system throughput increase and generally lowersquery times This is particularly suited to disk array architectures like RAID (redundantarrays of independent disks) where data can be accessed in parallel across multiple disks
in an organized way
Another way to improve database reliability includes data redundancy techniques
like mirroring, in which data is duplicated on multiple disks The downside of
redun-dancy is having to update multiple copies of data each time a change is required in thedatabase, as well as the extra storage space required Storage space is getting cheaperevery day, but time is not On the other hand, data that is never or infrequently updatedmay lend itself nicely to be stored redundantly
Trang 321.4 Why Physical Design Is Hard 11
As part of the physical design, the global schema can sometimes be refined in limitedways to reflect processing (query and transaction) requirements if there are obvious large
gains to be made in efficiency This is called denormalization It consists of selecting
dom-inant processes on the basis of high frequency, high volume, or explicit priority; definingsimple extensions to tables that will improve query performance; evaluating total cost forquery, update, and storage; and considering the side effects, such as possible loss of integ-rity Details are given in Chapter 15
1.4 Why Physical Design Is Hard
Physical database design involves dozens and often hundreds of variables, which are difficult to keep track of, especially when their effects are very often interrelated with the various design solutions proposed. The individual computations of performance based on a given index mechanism or partition algorithm may take several hours by hand, and performance analysis is often based on the comparison of many different configurations and load conditions, thus requiring thousands of computations. This has given rise to automated tools such as IBM's DB2 Design Advisor, Oracle's SQL Access Advisor, Oracle's SQL Tuning Advisor, and Microsoft's Database Tuning Advisor (DTA), formerly known as the Index Tuning Wizard. These tools make database tuning and performance analysis manageable, allowing the analyst to focus on solutions and tradeoffs while taking care of the myriad of computations that are needed. We will look at both manual analysis and automatic design tools for physical database design in this book.
TIPS AND INSIGHTS FOR DATABASE PROFESSIONALS

• Tip 1. The truth is out there, but you may not need it. Every database has a theoretically perfect, or "optimal," physical design. In reality almost nobody ever finds it, because the search complexity is too high and the validation process too cumbersome. Database design is a really hard problem. However, the complexity is mitigated by the practical fact that at the end of the day what matters most is not whether the database performance is as good as it can theoretically be, but whether the applications that use the database perform "good enough" so that their users are satisfied. Good enough is a vague and subjective definition, of course. In most cases, while the perfect database design is usually elusive, one that performs at more than 85% of optimal can be achieved by mere mortals.

• Tip 2. Be prepared to tinker. The possibilities are endless, and you will never be able to explore them all. But with some wisdom and insight, and some playing around with possibilities, you can go far. Trial and error is part of the process.

• Tip 3. Use the tools at your disposal. Throughout this book we will describe various techniques and methods for physical database design. Many database designs perform an order of magnitude worse than they could, simply because the designer didn't bother to use the techniques available. Database design does not begin and end with simple single-column index selection. By exploiting features like memory tuning, materialized views, range partitioning, multidimensional clustering, clustering indexes, or shared-nothing partitioning, you can dramatically improve on a basic database design, especially for complex query processing.

1.5 Literature Summary
Database system and design textbooks and practitioners' guides that give serious attention to the principles of physical database design include Burleson [2005], Elmasri and Navathe [2003], Garcia-Molina, Ullman, and Widom [2000, 2001], Ramakrishnan and Gehrke [2004], Shasha and Bonnet [2003], and Silberschatz, Korth, and Sudarshan [2006].
Knowledge of logical data modeling and physical database design techniques is important for database practitioners and application developers. The database life cycle shows what steps are needed in a methodical approach to database design, from logical design, which is independent of the system environment, to physical design, which is based on maximizing the performance of the database under various workloads.
Agarwal, S., Chaudhuri, S., Kollar, L., Maranthe, A., Narasayya, V., and Syamala, M. "Database Tuning Advisor for Microsoft SQL Server 2005." Proceedings of the 30th Very Large Database Conference (VLDB), Toronto, Canada, 2004.
Burleson, D. Physical Database Design Using Oracle. Boca Raton, FL: Auerbach Publishers, 2005.
Elmasri, R., and Navathe, S. B. Fundamentals of Database Systems, 4th ed. Boston: Addison-Wesley, 2004.
Garcia-Molina, H., Ullman, J., and Widom, J. Database System Implementation. Englewood Cliffs, NJ: Prentice-Hall, 2000.
Garcia-Molina, H., Ullman, J., and Widom, J. Database Systems: The Complete Book. Englewood Cliffs, NJ: Prentice-Hall, 2001.
2 Basic Indexing Methods
If you don't find it in the index, look very carefully through the entire catalogue.
—Sears, Roebuck, and Co., Consumer's Guide, 1897
The concept of indexing for dictionaries, encyclopedias, book manuscripts, catalogs, and address lists has been around for centuries. Methods like tabs in a book volume or lists of index terms at the back of the book are used very effectively. When a computer needs to find data within a relational database it has exactly the same need. How can a small amount of data be found from within a very large set without "looking very carefully through the entire catalogue"? For example, consider how inefficient life would be if every time you walked over to an ATM machine to withdraw money the bank's computer performed a linear search through possibly millions of customer records until it found the entry matching your bank account number. Large electronic files and databases have accelerated the need for indexing methods that can access a subset of the data quickly. In some cases the data is very small and can be accessed in a single input/output (I/O) operation.

In other cases data needs to be accessed in bulk and looked over by an analyst to determine which data is relevant to the current need. In the late 1970s, indexing for the earliest relational, hierarchical (like IMS), and CODASYL databases was done in a wide variety of ways: sequential, indexed sequential, hashing, binary search trees, B-trees, TRIE structures, multilist files, inverted files, and doubly chained trees, to name a few. Many options were allowed for programmers to choose from, and the choice of the best method for a particular database was a complex decision, requiring considerable formal training. For other types of databases—object-oriented, spatial, temporal, and so on—the list of potential indexing strategies (R-trees, for instance) goes on and will continue to evolve for the foreseeable future.
Fortunately, for relational databases, the B+tree has properties that span virtually all of the methods mentioned above and has become the de facto indexing method for the major relational database systems today. This chapter discusses the basic principles of indexing used in today's database systems to enhance performance.
2.1 B+tree Index
The B+tree is the primary indexing method supported by DB2, Oracle, and SQL Server. It features not only fast access, but also the dynamic maintenance that virtually eliminates the overflow problems that occur in the older hashing and indexed sequential methods. Let's take a look at a typical B+tree configuration in Figure 2.1. Each nonleaf index node consists of p tree pointers and p − 1 key values. The key values denote where to search to find rows that have either smaller key values, by taking the tree pointer to the left of the key, or greater or equal key values, by taking the tree pointer to the right of the key. Each leaf index node consists of a series of key and data-pointer combinations that point to each row. The leaf index nodes (and the associated data blocks) are connected logically by block pointers so that an ordered sequence of rows can be found quickly. The variable p represents the order of the B+tree: the fan-out of pointers from one node to the next lower node in the tree.
The height of a B+tree is the level of the leaf nodes, given that the root node is defined as level 1. In Figure 2.1, the order of the B+tree is four and the height is three. The intermediate nodes help the database find the desired leaf nodes with very few I/Os. However, what's stored within the leaf nodes is the real meat. The leaf nodes store three very important things: the key, a record identifier (RID), and a pointer to the next leaf. For example, if the index is defined on a CITY column, the keys of the index may include NEW YORK, BOSTON, TORONTO, and SAN JOSE. For each city the index will include an identifier of where to find each record in the table that matches the key. Consider a search through the index where the key value is 'NEW YORK'. If the table has 400 entries for NEW YORK, the index leaf node will include one key entry for NEW YORK and identifiers for where to find each of the 400 records in the base table. These identifiers are called record identifiers, or RIDs, and are discussed next. The key and the RIDs are the two most critical parts of the leaf content. Keys are stored within the leaf pages in sorted order. As a result, while B+tree indexes can be used by the DBMS to find a row or set of rows that match a single key, they can also be used to easily find a range of keys (i.e., a numeric range or alphabetic range), as well as implicitly return records in sorted order.
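As a sketch of the example above, using a hypothetical customer table, a single B+tree index supports exact-match lookups, range predicates, and ordered retrieval:

    CREATE INDEX custCity ON customer (city);

    SELECT custname
    FROM customer
    WHERE city = 'NEW YORK';        /* single-key lookup via the leaf RIDs */

    SELECT custname, city
    FROM customer
    WHERE city BETWEEN 'BOSTON' AND 'NEW YORK'
    ORDER BY city;                  /* range scan along the linked leaves,
                                       rows come back already sorted */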
Example: Determine the Order and Height of a B+tree
To determine the order of a B+tree, let us assume that the database has 500,000 rows of 200 bytes each, the search key is 15 bytes, the tree and data pointers are 5 bytes, and the index node (and data block) size is 1,024 bytes. For this configuration we have:

Nonleaf index node size = 1,024 bytes = p × 5 + (p − 1) × 15 bytes.

Solving for the order of the B+tree, p, we get:

p = floor((1,024 + 15)/20) = floor(51.95) = 51, (2.1)

where the floor function yields the next lower whole number, found by truncating the actual value to the next lower integer. Therefore, we can have up to p − 1, or 50, search key values in each nonleaf index node. In the leaf index nodes there are 15 bytes for the search key value and 5 bytes for the data pointer. Each leaf index node has a single pointer to the next leaf index node to make a scan of the data rows possible without going through the index-level nodes. In this example the number of search key values in the leaf nodes is floor((1,024 − 5)/(15 + 5)) = 50, which is the same number of search key values as in the nonleaf nodes.
The height h of a B+tree is the number of index levels, including the leaf nodes. It is computed by noting that the root index node (level 1) has p tree pointers, the second level has p^2 tree pointers, the third level has p^3 tree pointers, and so on. At the leaf level the number of key entries and pointers is p − 1 per index node, but a good approximation can be made by assuming that the leaf index nodes are implemented with p pointers and p key values. The total number of pointers over all nodes at the leaf level (level h) must be greater than or equal to the number of rows in the table (n). Therefore, our estimate becomes:

p^h ≥ n, that is, h = ceiling(log_p(n)), (2.2)

where the ceiling function rounds up to the next whole number. For our example, h = ceiling(log_51(500,000)) = ceiling(3.34) = 4.

Figure 2.1 B+tree configuration (order four, height three).
A query to a particular row in a B+tree is simply the time required to access all h levels of the tree index plus the access to the data row (within a block or page). All accesses to different levels of index and data are assumed to be random.

Read a single row in a table (using a B+tree)
= h + 1 block accesses. (2.3)

Updates of rows in a B+tree can be accomplished with a simple query and rewrite, unless the update involves an insertion that overflows a data or index node, or a deletion that empties a data or index node. A rewrite of a row just read tends to be a random block access in a shared disk environment, which we assume here. For the simple case of updating data values in a particular row, assume that each index node is implemented as a block.
Update cost for a single row (B+tree)
= search cost + rewrite data block
= (h + 1) + 1
= h + 2 block accesses, (2.4)

where a block access is assumed to be an access to an individual block (or page) in a shared disk environment.
If the operation desired is an insertion, and the insertion causes overflow of a data or leaf index node, additional accesses are needed to split the saturated node into two nodes that are half filled (using the basic splitting algorithm), plus the need to rewrite the next higher index node with a new pointer to the new index node (see Figure 2.2). The need for a split is recognized after the initial search for the row has been done. A split of a leaf index node requires a rewrite of the saturated leaf index node, half filled with data, plus a write of a new leaf index node also half filled, plus a rewrite of the nonleaf index node with a new pointer value to the new leaf index node.
Insert cost for a single row (B+tree)
= search cost + rewrite data block + rewrite index block
= (h + 1) + 1 + 1
= h + 3 block accesses, (2.5)

plus additional writes, if necessary, for splitting.
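Plugging in the 500,000-row example above (h = 4 from Equation 2.2), the three formulas give a quick feel for the scale involved:

Read = h + 1 = 5 block accesses.
Update = h + 2 = 6 block accesses.
Insert = h + 3 = 7 block accesses, plus any splitting.

The costs grow only with the height of the tree, which in turn grows logarithmically with the number of rows.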
Occasionally, the split operation of a leaf index node necessitates a split of the next higher index node as well, and in the worst case the split operations may cascade all the way up to the index root node. The probability of additional splits depends on the type of splitting algorithm and the dynamics of insertions and deletions in the workload. Database statistics can be collected to determine these probabilities.
Figure 2.2 Dynamic maintenance for a B+tree row insertion.