DATA MODELING FUNDAMENTALS
A Practical Guide for
IT Professionals
Paulraj Ponniah
Copyright © 2007 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers,
MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com.
Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011,
fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to
the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created
or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any
other commercial damages, including but not limited to special, incidental, consequential,
or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974,
outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data
10 9 8 7 6 5 4 3 2 1
To Daniel Arjun, my dear son-in-law
and to Reisha and Shoba, my dear daughters-in-law
I INTRODUCTION TO DATA MODELING 1
Chapter Objectives / 3
Data Model Defined / 4
What Is a Data Model? / 5
Why Data Modeling? / 6
Who Performs Data Modeling? / 9
Information Levels / 10
Classification of Information Levels / 11
Data Models at Information Levels / 13
Conceptual Data Modeling / 17
Data Model Components / 18
Data Modeling Steps / 20
Data Model Quality / 26
Significance of Data Model Quality / 27
Data Model Characteristics / 27
Ensuring Data Model Quality / 28
Data System Development / 29
Data System Development Life Cycle / 29
Roles and Responsibilities / 33
Modeling the Information Requirements / 33
Applying Agile Modeling Principles / 34
Data Modeling Approaches and Trends / 35
Data Modeling Approaches / 36
Modeling for Data Warehouse / 38
Other Modeling Trends / 39
Methods and Techniques / 47
Peter Chen (E-R) Modeling / 48
Information Engineering / 50
Integration Definition for Information Modeling / 51
Richard Barker’s Model / 53
Object-Role Modeling / 55
eXtensible Markup Language / 57
Summary and Comments / 60
Unified Modeling Language / 61
Data Modeling Using UML / 61
UML in the Development Process / 64
Chapter Summary / 68
Review Questions / 68
II DATA MODELING FUNDAMENTALS 71
Chapter Objectives / 73
Data Model Composition / 74
Models at Different Levels / 74
Conceptual Model: Review Procedure / 76
Conceptual Model: Identifying Components / 77
Identifiers / 101
Review of the Model Diagram / 103
Logical Model: Overview / 104
Identifying Entity Types / 120
Homonyms and Synonyms / 125
Category of Entity Types / 127
Exploring Dependencies / 130
Dependent or Weak Entity Types / 131
Classifying Dependencies / 132
Representation in the Model / 133
Generalization and Specialization / 134
Why Generalize or Specialize? / 136
Supertypes and Subtypes / 137
Entity Type Versus Attribute / 148
Entity Type Versus Relationship / 148
Modeling Time Dimension / 149
5 Attributes and Identifiers in Detail 157
Chapter Objectives / 157
Resolution of Mixed Domains / 168
Constraints for Attributes / 169
Single-Valued and Multivalued Attributes / 171
Simple and Composite Attributes / 171
Attributes with Stored and Derived Values / 172
Optional Attributes / 173
Identifiers or Keys / 175
Need for Identifiers / 175
Definitions of Keys / 175
Guidelines for Identifiers / 176
Key in Generalization Hierarchy / 177
Attribute Validation Checklist / 178
Maximum and Minimum Cardinalities / 204
Mandatory Conditions: Both Ends / 206
Optional Condition: One End / 206
Optional Condition: Other End / 207
Optional Conditions: Both Ends / 208
Relationship or Entity Type? / 215
Ternary Relationship or Aggregation? / 216
Binary or N-ary Relationship? / 216
III DATA MODEL IMPLEMENTATION 227
Chapter Objectives / 229
Relational Model: Fundamentals / 231
Basic Concepts / 231
Structure and Components / 233
Data Integrity Constraints / 238
Transition to Database Design / 242
Design Approaches / 243
Conceptual to Relational Model / 243
Traditional Method / 244
Evaluation of Design Methods / 245
Model Transformation Method / 246
Strengths of the Method / 277
Application of the Method / 277
Normalization Steps / 277
Fundamental Normal Forms / 278
First Normal Form / 278
Second Normal Form / 279
Third Normal Form / 281
Boyce-Codd Normal Form / 284
Higher Normal Forms / 285
Fourth Normal Form / 286
Fifth Normal Form / 287
Domain-Key Normal Form / 288
History of Decision-Support Systems / 297
Operational Versus Informational Systems / 299
System Types and Modeling Methods / 299
Data Warehouse / 301
Data Warehouse Defined / 301
Major Components / 302
Data Warehousing Applications / 305
Modeling: Special Requirements / 305
OLAP Implementation Approaches / 330
Data Modeling for OLAP / 332
Data Mining Systems / 334
Basic Concepts / 334
Data Mining Techniques / 338
Data Preparation and Modeling / 339
Data Preprocessing / 339
Data Modeling / 341
Chapter Summary / 342
Review Questions / 343
IV PRACTICAL APPROACH TO DATA MODELING 345
Chapter Objectives / 347
Significance of Quality / 348
Why Emphasize Quality? / 348
Good and Bad Models / 349
Approach to Good Modeling / 351
Checklists / 358
High-Quality Data Model / 360
Meaning of Data Model Quality / 360
Quality Dimensions / 361
What Is a High-Quality Model? / 363
Benefits of High-Quality Models / 364
Quality Assurance Process / 365
Aspects of Quality Assurance / 365
Stages of Quality Assurance Process / 366
Data Model Review / 369
Data Model Assessment / 370
Chapter Summary / 373
Review Questions / 373
Chapter Objectives / 375
The Agile Movement / 376
How It Got Started / 377
Principles of Agile Development / 378
Need for Flexibility / 386
Nature of Evolutionary Modeling / 386
Requirements: Model Interface / 400
Integration of Partial Models / 401
Conceptual Model Layout / 409
Readability and Usability / 409
Do you want to build a hybrid automobile? First, you need to create a model of the car. Do you want to build a mansion? First, you need to have blueprints and create a model of the dwelling. Do you want to build a spaceship? First, you need to design a miniature model of the vehicle. Do you want to implement a database for your organization? First, you need to create a data model of the information requirements.
Without a proper data model of the information requirements of an enterprise, an adequate database system cannot be correctly designed and implemented for the organization. A good data model of high quality forms an essential prerequisite for any successful database system. Unless the data modelers represent the information requirements of the organization in a proper data model, the database design will be totally ineffective.
The theme of this book is to present the fundamentals, ideas, and practices of creating good and useful data models: data models that can function effectively as tools of communication with the user community and as database blueprints for database practitioners.
THE NEED
In every industry across the board, from retail chain stores to financial institutions, from manufacturing enterprises to government agencies, and from airline companies to utility businesses, database systems have become the norm for information storage and retrieval. Whether it is a Web-based application driving electronic commerce or an inventory control application managing just-in-time inventory or a data warehouse system supporting strategic decision making, you need an effective technology to store, retrieve, and use data in order to make the application successful. It is no wonder that institutions have adopted database technology without any reservations.
In this scenario, the information technology (IT) department of every organization has a primary responsibility to design and implement database systems and keep them running. One set of special skills for accomplishing this relates to data modeling. Information technology professionals with data modeling skills constitute a significant group. Information technology professionals specializing in data modeling must be experts with a thorough knowledge of data modeling fundamentals. They must be well versed in the methodologies, techniques, and practices of data modeling.
ADDRESSING THE NEED
How can IT professionals desirous of acquiring data modeling skills learn the required techniques and gain proficiency in data modeling? Many seminar companies, colleges, and other teaching institutions offer courses in database design and development. However, such courses do not have data modeling as a primary focus. Very few courses, if any, concentrate just on data modeling. So, eager IT professionals are left with the choice of learning data modeling and gaining expert knowledge from books exclusively on this subject. How many such books should they read to learn the principles and concepts?
This book intends to be the one definitive publication to fulfill the needs of aspiring data modelers, of those experienced data modelers desiring to have a refresher, and even of expert data modelers wishing to review additional concepts. In this volume, I have attempted to present my knowledge and insights acquired through three decades of IT consulting, through many years of teaching data-related subjects in seminar and college environments, and through graduate and postgraduate levels of studies. I do hope this experience will be of use to you.
WHAT THIS BOOK CAN DO FOR YOU
Are you a novice data modeler? Are you fairly new to data modeling but aspire to pick up the necessary skills? Alternatively, are you a practicing data modeler with experience in the discipline? Are you a generalizing specialist, meaning that you want to add data modeling as another skill to your arsenal of IT proficiency? Irrespective of the level of your interest in data modeling, this is the one book that is specially designed to cover all the essentials of data modeling in a manner exactly suitable for IT professionals. The book takes a practical approach in presenting the underlying principles and fundamentals, augmenting the presentation with numerous examples from the real world.
The book begins in Part I with a broad overview of data modeling. In Chapter 1, you are introduced to all the essential concepts. Before proceeding into further details, you need to familiarize yourself with the data modeling techniques. Chapter 2 explores the leading techniques: the approaches, the symbols, the syntax, the semantics, and so on.
Part II of the book presents the fundamentals in great detail. It does not matter what your knowledge level of data modeling is. You will find this part interesting and useful. You are presented with a real-world case study with a completed data model. You are asked to study the anatomy of the data model and understand how the actual design and creation of the data model works. Part II also digs deeper into individual components of a data model with several real-world examples.
In Part III, you will learn the transition from data model to database design. In recent times, decision-support systems have come to the forefront of computing. Part III describes decision-support systems such as data warehousing and data mining and guides you through data modeling methods for these systems. This is essential knowledge for modern data modelers.
In Part IV of the book, you will find a chapter exclusively devoted to quality in the data model. Every data modeler aspires to create a model of the highest quality. This chapter is required reading. A new wave known as agile software development is on the rise, producing great benefits. You will learn about this movement and gain insights into agile data modeling, its principles and practices.
Finally, are you looking for practical suggestions on data modeling distilled from years of experience of many practitioners? If so, the final chapter is for you. The book aptly concludes with such a chapter filled with numerous practical tips and suggestions.
PAULRAJ PONNIAH
Milltown, New Jersey
April 2007
I must also record my gratitude to the several professional colleagues who worked with me on various data modeling and database projects during my long IT consulting career. Also, thanks are due to the many students in my data modeling and database classes over the years. Interactions with my colleagues and students have shaped this book in a format especially suitable for the needs of IT professionals.
INTRODUCTION TO DATA MODELING
DATA MODELING:
AN OVERVIEW
CHAPTER OBJECTIVES
Introduce the process of data modeling
Present why data modeling is important
Explain how a data model represents information requirements
Describe conceptual, logical, and physical data models
Briefly discuss the steps for building a data model
Show the role of data modeling in system development
Provide an initial glimpse of data modeling history and trends
James Watson and Francis Crick, working at Cambridge University, deduced the three-dimensional structure of DNA (deoxyribonucleic acid). In 1953, they published a brief paper describing their now-famous double helix model of DNA. This important milestone of creating a true model of DNA gave a tremendous boost to biology and genetics. For the discovery and creation of the double helix model, Watson and Crick shared the Nobel Prize in Physiology or Medicine in 1962.
Well, what does Watson and Crick's achievement have to do with our current study? Essentially, they built a model. The model is a true representation of the structure of DNA, something we find in the real world. Models are replicas or representations of particular aspects and segments of the real world. Building of models is quite common in many disciplines. When you think about it, the representation "5 + 4 = 9" is a mathematical model using symbols and logic. This model represents the fact that if you put five things together with four things of the same kind, you get nine things of the same kind. In physics, we create models to represent physical properties of the world. In economics, we create models of economic trends and forecast economic outcomes.