Laura L. Reeves
A Manager’s Guide to
Data Warehousing
Wiley Computer Publishing Timely Practical Reliable.
An ideal guide for the non-technical professional eager to learn more about data warehousing

A successful data warehouse project can provide immense value for business enterprises or other organizations. Building and maintaining a data warehouse demands the combined efforts of both IT and non-technical personnel. While there are plenty of resources aimed at the technology professionals who design and build data warehouses, there has to date been no useful guide written for a non-technical audience. This book fills that void and serves as an ideal resource for business and IT managers and others from the non-IT side who want to do their part to ensure data warehousing success.

This helpful book provides a solid introduction to the fundamentals of data warehousing. The author details each step of a data warehouse project, and provides a clear explanation of what’s involved in efficiently building a data warehouse and what must be done to deliver the data. You’ll examine the business management of a data warehouse and discover essential methods for cultivating a strong partnership between the business and IT elements of your organization. You can use this knowledge to be more effective when sharing your requirements and concerns during a project.

A Manager’s Guide to Data Warehousing explains what you need to create your data warehouse and establish long-term success. The book covers:
• The most common factors for ensuring data warehousing success and the roadblocks that can prevent it
• How to ensure that business and technical staff have a common understanding of the data warehouse project
• The tools you need to make certain that data is organized and can be delivered as needed
• Ways to deploy the data warehouse and ensure sustainable success

LAURA L. REEVES, coauthor of The Data Warehouse Lifecycle Toolkit, has over 23 years of experience in end-to-end data warehouse development focused on developing comprehensive project plans, collecting business requirements, designing business dimensional models and database schemas, and creating enterprise data warehouse strategies and data architectures.

Database/Data Warehousing
ISBN: 978-0-470-17638-2
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Published simultaneously in Canada
ISBN: 978-0-470-17638-2
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.
For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Library of Congress Cataloging-in-Publication Data
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Laura L. Reeves started designing and implementing data warehouse solutions in 1986. Since then she has been involved in hundreds of projects. She has extensive experience in end-to-end data warehouse development, including developing comprehensive project plans, collecting business requirements, developing business dimensional models, designing database schemas (both star and snowflake designs), and developing enterprise data warehouse architecture and strategies. These have been implemented for many business functions for private and public industry.

Laura co-founded StarSoft Solutions, Inc., in 1995 and has been a faculty member with The Data Warehousing Institute since 1997. She is a contributing author of Building a Data Warehouse for Decision Support (Prentice Hall, 1996) and a co-author of the first edition of The Data Warehouse Lifecycle Toolkit (Wiley, 1998). Laura graduated magna cum laude from Alma College with a bachelor of science degree in mathematics and computer science, with departmental honors.
I have been very blessed with great family, friends, and colleagues. I would like to thank the many clients and colleagues who have challenged me, pushed me, and collaborated with me on so many initiatives over the years. I appreciate the opportunity to work with such high-quality people. I want to acknowledge the contributions that have been made to the data warehousing industry and to me personally by the amazing people who worked at Metaphor. I want to express my gratitude to my dear friend and colleague Paul Kautza for his belief in me and for all his hard work all these years.

Thanks are also due to the dedicated staff at Wiley who believed in me and had great patience to help see this project through. Thanks to Bob Elliott for being the impetus to get this project started and to Sara Shlaer and Rosanne Koneval for their detailed efforts to produce a quality product. I want to express appreciation to Cindi Howson for her insight on business intelligence tools.

I want to extend a sincere and special thank you to Jonathon Geiger for his meticulous comments and suggestions.

I also want to thank two very special people who have provided unflagging support and encouragement every step of the way: my friends Ingrid Korb and Paula Johnson. I am not sure I could have done this without you!

Of course, none of this would be possible without the dedication, sacrifice, love, and support given to me by my family: Mark, Ryan, Michael, and Leah.
Introduction
The Essentials of Data Warehousing
Differences Between Operational and DW Systems
The Data Warehousing Environment
Understanding Industry Perspectives
Design and Development Sequence
The Value of Data Warehousing
The Promises of Data Warehousing
Believing the Myth: “If You Build It, They Will Come”
Falling into the Project Deadline Trap
Failing to Uphold Organizational Discipline
Lacking Business Process Change
Narrowing the Focus Too Much
Relying on the Technology Fix
Getting the Right People Involved
Finding Lost Institutional Knowledge
Chapter 2 The Executive’s FAQ for Data Warehousing
Question: What is the business benefit of a data warehouse?
Question: How do we get started and stay focused?
Part Two The Business Side of Data Warehousing
Chapter 3 Understanding Where You Are and Finding Your Way
What Is Your Company’s Strategic Direction?
What Are the Company’s Top Initiatives?
Does the Business Place Value on Analysis?
Reflecting on Your Data Warehouse History
Understanding Your Existing Reporting Environment
Finding the Reporting Systems
Identifying the Business Purpose
Discovering the Data You Already Have
Tracking Technology and Tools
Understanding Enterprise Resources
The Call Center Data Warehouse Project
What a Partnership Really Means
What the Business Partners Should Expect to Do
Business Executives and Senior Management
The Executive Business Sponsor
Helping the Business Analyst Deal with Change
ETL Developer(s)
Business Intelligence Application Developer
Tips for Building and Sustaining a Partnership
Leveraging External Consulting
Building Strong Project Teams
Presenting in Business Terms
Executive Steering Committee
Setting Up the Project Charter
Developing a Statement of Work
Peeling Back the Layers of Requirements Gathering
Who Provides Input?
Who Gathers the Requirements?
Providing Business Requirements
Systems and Technical Requirements
Communicating What You Really Need
What Else Would Help the Project Team?
Data Integration Challenges
Assess Organizational Motivation
Complete Picture of the Data
Practical Techniques for Gathering Requirements
Interview Session Characteristics
Preparing for Interview Sessions
Conducting the Interview Sessions
Capturing Content: Notes vs. Tapes
Individual Interview Documentation
Developing Functional Specifications
Setting Attainable Goals
A Glimpse into Giant Company
The Purpose of Dimensional Models
Using Both Parts of the Model
Implementing a Dimensional Model
Diagramming Your Dimensional Model
The Business Dimensional Model
Call Center Time Tracking Fact Group
Business Dimensional Model Index
Guidelines for a Single Fact Group
Characteristics of the Model across the Enterprise
Business Participation in the Modeling Process
Preparing for Modeling Sessions
Brainstorming the Framework
Drafting the Initial Dimensions
Drafting the Initial Fact Groups
Logging Questions and Issues
Building the Business Measures Worksheet
Preliminary Source to Target Data Map
Completing or Fleshing Out the Model
Completing the Documentation
Working Through All the Data Elements
Business Reviews of the Model
Expanding Business Data Over Time
Reflecting on Business Realities: Advanced Concepts
Supporting Multiple Perspectives: Multiple Hierarchies
Tracking Changes in the Dimension: Slowly Changing
Depicting the Existence of a Relationship: Factless Fact Tables
Linking Parts of a Transaction: Degenerate Dimensions
Pulling Together Components: Junk Dimensions
Multiple Instances of a Dimension: Role Playing
Clusters of Future Attributes
Translating the Business Dimensional Model
Chapter 8 Managing Data As a Corporate Asset
What Is Information Management?
Information Management Example—Customer Data
IM Beyond the Data Warehouse
Master Data Feeds the Data Warehouse
Your Responsibilities If You Are “the Owner”
What Are IT’s Responsibilities?
Challenges with Data Ownership
How Clean Does the Data Really Need to Be?
Managing the Integrity of Data Integration
Quality Improves When It Matters
Example: Data Quality and Grocery Checkout
Develop a Realistic Strategy
Sharing the Information Management Strategy
Setting Up a Sustainable Process
The Data Governance Committee
In Real Life
Chapter 9 Architecture, Infrastructure, and Tools
Components of DW Data Architecture
A Closer Look at Common Data Warehouse Architectures
Bottom-Up Data Architecture
Publish the Data: Data Marts
Requests for Information or Proposals
Business Participation in the Selection Process
Understanding Product Genealogy
Understanding Value and Evaluating Your Options
Cutting through the Marketing Hype
Making Architecture Work for You
Chapter 10 Implementation: Building the Database
Extract, Transform, and Load (ETL) Fundamentals
Why Does the Business Need to Help?
Defining Expected Results—The Test Plan
Testing the ETL System—Is the Data Right?
Why Does It Take So Long and Cost So Much?
Balancing Requirements and Data Reality
Discovering the Flaws in Your Current Systems
Working Toward Long-Term Solutions
Manually Including Business Data
Tracking Progress—Are We There Yet?
Ensuring Continued Business Participation
What Is Business Intelligence?
Business Intelligence without a DW
Presentation—How Do You Want to See Results?
Delivery—How Do You Receive the Results?
Supporting Different Levels of Use
Construction of the BI Solution
Planning for Business Change
Design—What Needs to Be Delivered?
Learning to Use the Data without a Technical Degree
Learning about the BI Tool/Application
Ensuring That the Right Help Is Available
Chapter 12 Managing the Production Data Warehouse
Recapping the BI Application Launch
Looking Back—Did You Accomplish Your Objectives?
Getting the Rest of the Business Community on Board
Streamlining Business Processes
Staffing Production Activities
Monitoring Performance and Capacity Planning
Maintaining the Data Warehouse
Maintaining the BI Application
Tracking Questions and Problems
When the Data Warehouse Falls Short
Common Causes for a Stalled Warehouse
Jump-Starting a Stalled Data Warehouse
Determining What Can Be Salvaged
Developing a Plan to Move On
Aligning DW Objectives with Business Goals
Launching the Improved Data Warehouse and BI Solution
Lack of Support for the Production DW at Giant Co.
Unleashing BI at Agile, Inc.
Planning for Expansion and Growth
Exploring Expansion Opportunities
Managing Enterprise DW Resources
Creating an Enterprise Data Warehouse Team
The Centralized Enterprise Data Warehouse Team
The Virtual Enterprise Data Warehouse Team
Enterprise DW Team Responsibilities
Funding the Enterprise DW Team
Embedded Business Intelligence
Operational Business Intelligence
Monitoring Industry Innovation
Measuring Success One Step at a Time
Adjusting Expectations to Reality
Many executives, managers, business analysts, and nontechnical personnel are highly motivated to learn more about data warehousing. They want to understand what data warehouses are and how they work. More important, many are truly interested in doing their part to ensure success when implementing a data warehouse in their company. They are not interested in learning how to write code or tune a database.

Unfortunately, most data warehouse publications available today are written for the people who design and build them. Some are from a project management perspective and others provide a great deal of technical depth. While these are very valuable to the technical team, they do not help the nontechnical audience. This book was written to provide a resource for those nontechnical people.

Overview of the Book
The information in this book has been gathered over years of working on data warehouse projects. Hundreds of hours have been invested in learning what works well and what does not. One constant thread over the years is the need to develop and strengthen the partnership between business and systems personnel. There has always been a need to help nontechnical people in the organization understand the different parts of a data warehouse and what needs to be done to build and maintain one.

This book covers the topics and questions that come up repeatedly in executive briefings, classes, meetings, and casual conversation. It also includes coverage of topics related to how organizations frequently get into trouble. The goal is to minimize the frustration of all participants and to ultimately help organizations to build and maintain valuable data warehouse environments.

The book provides a sound introduction to data warehousing concepts and then moves on to explain the process of developing and creating a data warehouse project. A description of each step is provided, including details about how the business should participate. This book does not provide the technical level of detail that IT team members need to know, such as specific coding techniques or how to best set parameters to improve performance with a specific technology. The book does provide what is needed to understand these areas so that you are better prepared to have clear communication across the organization. By knowing what is being designed or developed, you can be more effective in sharing your ideas, requirements, and concerns.

More technical readers will benefit by gaining a complete picture of data warehousing, with an emphasis on how representatives from the business community can help you be more successful. This will provide you with ideas about how to interact more clearly with the business community. Use this as a resource; highlight the most pertinent chapters or sections that would be beneficial for your business counterparts to read.
How This Book Is Organized
This book is divided into five major parts. Each part focuses on a specific aspect of data warehousing.

The first part shares the essentials of data warehousing. This can help executives and managers get a realistic understanding of the fundamentals and gain insight into both the most common factors for success and how to avoid roadblocks.

The second part of the book takes a look at the business management of a data warehouse. This includes how to cultivate a strong partnership between the business and IT communities, what is entailed in setting up a data warehouse project, and how to effectively communicate your business requirements for the data warehouse. It is just as important for different business groups to work together in order to support a consistent view of the data.

The third part of the book focuses on the data itself. How the data is organized is critical to ensure that you, the business, can understand and exploit what is available. Without the right data, presented in a useful manner, the data warehouse will not be used, which jeopardizes the entire investment. This section of the book gives you the tools you need to ensure that the data meets your needs. Beyond how data is processed or stored, the issues surrounding how organizations view data ownership and management are also addressed.

Part four delves into data warehouse architecture, what is involved to efficiently build the data warehouse, and finally what must be done to deliver the data. While most of the work during this part of a project is more technical in nature, there is still a need for active business participation. This section will help managers and business participants understand what work is being done and how they can help.

Part five explores what is needed to launch the data warehouse and wrap up the project. Looking beyond a single project, the work needed to maintain and grow the data warehouse is discussed. The causes of, and recommendations to address, a stalled data warehouse are presented. A data warehouse can have initial success that decreases over time unless specific steps are taken. Suggestions are provided that can help you sustain your success.
Who Should Read This Book
The target audience for this book is anyone who is responsible for, working on, or paying for a data warehouse. In particular, it is targeted toward nontechnical readers in order to help them understand the basics of data warehousing, and, more importantly, how to successfully build one. For the more technical readers, this book provides the tools you need in order to be able to communicate with your business partners.

The information is presented in a layered manner. The basic concepts are presented early in the book, while more in-depth coverage is provided in later chapters. This book is designed to serve a variety of levels of need. Some readers may benefit from reading only the first part, returning later to learn more about a specific area. Other readers may benefit by reading all of the content from start to finish.

While all readers could benefit from reading the entire text, I realize that not everyone has the same passion for data warehousing. Realistically, different types of readers will benefit from various parts of the book:
Executives and senior managers should read Part 1. Then, based upon what the organization is facing, subsequent chapters may be worthwhile as different issues crop up. For example, if your company has problems getting the data loaded into the data warehouse, it would be helpful to read Chapter 10 to get a better understanding of what is really involved in building the database.

Middle managers, both business and IT, will benefit from Part 1 but will also find Part 2 to be important. These parts help you get the project set up properly and, most important, learn about how to provide business requirements. Skimming the rest of the book can help all managers learn what is available to help them with the rest of the project. Then, as needed, the specific chapters can be studied in detail when the organization is working on that part of a data warehouse initiative.

Business personnel involved with a data warehouse can skim the chapters on setting up a project, but will find it helpful to study in depth Chapter 6 to learn about providing requirements, Chapter 7 to understand how the data should be organized, and Chapter 11 to learn about how the data can be delivered. It is recommended that the rest of the book be reviewed to familiarize yourself with the content. Specific chapters can be studied in more detail when the organization is working on that area.

Everyone on the data warehouse project team should read this entire book. It can provide a common ground for dialogue and more meaningful discussions between the business and technical personnel. The project team can provide this book to their business counterparts and suggest specific chapters for them to read to better support the project.

Technical staff can also benefit from this book, which can help anyone with in-depth technical knowledge to communicate with their managers and business counterparts about data warehousing. This provides the backdrop for all interactions with the business community.
When designed and implemented properly, a data warehouse is a valuable tool for an organization to improve how it runs. Getting the right design requires active participation of knowledgeable business personnel. Keeping things on track requires the support of middle managers to ensure that everything is progressing smoothly. Executives and senior managers will be able to ask meaningful questions about data warehouse projects and be able to understand the answers. All of these things require that business and technical staff have a common understanding of the different parts of a data warehouse and what is involved in a successful project. It is hoped that this book provides the foundation you and your organization need to achieve success with all of your data warehouse initiatives.
The Essentials of Data Warehousing

In This Part
Chapter 1: Gaining Data Warehouse Success
Chapter 2: The Executive’s FAQ for Data Warehousing
There is often a disconnect between the technical side that builds and maintains a data warehouse and the business side that will use it. This book will help bridge that gap. Both business managers and IT managers will learn what is involved with building and deploying a successful data warehouse. Executives and senior managers will also find this book helpful, especially Part 1, in order to be able to provide effective oversight and support. This book will also be beneficial for all business and technical personnel involved with a data warehouse, providing a common foundation for better communication. Managers on both sides need the knowledge and information that will enable them to help their organization build and use a data warehouse most effectively, and this book is the path to that knowledge.

This chapter explains the value of a data warehouse and highlights what is needed for success. To help frame the discussion, the chapter begins with some definitions.
The Essentials of Data Warehousing
Data warehousing is not new. Most large organizations have been investing in data warehousing for years. Currently, cost-effective technology is creating more possibilities for small and medium-size companies to build and deploy data warehouse solutions too. There are many stories about wild successes, and just as many about failed projects. With so much buzz about data warehousing, it is often assumed that everyone already knows the basics. However, many people are being exposed to these concepts for the first time. To ensure a common understanding, it is worth taking the time to boil things down to the essence of data warehousing.
What Is a Data Warehouse?
A data warehouse (DW) is the collection of processes and data whose overarching purpose is to support the business with its analysis and decision-making. In other words, it is not one thing per se, but a collection of many different parts. Before looking more closely at the specific parts of a data warehouse environment, it is helpful to compare the characteristics and purpose of a data warehouse with an operational application system.
Differences Between Operational and DW Systems
Applications that run the business are called online transaction processing systems (OLTPs). OLTP systems are geared toward functions such as processing incoming orders, getting products shipped out, and transferring funds as requested. These applications must ensure that transactions are handled accurately and efficiently. No one wants to wait minutes to get cash from an automated teller machine, or to enter sales orders into a company’s system.

In contrast, a data warehousing environment is designed to provide data in a format easily understood by the business community in order to support decision-making processes. The data warehouse supports looking at the business data over time to identify significant trends in buying behavior, customer retention, or changes in employee productivity. Table 1-1 lays out the primary differences between these two types of systems.
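The contrast between the two kinds of work can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the orders table, its columns, and the sample customers and dates are hypothetical, and a single in-memory SQLite table stands in for what would normally be separate operational and data warehouse databases.

```python
import sqlite3

# One in-memory table stands in for both systems (an assumption for brevity).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, order_date TEXT, customer TEXT, amount REAL)"
)


def record_order(order_date, customer, amount):
    """OLTP-style work: capture one transaction quickly and accurately."""
    with conn:  # commits on success, rolls back if the insert fails
        cur = conn.execute(
            "INSERT INTO orders (order_date, customer, amount) VALUES (?, ?, ?)",
            (order_date, customer, amount),
        )
    return cur.lastrowid


def monthly_sales():
    """DW-style work: look across many transactions over time."""
    rows = conn.execute(
        "SELECT substr(order_date, 1, 7) AS month, SUM(amount) "
        "FROM orders GROUP BY month ORDER BY month"
    )
    return rows.fetchall()


# Hypothetical sample transactions.
record_order("2024-01-15", "Acme", 100.0)
record_order("2024-01-20", "Binford", 50.0)
record_order("2024-02-03", "Acme", 75.0)
print(monthly_sales())
```

The first function touches one row at a time, which is what an order-entry screen needs. The second reads every row to produce a trend, which is the kind of question a data warehouse is organized to answer.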
The inherent differences between the functions performed in OLTP and DW systems result in methodology, architecture, tool, and technology differences. Data warehousing emerged as an outgrowth of necessity, but has blossomed into a full-fledged industry that serves a valuable function in the business community.

Now that the differences between data warehouse and OLTP systems have been reviewed, it is time to look deeper into the makeup of the data warehouse itself.
The Data Warehousing Environment
There are many different parts of a data warehouse environment, which encompasses everything from where the data lives today through where it is ultimately used on reports and for analysis. Each of the main parts of the data warehousing environment, shown in Figure 1-1, is described in the following sections. This figure indicates how the data flows throughout the environment.
Table 1-1 Comparison of Online Transaction Processing and Data Warehousing Systems

AREA OF COMPARISON | OLTP | DATA WAREHOUSING
System purpose | Support operational processes | Support strategic analysis, performance, and exception reporting
Data usage | Capture and maintain the data | Exploit the data
Data validation | Data verification occurs upon |
 | | Data is updated by periodic, scheduled processes
Source systems, shown on the left side of Figure 1-1, are where data is created or collected by operational application systems that run the business. These are often large applications that have been in place for a long time. Examples of source systems include the following:

community. The entire process is referred to as the extract, transform, and load (ETL) process.
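For readers who want to see the three ETL steps in concrete terms, the following is a minimal Python sketch, not a description of any particular ETL tool. The source rows, field names, cleanup rules, and the sales table are all hypothetical, chosen only to show data being pulled from a source, reshaped, and loaded into a reporting database.

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system.
SOURCE_ROWS = [
    {"cust": " acme corp ", "amt": "100.50", "date": "01/15/2024"},
    {"cust": "Binford", "amt": "50", "date": "01/20/2024"},
]


def extract():
    """Extract: pull the raw rows out of the source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: clean names, convert amounts, and standardize dates."""
    cleaned = []
    for row in rows:
        month, day, year = row["date"].split("/")
        cleaned.append(
            (
                row["cust"].strip().title(),
                float(row["amt"]),
                f"{year}-{month}-{day}",
            )
        )
    return cleaned


def load(conn, rows):
    """Load: write the prepared rows into the reporting database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    with conn:  # commit all rows together
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))
print(warehouse.execute("SELECT * FROM sales").fetchall())
```

Real ETL work is dominated by the transform step, where business rules about what the data means have to be agreed on, which is one reason business participation matters even in this technical stage.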
The database in which the data is organized to support the business is called a data mart. A data mart includes all of the data that is loaded into a single database and used together for analysis. Data marts are often developed to meet the needs of a business group such as marketing or finance. The key to a successful data mart is to create it in an integrated manner. It is also recommended that data be loaded into only one data mart and then shared across the organization to ensure data consistency.

Finally, an application or reporting layer is provided to facilitate access and analysis of the data. This is where business users access reports, dashboards, and analytical applications. Collections of these reports and analyses are called business intelligence.
There is one more critical concept that warrants some attention: the mechanism used to help organize data, which is called a data model.
What Is a Data Model?
A data model is an abstraction of how individual data elements relate to each other. It visually depicts how the data is to be organized and stored in a database. A data model provides the mechanism for documenting and understanding how data is organized.

There are many different types of data modeling, each with a specific goal and purpose. As organizations modified how data was structured to support reporting and analysis, a new data modeling technique, now called dimensional modeling, emerged. Ralph Kimball, a pioneer in data warehousing, can be credited with crystallizing these techniques and publishing them for the benefit of the industry. (For more information about dimensional modeling, see Chapter 7.)
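As a rough illustration of what a dimensional model looks like once it is implemented in a database, the Python sketch below builds a tiny star-shaped layout: one table of measurements surrounded by descriptive tables. The table names (fact_sales, dim_date, dim_product), columns, and sample values are hypothetical and are not taken from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables hold the descriptive context used to slice the data.
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);

    -- The fact table holds the measurements, keyed by the dimensions.
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        amount REAL
    );

    INSERT INTO dim_date VALUES (1, '2024-01'), (2, '2024-02');
    INSERT INTO dim_product VALUES (10, 'Tools'), (20, 'Paint');
    INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 20, 40.0), (2, 10, 60.0);
    """
)


def sales_by(attribute):
    """Total the sales amount by one descriptive attribute."""
    if attribute not in {"month", "category"}:
        raise ValueError(f"unknown attribute: {attribute}")
    query = (
        f"SELECT {attribute}, SUM(f.amount) "
        "FROM fact_sales f "
        "JOIN dim_date d ON d.date_key = f.date_key "
        "JOIN dim_product p ON p.product_key = f.product_key "
        f"GROUP BY {attribute} ORDER BY {attribute}"
    )
    return conn.execute(query).fetchall()


print(sales_by("month"))
print(sales_by("category"))
```

The same measurements can be summarized by month or by product category without restructuring anything, which is the practical appeal of organizing data this way for reporting and analysis.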
The data and processes to perform the work shown in Figure 1-1 are collectively called the data warehouse. These basic concepts have been fine-tuned and relabeled by many different players in the data warehousing field. The two most common philosophies are discussed in the next section.

Understanding Industry Perspectives
At the end of the day, everyone faces the same challenge: getting the data into the hands of the business user to turn it into information that can be used to make decisions. The definitions provided so far provide the backdrop for how terms are used in this book. There are many brilliant and talented people in the data warehousing industry, many of whom have different philosophies about how to design and build a data warehouse. It is worthwhile to look more closely at two of the most frequently used philosophies.

The first is from Ralph Kimball and colleagues, as described in The Data Warehouse Lifecycle Toolkit, Second Edition (Wiley, 2008). Ralph Kimball is a clear thought leader in the data warehousing industry and has written several books that provide detailed information essential for practitioners. That book describes the enterprise data warehouse as:
The complete end-to-end data warehouse and business intelligence system (DW/BI System). Although some would argue that you can theoretically deliver business intelligence without a data warehouse and vice versa, that is ill-advised from our perspective. Linking the two together in the DW/BI acronym reinforces their dependency. Independently, we refer to the queryable data in your DW/BI system as the enterprise data warehouse, and value-add analytics as BI (business intelligence) applications.
A second definition worth looking at is from Bill Inmon, a prolific author
and another leader in the data warehousing industry, from Building the Data
Warehouse, Fourth Edition (Wiley, 2005):
The data warehouse is a collection of integrated subject-oriented databases
designed to support the DSS (decision support system) function, where each unit
of data is relevant to some moment in time. The data warehouse contains atomic
data and lightly summarized data.
Bill’s definition is also described and expanded by Claudia Imhoff and
colleagues in Mastering Data Warehouse Design (Wiley, 2003):
It [the DW] is the central point of data integration for business intelligence and is the source of data for the data marts, delivering a common view of enterprise data.
This second viewpoint is incomplete without also including their definition
of a data mart Again, expanding on Bill Inmon’s definition, Claudia states in
Mastering Data Warehouse Design:
A data mart is a departmentalized structure of data feeding from the data warehouse where data is denormalized [organized] based on the department’s need for information. It utilizes a common enterprise view of strategic data and provides business units with more flexibility, control, and responsibility. The data mart may or may not be on the same server or location as the data warehouse.
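As a rough illustration of what "denormalized" means in this definition, the following Python sketch builds a tiny in-memory database. The table names, column names, and figures are all hypothetical and do not come from either book. Three warehouse-style tables are flattened into one wide table for an imaginary sales department, so that a departmental question can be answered without joining tables.

```python
import sqlite3

# Hypothetical warehouse tables; every name and value here is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY,
                           customer_name TEXT, region TEXT);
    CREATE TABLE product  (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE sale     (sale_id INTEGER PRIMARY KEY, customer_id INTEGER,
                           product_id INTEGER, amount REAL);

    INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
    INSERT INTO product  VALUES (10, 'Widget'), (20, 'Gadget');
    INSERT INTO sale     VALUES (100, 1, 10, 250.0),
                                (101, 1, 20, 100.0),
                                (102, 2, 10, 75.0);

    -- The assumed "data mart": one wide, denormalized table fed from the
    -- warehouse tables above and shaped for a single department.
    CREATE TABLE sales_mart AS
    SELECT s.sale_id, c.customer_name, c.region, p.product_name, s.amount
    FROM sale s
    JOIN customer c ON c.customer_id = s.customer_id
    JOIN product  p ON p.product_id  = s.product_id;
""")


def sales_by_region(connection):
    """Answer a departmental question from the mart alone, with no joins."""
    rows = connection.execute(
        "SELECT region, SUM(amount) FROM sales_mart GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


print(sales_by_region(conn))  # {'East': 350.0, 'West': 75.0}
```

In this sketch the mart lives in the same database as the warehouse tables. As the definition notes, it could just as well sit on a different server.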
To bring this second viewpoint into the proper context, Mastering Data
Warehouse Design further defines business intelligence:
Business intelligence is the set of processes and data structures used to analyze
data and information used in strategic decision support. The components of Business Intelligences are the data warehouse, data marts, the DSS (decision support system) interface and the processes to ‘get data in’ to the data warehouse and to ‘get information out’.
The single definition provided by Ralph Kimball is comprehensive. You must look at the full set of definitions set forth by Bill Inmon and Claudia Imhoff to fully understand their perspective. There is much more common ground between these differing philosophies than perceived at first glance. While there are distinct differences, the common theme is that data warehousing must provide the method to prepare and deliver data to the business community to support reporting and analysis. Chapter 9 provides a comprehensive discussion about these different approaches to data warehousing.
The key point here is that there are multiple ways these terms can be interpreted. Understanding which definition is being used is critical to being able to understand what is being discussed and worked on. An organization can avoid confusion by selecting one set of definitions to be used, which enables everyone to use a common language.
Regardless of labels and terminology, all data warehouse initiatives are trying to accomplish the same thing. Now that the basic parts of the data warehouse have been defined, it is time to look at the order in which they are created.
Design and Development Sequence
Earlier in this chapter, you looked at how data flows through the data warehouse environment. While this correctly illustrates how data flows in the completed environment, this is not the recommended sequence for designing and developing a data warehouse. A better way to design the environment is to start from the business user perspective. Figure 1-2 shows the correct order to successfully design and implement a data warehousing environment. Both the technical and business team members play a role throughout. Chapter 4 describes the different roles and responsibilities. Each step in the design process is described as follows:
1. An understanding of what the business is trying to accomplish and how success is measured should be the foundation for all data warehousing initiatives. The starting point for designing the data warehouse is with the business community. Chapter 6 covers what you need to know to effectively provide business requirements.
2. Once the business requirements are understood, the data in the underlying source systems needs to be studied. Many business people have a vision for what they want to do, but it is not always tied to the reality of the organization’s actual data. In preparation for modeling data, Chapter 7 introduces techniques to help you understand your data.
[Figure 1-2 Optimal data warehouse design and development sequence. Diagram labels: Business Question or Problem; Source System Data; Data Organized to Support the Business; Input to Data Delivery Design; Design & build processes.]
3. The foundation for successful data warehousing, now and into the future, is properly structuring the data. Data must be organized to support the business perspective. This provides ease of use and improved query performance. This design is created based on a knowledge of the business requirements, as well as the reality of the existing data. Chapter 7 focuses on how the business and technical team members can work together to develop appropriate data models for this data delivery layer.
4. After defining how the data will be organized, the design for getting the data from the source systems to the database can be created. Decisions about the architecture and tools needed to prepare the data can be made in the proper context. Too often these decisions are made before you know what is to be delivered. Chapter 9 provides the background needed to understand data warehouse architecture and technology. Chapter 10 deals with the challenges of preparing the data for business reporting and analytical use.
5. While the data is being prepared, the data access and application layer can be designed. This includes the design of basic reports, business intelligence and analytical applications, and performance dashboards or other end-user tools. Chapter 11 focuses on final delivery of data to the business community.
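One way to picture the sequence above is as a set of dependencies between design steps. The Python sketch below is only an illustration: the step labels paraphrase the five steps, and the dependency map is an assumption based on this reading of the text (for example, that the data access layer in step 5 needs the data model from step 3 but not the finished preparation design from step 4). It uses the standard library's `graphlib` module, available in Python 3.9 and later, to derive a valid design order.

```python
from graphlib import TopologicalSorter

# Each design step maps to the steps it is assumed to depend on.
# Labels paraphrase the numbered steps; the dependencies are an interpretation.
DESIGN_DEPENDENCIES = {
    "1. business requirements": set(),
    "2. source data analysis": {"1. business requirements"},
    "3. data delivery model": {
        "1. business requirements",
        "2. source data analysis",
    },
    "4. data preparation design": {"3. data delivery model"},
    "5. data access and applications": {"3. data delivery model"},
}


def design_order(dependencies):
    """Return one valid order in which the design steps can be carried out."""
    return list(TopologicalSorter(dependencies).static_order())


order = design_order(DESIGN_DEPENDENCIES)
for step in order:
    print(step)
```

Steps 4 and 5 have no dependency on each other in this map, which mirrors the point that the access layer can be designed while the data is being prepared. Note that this design order starts from the business question, whereas data in the finished environment flows the other way, from the source systems toward the business user.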
Many different project methodologies are available for all systems’ development efforts. There are even multiple methodologies specifically targeted toward data warehousing. These have evolved over several decades. Most organizations already have adopted some type of project methodology or project life cycle. It is important to understand how your organization runs projects to ensure that the data warehouse project is adhering to the strategic direction for all information systems. Several basic building blocks are found in any methodology. These primary components, discussed throughout the book, are as follows:
• Project definition, planning, and management is outlined in Chapters 4 and 5.
• Defining business requirements is discussed in Chapter 6.
• Designing the data delivery database is covered in Chapter 7.
• Defining the architecture is discussed in Chapter 9 and includes development of the database.
• Processes for building the database are reviewed in Chapter 10.
• Developing reports/analyses and providing education/support is presented in Chapter 11 and includes deployment of the final results.