
Title: A Manager’s Guide to Data Warehousing
Author: Laura L. Reeves
Publisher: Wiley Publishing, Inc.


Laura L. Reeves

A Manager’s Guide to Data Warehousing

Wiley Computer Publishing. Timely. Practical. Reliable.

An ideal guide for the non-technical professional eager to learn more about data warehousing

A successful data warehouse project can provide immense value for business enterprises or other organizations. Building and maintaining a data warehouse demands the combined efforts of both IT and non-technical personnel. While there are plenty of resources aimed at the technology professionals who design and build data warehouses, there has to date been no useful guide written for a non-technical audience. This book fills that void and serves as an ideal resource for business and IT managers and others from the non-IT side who want to do their part to ensure data warehousing success.

This helpful book provides a solid introduction to the fundamentals of data warehousing. The author details each step of a data warehouse project, and provides a clear explanation of what’s involved in efficiently building a data warehouse and what must be done to deliver the data. You’ll examine the business management of a data warehouse and discover essential methods for cultivating a strong partnership between the business and IT elements of your organization. You can use this knowledge to be more effective when sharing your requirements and concerns during a project.

A Manager’s Guide to Data Warehousing explains what you need to create your data warehouse and establish long-term success. The book covers:

• The most common factors for ensuring data warehousing success and the roadblocks that can prevent it

• How to ensure that business and technical staff have a common understanding of the data warehouse project

• The tools you need to make certain that data is organized and can be delivered as needed

• Ways to deploy the data warehouse and ensure sustainable success

Database/Data Warehousing

LAURA L. REEVES, coauthor of The Data Warehouse Lifecycle Toolkit, has over 23 years of experience in end-to-end data warehouse development focused on developing comprehensive project plans, collecting business requirements, designing business dimensional models and database schemas, and creating enterprise data warehouse strategies and data architectures.


ISBN: 978-0-470-17638-2


Wiley Publishing, Inc.

10475 Crosspoint Boulevard

Indianapolis, IN 46256

www.wiley.com

Published simultaneously in Canada

ISBN: 978-0-470-17638-2

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Library of Congress Cataloging-in-Publication Data

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.


Laura L. Reeves started designing and implementing data warehouse solutions in 1986. Since then she has been involved in hundreds of projects. She has extensive experience in end-to-end data warehouse development, including developing comprehensive project plans, collecting business requirements, developing business dimensional models, designing database schemas (both star and snowflake designs), and developing enterprise data warehouse architecture and strategies. These have been implemented for many business functions for private and public industry.

Laura co-founded StarSoft Solutions, Inc., in 1995 and has been a faculty member with The Data Warehousing Institute since 1997. She is a contributing author of Building a Data Warehouse for Decision Support (Prentice Hall, 1996) and a co-author of the first edition of The Data Warehouse Lifecycle Toolkit (Wiley, 1998). Laura graduated magna cum laude from Alma College with a bachelor of science degree in mathematics and computer science, with departmental honors.


I have been very blessed with great family, friends, and colleagues. I would like to thank the many clients and colleagues who have challenged me, pushed me, and collaborated with me on so many initiatives over the years. I appreciate the opportunity to work with such high-quality people. I want to acknowledge the contributions that have been made to the data warehousing industry and to me personally by the amazing people who worked at Metaphor. I want to express my gratitude to my dear friend and colleague Paul Kautza for his belief in me and for all his hard work all these years.

Thanks are also due to the dedicated staff at Wiley who believed in me and had great patience to help see this project through. Thanks to Bob Elliott for being the impetus to get this project started and to Sara Shlaer and Rosanne Koneval for their detailed efforts to produce a quality product. I want to express appreciation to Cindi Howson for her insight on business intelligence tools.

I want to extend a sincere and special thank you to Jonathon Geiger for his meticulous comments and suggestions.

I also want to thank two very special people who have provided unflagging support and encouragement every step of the way: my friends Ingrid Korb and Paula Johnson. I am not sure I could have done this without you!

Of course, none of this would be possible without the dedication, sacrifice, love, and support given to me by my family: Mark, Ryan, Michael, and Leah.

Introduction
The Essentials of Data Warehousing
Differences Between Operational and DW Systems
The Data Warehousing Environment
Understanding Industry Perspectives
Design and Development Sequence
The Value of Data Warehousing
The Promises of Data Warehousing
Believing the Myth: “If You Build It, They Will Come”
Falling into the Project Deadline Trap
Failing to Uphold Organizational Discipline
Lacking Business Process Change
Narrowing the Focus Too Much
Relying on the Technology Fix
Getting the Right People Involved
Finding Lost Institutional Knowledge
Chapter 2 The Executive’s FAQ for Data Warehousing
Question: What is the business benefit of a data warehouse?
Question: How do we get started and stay focused?
Part Two The Business Side of Data Warehousing
Chapter 3 Understanding Where You Are and Finding Your Way
What Is Your Company’s Strategic Direction?
What Are the Company’s Top Initiatives?
Does the Business Place Value on Analysis?
Reflecting on Your Data Warehouse History
Understanding Your Existing Reporting Environment
Finding the Reporting Systems
Identifying the Business Purpose
Discovering the Data You Already Have
Tracking Technology and Tools
Understanding Enterprise Resources
The Call Center Data Warehouse Project
What a Partnership Really Means
What the Business Partners Should Expect to Do
Business Executives and Senior Management
The Executive Business Sponsor
Helping the Business Analyst Deal with Change
ETL Developer(s)
Business Intelligence Application Developer
Tips for Building and Sustaining a Partnership
Leveraging External Consulting
Building Strong Project Teams
Presenting in Business Terms
Executive Steering Committee
Setting Up the Project Charter
Developing a Statement of Work
Peeling Back the Layers of Requirements Gathering
Who Provides Input?
Who Gathers the Requirements?
Providing Business Requirements
Systems and Technical Requirements
Communicating What You Really Need
What Else Would Help the Project Team?
Data Integration Challenges
Assess Organizational Motivation
Complete Picture of the Data
Practical Techniques for Gathering Requirements
Interview Session Characteristics
Preparing for Interview Sessions
Conducting the Interview Sessions
Capturing Content: Notes vs. Tapes
Individual Interview Documentation
Developing Functional Specifications
Setting Attainable Goals
A Glimpse into Giant Company
The Purpose of Dimensional Models
Using Both Parts of the Model
Implementing a Dimensional Model
Diagramming Your Dimensional Model
The Business Dimensional Model
Call Center Time Tracking Fact Group
Business Dimensional Model Index
Guidelines for a Single Fact Group
Characteristics of the Model across the Enterprise
Business Participation in the Modeling Process
Preparing for Modeling Sessions
Brainstorming the Framework
Drafting the Initial Dimensions
Drafting the Initial Fact Groups
Logging Questions and Issues
Building the Business Measures Worksheet
Preliminary Source to Target Data Map
Completing or Fleshing Out the Model
Completing the Documentation
Working Through All the Data Elements
Business Reviews of the Model
Expanding Business Data Over Time
Reflecting on Business Realities: Advanced Concepts
Supporting Multiple Perspectives: Multiple Hierarchies
Tracking Changes in the Dimension: Slowly Changing
Depicting the Existence of a Relationship: Factless Fact Tables
Linking Parts of a Transaction: Degenerate Dimensions
Pulling Together Components: Junk Dimensions
Multiple Instances of a Dimension: Role Playing
Clusters of Future Attributes
Translating the Business Dimensional Model
Chapter 8 Managing Data As a Corporate Asset
What Is Information Management?
Information Management Example: Customer Data
IM Beyond the Data Warehouse
Master Data Feeds the Data Warehouse
Your Responsibilities If You Are “the Owner”
What Are IT’s Responsibilities?
Challenges with Data Ownership
How Clean Does the Data Really Need to Be?
Managing the Integrity of Data Integration
Quality Improves When It Matters
Example: Data Quality and Grocery Checkout
Develop a Realistic Strategy
Sharing the Information Management Strategy
Setting Up a Sustainable Process
The Data Governance Committee
In Real Life
Chapter 9 Architecture, Infrastructure, and Tools
Components of DW Data Architecture
A Closer Look at Common Data Warehouse Architectures
Bottom-Up Data Architecture
Publish the Data: Data Marts
Requests for Information or Proposals
Business Participation in the Selection Process
Understanding Product Genealogy
Understanding Value and Evaluating Your Options
Cutting through the Marketing Hype
Making Architecture Work for You
Chapter 10 Implementation: Building the Database
Extract, Transform, and Load (ETL) Fundamentals
Why Does the Business Need to Help?
Defining Expected Results: The Test Plan
Testing the ETL System: Is the Data Right?
Why Does It Take So Long and Cost So Much?
Balancing Requirements and Data Reality
Discovering the Flaws in Your Current Systems
Working Toward Long-Term Solutions
Manually Including Business Data
Tracking Progress: Are We There Yet?
Ensuring Continued Business Participation
What Is Business Intelligence?
Business Intelligence without a DW
Presentation: How Do You Want to See Results?
Delivery: How Do You Receive the Results?
Supporting Different Levels of Use
Construction of the BI Solution
Planning for Business Change
Design: What Needs to Be Delivered?
Learning to Use the Data without a Technical Degree
Learning about the BI Tool/Application
Ensuring That the Right Help Is Available
Chapter 12 Managing the Production Data Warehouse
Recapping the BI Application Launch
Looking Back: Did You Accomplish Your Objectives?
Getting the Rest of the Business Community on Board
Streamlining Business Processes
Staffing Production Activities
Monitoring Performance and Capacity Planning
Maintaining the Data Warehouse
Maintaining the BI Application
Tracking Questions and Problems
When the Data Warehouse Falls Short
Common Causes for a Stalled Warehouse
Jump-Starting a Stalled Data Warehouse
Determining What Can Be Salvaged
Developing a Plan to Move On
Aligning DW Objectives with Business Goals
Launching the Improved Data Warehouse and BI Solution
Lack of Support for the Production DW at Giant Co.
Unleashing BI at Agile, Inc.
Planning for Expansion and Growth
Exploring Expansion Opportunities
Managing Enterprise DW Resources
Creating an Enterprise Data Warehouse Team
The Centralized Enterprise Data Warehouse Team
The Virtual Enterprise Data Warehouse Team
Enterprise DW Team Responsibilities
Funding the Enterprise DW Team
Embedded Business Intelligence
Operational Business Intelligence
Monitoring Industry Innovation
Measuring Success One Step at a Time
Adjusting Expectations to Reality

Many executives, managers, business analysts, and nontechnical personnel are highly motivated to learn more about data warehousing. They want to understand what data warehouses are and how they work. More important, many are truly interested in doing their part to ensure success when implementing a data warehouse in their company. They are not interested in learning how to write code or tune a database.

Unfortunately, most data warehouse publications available today are written for the people who design and build them. Some are from a project management perspective and others provide a great deal of technical depth. While these are very valuable to the technical team, they do not help the nontechnical audience. This book was written to provide a resource for those nontechnical people.

Overview of the Book

The information in this book has been gathered over years of working on data warehouse projects. Hundreds of hours have been invested in learning what works well and what does not. One constant thread over the years is the need to develop and strengthen the partnership between business and systems personnel. There has always been a need to help nontechnical people in the organization understand the different parts of a data warehouse and what needs to be done to build and maintain one.

This book covers the topics and questions that come up repeatedly in executive briefings, classes, meetings, and casual conversation. It also includes coverage of topics related to how organizations frequently get into trouble. The goal is to minimize the frustration of all participants and to ultimately help organizations to build and maintain valuable data warehouse environments.

The book provides a sound introduction to data warehousing concepts and then moves on to explain the process of developing and creating a data warehouse project. A description of each step is provided, including details about how the business should participate. This book does not provide the technical level of detail that IT team members need to know, such as specific coding techniques or how to best set parameters to improve performance with a specific technology. The book does provide what is needed to understand these areas so that you are better prepared to have clear communication across the organization. By knowing what is being designed or developed, you can be more effective in sharing your ideas, requirements, and concerns.

More technical readers will benefit by gaining a complete picture of data warehousing, with an emphasis on how representatives from the business community can help you be more successful. This will provide you with ideas about how to interact more clearly with the business community. Use this as a resource; highlight the most pertinent chapters or sections that would be beneficial for your business counterparts to read.

How This Book Is Organized

This book is divided into five major parts. Each part focuses on a specific aspect of data warehousing.

The first part shares the essentials of data warehousing. This can help executives and managers get a realistic understanding of the fundamentals and gain insight into both the most common factors for success and how to avoid roadblocks.

The second part of the book takes a look at the business management of a data warehouse. This includes how to cultivate a strong partnership between the business and IT communities, what is entailed in setting up a data warehouse project, and how to effectively communicate your business requirements for the data warehouse. It is just as important for different business groups to work together in order to support a consistent view of the data.

The third part of the book focuses on the data itself. How the data is organized is critical to ensure that you, the business, can understand and exploit what is available. Without the right data, presented in a useful manner, the data warehouse will not be used, which jeopardizes the entire investment. This section of the book gives you the tools you need to ensure that the data meets your needs. Beyond how data is processed or stored, the issues surrounding how organizations view data ownership and management are also addressed.

Part four delves into data warehouse architecture, what is involved to efficiently build the data warehouse, and finally what must be done to deliver the data. While most of the work during this part of a project is more technical in nature, there is still a need for active business participation. This section will help managers and business participants understand what work is being done and how they can help.

Part five explores what is needed to launch the data warehouse and wrap up the project. Looking beyond a single project, the work needed to maintain and grow the data warehouse is discussed. The causes of, and recommendations to address, a stalled data warehouse are presented. A data warehouse can have initial success that decreases over time unless specific steps are taken. Suggestions are provided that can help you sustain your success.

Who Should Read This Book

The target audience for this book is anyone who is responsible for, working on, or paying for a data warehouse. In particular, it is targeted toward nontechnical readers in order to help them understand the basics of data warehousing, and, more importantly, how to successfully build one. For the more technical readers, this book provides the tools you need in order to be able to communicate with your business partners.

The information is presented in a layered manner. The basic concepts are presented early in the book, while more in-depth coverage is provided in later chapters. This book is designed to serve a variety of levels of need. Some readers may benefit from reading only the first part, returning later to learn more about a specific area. Other readers may benefit by reading all of the content from start to finish.

While all readers could benefit from reading the entire text, I realize that not everyone has the same passion for data warehousing. Realistically, different types of readers will benefit from various parts of the book:

Executives and senior managers should read Part 1. Then, based upon what the organization is facing, subsequent chapters may be worthwhile as different issues crop up. For example, if your company has problems getting the data loaded into the data warehouse, it would be helpful to read Chapter 10 to get a better understanding of what is really involved in building the database.

Middle managers, both business and IT, will benefit from Part 1 but will also find Part 2 to be important. These parts help you get the project set up properly and, most important, learn about how to provide business requirements. Skimming the rest of the book can help all managers learn what is available to help them with the rest of the project. Then, as needed, the specific chapters can be studied in detail when the organization is working on that part of a data warehouse initiative.

Business personnel involved with a data warehouse can skim the chapters on setting up a project, but will find it helpful to study in depth Chapter 6 to learn about providing requirements, Chapter 7 to understand how the data should be organized, and Chapter 11 to learn about how the data can be delivered. It is recommended that the rest of the book be reviewed to familiarize yourself with the content. Specific chapters can be studied in more detail when the organization is working on that area.

Everyone on the data warehouse project team should read this entire book. It can provide a common ground for dialogue and more meaningful discussions between the business and technical personnel. The project team can provide this book to their business counterparts and suggest specific chapters for them to read to better support the project.

Technical staff can also benefit from this book, which can help anyone with in-depth technical knowledge to communicate with their managers and business counterparts about data warehousing. This provides the backdrop for all interactions with the business community.

When designed and implemented properly, a data warehouse is a valuable tool for an organization to improve how it runs. Getting the right design requires active participation of knowledgeable business personnel. Keeping things on track requires the support of middle managers to ensure that everything is progressing smoothly. Executives and senior managers will be able to ask meaningful questions about data warehouse projects and be able to understand the answers. All of these things require that business and technical staff have a common understanding of the different parts of a data warehouse and what is involved in a successful project. It is hoped that this book provides the foundation you and your organization need to achieve success with all of your data warehouse initiatives.

The Essentials of Data Warehousing

In This Part

Chapter 1: Gaining Data Warehouse Success

Chapter 2: The Executive’s FAQ for Data Warehousing

There is often a disconnect between the technical side that builds and maintains a data warehouse and the business side that will use it. This book will help bridge that gap. Both business managers and IT managers will learn what is involved with building and deploying a successful data warehouse. Executives and senior managers will also find this book helpful, especially Part 1, in order to be able to provide effective oversight and support. This book will also be beneficial for all business and technical personnel involved with a data warehouse, providing a common foundation for better communication. Managers on both sides need the knowledge and information that will enable them to help their organization build and use a data warehouse most effectively, and this book is the path to that knowledge.

This chapter explains the value of a data warehouse and highlights what is needed for success. To help frame the discussion, the chapter begins with some definitions.

The Essentials of Data Warehousing

Data warehousing is not new. Most large organizations have been investing in data warehousing for years. Currently, cost-effective technology is creating more possibilities for small and medium-size companies to build and deploy data warehouse solutions too. There are many stories about wild successes, and just as many about failed projects. With so much buzz about data warehousing, it is often assumed that everyone already knows the basics. However, many people are being exposed to these concepts for the first time. To ensure a common understanding, it is worth taking the time to boil things down to the essence of data warehousing.

What Is a Data Warehouse?

A data warehouse (DW) is the collection of processes and data whose overarching purpose is to support the business with its analysis and decision-making. In other words, it is not one thing per se, but a collection of many different parts. Before looking more closely at the specific parts of a data warehouse environment, it is helpful to compare the characteristics and purpose of a data warehouse with an operational application system.

Differences Between Operational and DW Systems

Applications that run the business are called online transaction processing systems (OLTPs). OLTP systems are geared toward functions such as processing incoming orders, getting products shipped out, and transferring funds as requested. These applications must ensure that transactions are handled accurately and efficiently. No one wants to wait minutes to get cash from an automated teller machine, or to enter sales orders into a company’s system.

In contrast, the purpose of a data warehousing environment is to provide data in a format easily understood by the business community in order to support decision-making processes. The data warehouse supports looking at the business data over time to identify significant trends in buying behavior, customer retention, or changes in employee productivity. Table 1-1 lays out the primary differences between these two types of systems.

The inherent differences between the functions performed in OLTP and DW systems result in methodology, architecture, tool, and technology differences. Data warehousing emerged as an outgrowth of necessity, but has blossomed into a full-fledged industry that serves a valuable function in the business community.

Now that the differences between data warehouse and OLTP systems have been reviewed, it is time to look deeper into the makeup of the data warehouse itself.

The Data Warehousing Environment

There are many different parts of a data warehouse environment, which encompasses everything from where the data lives today through where it is ultimately used on reports and for analysis. Each of the main parts of the data warehousing environment, shown in Figure 1-1, is described in the following sections. This figure indicates how the data flows throughout the environment.

Table 1-1 Comparison of Online Transaction Processing and Data Warehousing Systems

AREA OF COMPARISON | OLTP | DATA WAREHOUSING
System purpose | Support operational processes | Support strategic analysis, performance, and exception reporting
Data usage | Capture and maintain the data | Exploit the data
Data validation | Data verification occurs upon |
 | | Data is updated by periodic, scheduled processes

Source systems, shown on the left side of Figure 1-1, are where data is created or collected by operational application systems that run the business. These are often large applications that have been in place for a long time. Examples of source systems include the following:

community. The entire process is referred to as the extract, transform, and load (ETL) process.

The database in which the data is organized to support the business is called a data mart. A data mart includes all of the data that is loaded into a single database and used together for analysis. Data marts are often developed to meet the needs of a business group such as marketing or finance. The key to a successful data mart is to create it in an integrated manner. It is also recommended that data be loaded into only one data mart and then shared across the organization to ensure data consistency.
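The three ETL stages and the loading of a data mart can be pictured with a small sketch. This is an illustration under stated assumptions, not the book's method: the source rows, the `sales_mart` table, and the cleanup rules (trimming names, converting dates to year-month-day order) are all hypothetical, and an in-memory SQLite database stands in for the data mart.

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system.
SOURCE_ROWS = [
    {"cust": " acme ", "sale_date": "01/15/2008", "amount": "100.50"},
    {"cust": "Bolt", "sale_date": "02/03/2008", "amount": "75"},
]


def extract():
    """Extract: pull the raw rows out of the source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: tidy names, standardize dates, convert amounts to numbers."""
    cleaned = []
    for row in rows:
        month, day, year = row["sale_date"].split("/")
        cleaned.append(
            (
                row["cust"].strip().title(),
                f"{year}-{month}-{day}",
                float(row["amount"]),
            )
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the data mart table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_mart "
        "(customer TEXT, sale_date TEXT, amount REAL)"
    )
    with conn:  # commit all rows together
        conn.executemany("INSERT INTO sales_mart VALUES (?, ?, ?)", rows)


mart = sqlite3.connect(":memory:")
load(mart, transform(extract()))
loaded = mart.execute(
    "SELECT customer, sale_date, amount FROM sales_mart ORDER BY sale_date"
).fetchall()
```

Real ETL work involves far more sources and rules than this, but the shape is the same: data is taken from where it lives, made consistent, and placed where the business can use it.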

Finally, an application or reporting layer is provided to facilitate access and analysis of the data. This is where business users access reports, dashboards, and analytical applications. Collections of these reports and analyses are called business intelligence.

There is one more critical concept that warrants some attention: the mechanism used to help organize data, which is called a data model.

What Is a Data Model?

A data model is an abstraction of how individual data elements relate to each other. It visually depicts how the data is to be organized and stored in a database. A data model provides the mechanism for documenting and understanding how data is organized.

There are many different types of data modeling, each with a specific goal and purpose. As organizations modified how data was structured to support reporting and analysis, a new data modeling technique, now called dimensional modeling, emerged. Ralph Kimball, a pioneer in data warehousing, can be credited with crystallizing these techniques and publishing them for the benefit of the industry. (For more information about dimensional modeling, see Chapter 7.)
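As a rough preview of what a dimensional model can look like once it is implemented, here is a minimal sketch. It is not drawn from Chapter 7. It assumes a common layout in which one table of measurements (a fact table) is surrounded by descriptive tables (dimensions); the table names, column names, and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimensions: the descriptive context the business uses to slice data.
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);

    -- Fact table: the measurements, each linked to its dimensions.
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        sales_amount REAL
    );

    INSERT INTO dim_date VALUES (1, '2008-01'), (2, '2008-02');
    INSERT INTO dim_product VALUES (10, 'Widgets'), (20, 'Gadgets');
    INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 20, 40.0), (2, 10, 60.0);
    """
)


def sales_by_month_and_category(conn):
    """Summarize the facts using attributes taken from the dimensions."""
    return conn.execute(
        """
        SELECT d.month, p.category, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY d.month, p.category
        ORDER BY d.month, p.category
        """
    ).fetchall()


summary = sales_by_month_and_category(conn)
```

The point of the layout is that a business question such as "sales by month and product category" maps directly onto the tables: the measure comes from the fact table and the labels come from the dimensions.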

The data and processes to perform the work shown in Figure 1-1 are collectively called the data warehouse. These basic concepts have been fine-tuned and relabeled by many different players in the data warehousing field. The two most common philosophies are discussed in the next section.

Understanding Industry Perspectives

At the end of the day, everyone faces the same challenge: getting the data into the hands of the business user to turn it into information that can be used to make decisions. The definitions provided so far provide the backdrop for how terms are used in this book. There are many brilliant and talented people in the data warehousing industry, many of whom have different philosophies about how to design and build a data warehouse. It is worthwhile to look more closely at two of the most frequently used philosophies.

The first is from Ralph Kimball and colleagues, as described in The Data Warehouse Lifecycle Toolkit, Second Edition (Wiley, 2008). Ralph Kimball is a clear thought leader in the data warehousing industry and has written several books that provide detailed information essential for practitioners. That book describes the enterprise data warehouse as:

The complete end-to-end data warehouse and business intelligence system (DW/BI System). Although some would argue that you can theoretically deliver business intelligence without a data warehouse and vice versa, that is ill-advised from our perspective. Linking the two together in the DW/BI acronym reinforces their dependency. Independently, we refer to the queryable data in your DW/BI system as the enterprise data warehouse, and value-add analytics as BI (business intelligence) applications.

A second definition worth looking at is from Bill Inmon, a prolific author and another leader in the data warehousing industry, from Building the Data Warehouse, Fourth Edition (Wiley, 2005):

The data warehouse is a collection of integrated subject-oriented databases designed to support the DSS (decision support system) function, where each unit of data is relevant to some moment in time. The data warehouse contains atomic data and lightly summarized data.

Bill’s definition is also described and expanded by Claudia Imhoff and colleagues in Mastering Data Warehouse Design (Wiley, 2003):


It [the DW] is the central point of data integration for business intelligence and is the source of data for the data marts, delivering a common view of enterprise data.

This second viewpoint is incomplete without also including their definition of a data mart. Again, expanding on Bill Inmon’s definition, Claudia states in Mastering Data Warehouse Design:

A data mart is a departmentalized structure of data feeding from the data warehouse where data is denormalized [organized] based on the department’s need for information. It utilizes a common enterprise view of strategic data and provides business units with more flexibility, control, and responsibility. The data mart may or may not be on the same server or location as the data warehouse.
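To make "feeding from the data warehouse" and "denormalized" more concrete, here is a small sketch under stated assumptions: the `warehouse` dictionary, its table and field names, and the function `build_sales_mart` are hypothetical and exist only for illustration. The warehouse keeps customers, products, and orders in separate structures; the mart flattens them into one wide row per order so a sales department can read it without performing any lookups.

```python
# Hypothetical warehouse: integrated data stored once, in separate structures.
warehouse = {
    "customers": {
        1: {"name": "Acme", "region": "East"},
        2: {"name": "Globex", "region": "West"},
    },
    "products": {
        10: {"name": "Widget", "category": "Hardware"},
    },
    "orders": [
        {"order_id": 100, "customer_id": 1, "product_id": 10, "amount": 250.0},
        {"order_id": 101, "customer_id": 2, "product_id": 10, "amount": 75.0},
    ],
}


def build_sales_mart(wh):
    """Denormalize: copy customer and product details onto every order row."""
    mart = []
    for order in wh["orders"]:
        customer = wh["customers"][order["customer_id"]]
        product = wh["products"][order["product_id"]]
        mart.append({
            "order_id": order["order_id"],
            "customer_name": customer["name"],
            "region": customer["region"],
            "product_name": product["name"],
            "category": product["category"],
            "amount": order["amount"],
        })
    return mart


sales_mart = build_sales_mart(warehouse)
```

Because every mart built this way draws on the same warehouse records, each department sees the same customer and product descriptions, which is the "common enterprise view" the definition refers to.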

To bring this second viewpoint into the proper context, Mastering Data Warehouse Design further defines business intelligence:

Business intelligence is the set of processes and data structures used to analyze data and information used in strategic decision support. The components of Business Intelligences are the data warehouse, data marts, the DSS (decision support system) interface and the processes to ‘get data in’ to the data warehouse and to ‘get information out’.

The single definition provided by Ralph Kimball is comprehensive. You must look at the full set of definitions set forth by Bill Inmon and Claudia Imhoff to fully understand their perspective. There is much more common ground between these differing philosophies than perceived at first glance. While there are distinct differences, the common theme is that data warehousing must provide the method to prepare and deliver data to the business community to support reporting and analysis. Chapter 9 provides a comprehensive discussion about these different approaches to data warehousing.

The key point here is that there are multiple ways these terms can be interpreted. Understanding which definition is being used is critical to being able to understand what is being discussed and worked on. An organization can avoid confusion by selecting one set of definitions to be used, which enables everyone to use a common language.

Regardless of labels and terminology, all data warehouse initiatives are trying to accomplish the same thing. Now that the basic parts of the data warehouse have been defined, it is time to look at the order in which they are created.

Design and Development Sequence

Earlier in this chapter, you looked at how data flows through the data warehouse environment. While this correctly illustrates how data flows in the completed environment, this is not the recommended sequence for designing and developing a data warehouse. A better way to design the environment is to start from the business user perspective. Figure 1-2 shows the correct order to successfully design and implement a data warehousing environment. Both the technical and business team members play a role throughout. Chapter 4 describes the different roles and responsibilities. Each step in the design process is described as follows:

1. An understanding of what the business is trying to accomplish and how success is measured should be the foundation for all data warehousing initiatives. The starting point for designing the data warehouse is with the business community. Chapter 6 covers what you need to know to effectively provide business requirements.

2. Once the business requirements are understood, the data in the underlying source systems needs to be studied. Many business people have a vision for what they want to do, but it is not always tied to the reality of the organization’s actual data. In preparation for modeling data, Chapter 7 introduces techniques to help you understand your data.

[Figure 1-2 Optimal data warehouse design and development sequence. Diagram labels: Business Question or Problem; Input to Data Delivery Design; Data Organized to Support the Business; Design & build processes; Source System Data.]


3. The foundation for successful data warehousing, now and into the future, is properly structuring the data. Data must be organized to support the business perspective. This provides ease of use and improved query performance. This design is created based on a knowledge of the business requirements, as well as the reality of the existing data. Chapter 7 focuses on how the business and technical team members can work together to develop appropriate data models for this data delivery layer.

4. After defining how the data will be organized, the design for getting the data from the source systems to the database can be created. Decisions about the architecture and tools needed to prepare the data can be made in the proper context. Too often these decisions are made before you know what is to be delivered. Chapter 9 provides the background needed to understand data warehouse architecture and technology. Chapter 10 deals with the challenges of preparing the data for business reporting and analytical use.

5. While the data is being prepared, the data access and application layer can be designed. This includes the design of basic reports, business intelligence, and analytical applications, and performance dashboards or other end user tools. Chapter 11 focuses on final delivery of data to the business community.
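The five steps above can be read as a small dependency graph. The sketch below, which uses the standard-library `graphlib` module (Python 3.9 or later), is one possible reading rather than the book's own notation: the step labels are paraphrased, and the assumption that step 5 depends only on step 3 comes from the remark that the access layer can be designed "while the data is being prepared." The function `design_waves` is a hypothetical helper that groups steps into rounds that could run at the same time.

```python
from graphlib import TopologicalSorter

# Each step maps to the steps that must be finished before it can start.
# These dependencies are an interpretation of the numbered list, not a quote.
design_dependencies = {
    "1. business requirements": set(),
    "2. source data analysis": {"1. business requirements"},
    "3. data delivery model": {
        "1. business requirements",
        "2. source data analysis",
    },
    "4. data preparation design": {"3. data delivery model"},
    "5. data access and applications": {"3. data delivery model"},
}


def design_waves(dependencies):
    """Group steps into successive rounds; steps in one round can overlap."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = design_waves(design_dependencies)
```

Under these assumptions the first three steps come out strictly one after another, while steps 4 and 5 land in the same round. The design work therefore starts with the business and reaches the source-to-database processes late, the reverse of the order in which data physically flows.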

Many different project methodologies are available for all systems’ development efforts. There are even multiple methodologies specifically targeted toward data warehousing. These have evolved over several decades. Most organizations already have adopted some type of project methodology or project life cycle. It is important to understand how your organization runs projects to ensure that the data warehouse project is adhering to the strategic direction for all information systems. Several basic building blocks are found in any methodology. These primary components, discussed throughout the book, are as follows:

• Project definition, planning and management is outlined in Chapters 4 and 5.

• Defining business requirements is discussed in Chapter 6.

• Designing the data delivery database is covered in Chapter 7.

• Defining the architecture is discussed in Chapter 9 and includes development of the database.

• Processes for building the database are reviewed in Chapter 10.

• Developing reports/analyses and providing education/support is presented in Chapter 11 and includes deployment of the final results.

