
Title: The Data Warehouse eBusiness DBA Handbook
Authors: Donald K. Burleson, Joseph Hudicka, William H. Inmon, Craig Mullins, Fabian Pascal

The Data Warehouse eBusiness DBA Handbook


By Donald K. Burleson, Joseph Hudicka, William H. Inmon, Craig Mullins, Fabian Pascal

Copyright © 2003 by BMC Software and DBAzine. Used with permission.

Printed in the United States of America

Series Editor: Donald K. Burleson

Production Manager: John Lavender

Production Editor: Teri Wade

Cover Design: Bryan Hoff

Printing History:

August, 2003 for First Edition

Oracle, Oracle7, Oracle8, Oracle8i and Oracle9i are trademarks of Oracle Corporation.

Many of the designations used by computer vendors to distinguish their products are claimed as trademarks. All names known to Rampant TechPress to be trademark names appear in this text as initial caps.

The information provided by the authors of this work is believed to be accurate and reliable, but because of the possibility of human error by our authors and staff, BMC Software, DBAZine and Rampant TechPress cannot guarantee the accuracy or completeness of any information included in this work and are not responsible for any errors, omissions or inaccurate results obtained from the use of information or scripts in this work.

Links to external sites are subject to change; DBAZine.com, BMC Software and Rampant TechPress do not control or endorse the content of these external web sites, and are not responsible for their content.

ISBN 0-9740716-2-5


Table of Contents

Conventions Used in this Book

About the Authors

Foreword

Chapter 1 - Data Warehousing and eBusiness
  Making the Most of E-business by W. H. Inmon

Chapter 2 - The Benefits of Data Warehousing
  The Data Warehouse Foundation by W. H. Inmon
    References

Chapter 3 - The Value of the Data Warehouse
  The Foundations of E-Business by W. H. Inmon
    Why the Internet?
    Intelligent Messages
    Integration, History and Versatility
    The Value of Historical Data
    Integrated Data
    Looking Smarter

Chapter 4 - The Role of the eDBA
  Logic, e-Business, and the Procedural eDBA by Craig S. Mullins
    The Classic Role of the DBA
    The Trend of Storing Process With Data
    Database Code Objects and e-Business
    Database Code Object Programming Languages
    The Duality of the DBA
    The Role of the Procedural DBA
    Synopsis

Chapter 5 - Building a Solid Information Architecture
  How to Select the Optimal Information Exchange Architecture by Joseph Hudicka
    Introduction
    The Main Variables to Ponder
    Data Volume
    Available System Resources
    Transformation Requirements
    Frequency
    Optimal Architecture Components
    Conclusion

Chapter 6 - Data 101
  Getting Down to Data Basics by Craig S. Mullins
    Data Modeling and Database Design
    Physical Database Design
    The DBA Management Discipline
    The 17 Skills Required of a DBA
    Meeting the Demand

Chapter 7 - Designing Efficient Databases
  Design and the eDBA by Craig S. Mullins
    Living at Web Speed
    Database Design Steps
    Database Design Traps
    Taming the Hostile Database

Chapter 8 - The eBusiness Infrastructure
  E-Business and Infrastructure by W. H. Inmon

Chapter 9 - Conforming to Your Corporate Structure
  Integrating Data in the Web-Based E-Business Environment by W. H. Inmon

Chapter 10 - Building Your Data Warehouse
  The Issues of the E-Business Infrastructure by W. H. Inmon
    Large Volumes of Data
    Performance
    Integration
    Addressing the Issues

Chapter 11 - The Importance of Data Quality Strategy
  Develop a Data Quality Strategy Before Implementing a Data Warehouse by Joseph Hudicka
    Data Quality Problems in the Real World
    Why Data Quality Problems Go Unresolved
    Fraudulent Data Quality Problems
    The Seriousness of Data Quality Problems
    Data Collection
    Solutions for Data Quality Issues
    Option 1: Integrated Data Warehouse
    Option 2: Value Rules
    Option 3: Deferred Validation
    Periodic sampling averts future disasters
    Conclusion

Chapter 12 - Data Modeling and eBusiness
  Data Modeling for the Data Warehouse by W. H. Inmon
    "Just the Facts, Ma'am"
    Modeling Atomic Data
    Through Data Attributes, Many Classes of Subject Areas Are Accumulated
    Other Possibilities - Generic Data Models
    Design Continuity from One Iteration of Development to the Next

Chapter 13 - Don't Forget the Customer
  Interacting with the Internet Viewer by W. H. Inmon
    IN SUMMARY

Chapter 14 - Getting Smart
  Elasticity and Pricing: Getting Smart by W. H. Inmon
    Historically Speaking
    At the Price Breaking Point
    How Good Are the Numbers
    How Elastic Is the Price
    Conclusion

Chapter 15 - Tools of the Trade: Java
  The eDBA and Java by Craig S. Mullins
    What is Java?
    Why is Java Important to an eDBA?
    How can Java improve availability?
    How Will Java Impact the Job of the eDBA?
    Resistance is Futile
    Conclusion

Chapter 16 - Tools of the Trade: XML
  New Technologies of the eDBA: XML by Craig S. Mullins
    What is XML?
    Some Skepticism
    Integrating XML
    Defining the Future Web

Chapter 17 - Multivalue Database Technology Pros and Cons
  MultiValue Lacks Value by Fabian Pascal
    References

Chapter 18 - Securing your Data
  Data Security Internals by Don Burleson
    Traditional Oracle Security
    Concerns About Role-based Security
    Closing the Back Doors
    Oracle Virtual Private Databases
    Procedure Execution Security
    Conclusion

Chapter 19 - Maintaining Efficiency
  eDBA: Online Database Reorganization by Craig S. Mullins
    Reorganizing Tablespaces
    Online Reorganization
    Synopsis

Chapter 20 - The Highly Available Database
  The eDBA and Data Availability by Craig S. Mullins
    The First Important Issue is Availability
    What is Implied by e-vailability?
    The Impact of Downtime on an e-business
    Conclusion

Chapter 21 - eDatabase Recovery Strategy
  The eDBA and Recovery by Craig S. Mullins
    eDatabase Recovery Strategies
    Recovery-To-Current
    Point-in-Time Recovery
    Transaction Recovery
    Choosing the Optimum Recovery Strategy
    Database Design
    Reducing the Risk

Chapter 22 - Automating eDBA Tasks
  Intelligent Automation of DBA Tasks by Craig S. Mullins
    Duties of the DBA
    A Lot of Effort
    Intelligent Automation
    Synopsis

Chapter 23 - Where to Turn for Help
  Online Resources of the eDBA by Craig S. Mullins
    Usenet Newsgroups
    Mailing Lists
    Websites and Portals
    No eDBA Is an Island

Conventions Used in this Book

It is critical for any technical publication to follow rigorous standards and employ consistent punctuation conventions to make the text easy to read.

However, this is not an easy task. Within Oracle there are many types of notation that can confuse a reader. Some Oracle utilities such as STATSPACK and TKPROF are always spelled in CAPITAL letters, while Oracle parameters and procedures have varying naming conventions in the Oracle documentation.

It is also important to remember that many Oracle commands are case sensitive, and are always left in their original executable form, and never altered with italics or capitalization.

Hence, all Rampant TechPress books follow these conventions:

Parameters - All Oracle parameters will be lowercase italics. Exceptions to this rule are parameter arguments that are commonly capitalized (KEEP pool, TKPROF); these will be left in ALL CAPS.

Variables - All PL/SQL program variables and arguments will also remain in lowercase italics (dbms_job, dbms_utility).

Tables & dictionary objects - All data dictionary objects are referenced in lowercase italics (dba_indexes, v$sql). This includes all v$ and x$ views (x$kcbcbh, v$parameter) and dictionary views (dba_tables, user_indexes).

SQL - All SQL is formatted for easy use in the code depot, and all SQL is displayed in lowercase. The main SQL terms (select, from, where, group by, order by, having) will always appear on a separate line.

Programs & Products - All products and programs that are known to the author are capitalized according to the vendor specifications (IBM, DBXray, etc.). All names known by Rampant TechPress to be trademark names appear in this text as initial caps. References to UNIX are always made in uppercase.
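As an illustration of the SQL convention in the list above, here is a minimal sketch of a query laid out that way: all lowercase, with each main term starting its own line. The sales table, its columns and its rows are hypothetical, and Python's built-in sqlite3 module stands in for an Oracle database purely so the example can be run anywhere.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the layout convention.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales (region text, amount real)")
conn.executemany(
    "insert into sales values (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# Lowercase SQL, with each main term (select, from, where,
# group by, having, order by) on a separate line.
query = """
select
   region,
   sum(amount) as total
from
   sales
where
   amount > 0
group by
   region
having
   sum(amount) > 80
order by
   region
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Only the layout is the point here; the same formatting applies unchanged to queries against Oracle dictionary views.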


About the Authors

Bill Inmon is universally recognized as the "father of the data warehouse." He has more than 26 years of database technology management experience and data warehouse design expertise, and has published 36 books and more than 350 articles in major computer journals. He is known globally for his seminars on developing data warehouses and has been a keynote speaker for many major computing associations. Inmon has consulted with a large number of Fortune 1000 clients, offering data warehouse design and database management services. For more information, visit www.BillInmon.com or call (303) 221-4000.

Joseph Hudicka is the founder of the Information Architecture Team, an organization that specializes in data quality, data migration, and ETL. Winner of the ODTUG Best Speaker award for the Spring 1999 conference, Joseph is an internationally recognized speaker at ODTUG, OOW, IOUG-A, TDWI and many local user groups. Joseph coauthored Oracle8 Design Using UML Object Modeling for Osborne/McGraw-Hill & Oracle Press, and has also written or contributed to several articles for publication in DMReview, Intelligent Enterprise and The Data Warehousing Institute (TDWI).

Craig S. Mullins is a director of technology planning for BMC Software. He has over 15 years of experience dealing with data and database technologies. He is the author of the book DB2 Developer's Guide (now available in a fourth edition that covers up to and includes the latest release of DB2 - Version 6) and is working on a book about database administration practices (to be published this year by Addison Wesley). Craig can be reached via his Website at www.craigsmullins.com or at craig_mullins@bmc.com.

Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for 20 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on data management technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, and IRS. He is founder, editor and publisher of DATABASE DEBUNKINGS (http://www.dbdebunk.com/), a Web site dedicated to dispelling persistent fallacies, flaws, myths and misconceptions prevalent in the IT industry (Chris Date is a senior contributor). Author of three books, he has published extensively in most trade publications, including DM Review, Database Programming and Design, DBMS, Byte, Infoworld and Computerworld. He is author of the contrarian columns Against the Grain, Setting Matters Straight, and for The Journal of Conceptual Modeling. His third book, Practical Issues in Database Management, serves as text for his seminars.


Foreword

With the advent of cheap disk I/O subsystems, it is finally possible for database professionals to have databases store multiple billions and even multiple trillions of bytes of information. As the size of these databases increases to behemoth proportions, it is the challenge of the database professionals to understand the correct techniques for loading, maintaining, and extracting information from very large database management systems. The advent of cheap disks has also led to an explosion in business technology, where even the most modest financial investment can bring forth an online system with many billions of bytes. It is imperative that the business manager understand how to manage and control large volumes of information while at the same time providing the consumer with high-volume throughput and sub-second response time.

This book provides you with insight into how to build the foundation of your eBusiness application. You’ll learn the importance of the Data Warehouse in your daily operations. You’ll gain insight into how to properly design and build your information architecture to handle the rapid growth that eCommerce business sees today. Once your system is up and running, it must be maintained. This text explains how to maintain online data systems to reduce downtime. Keeping your online data secure is another big issue with online business. To wrap things up, you’ll get links to some of the best online resources on Data Warehousing.

The purpose of this book is to give you significant insights into how you can manage and control large volumes of data. As the technology has expanded to support terabyte data capacity, the challenge to the database professionals is to understand effective techniques for the loading and maintaining of these very large database systems. This book brings together some of the world's foremost authors on data warehousing in order to provide you with the insights that you need to be successful in your data warehousing endeavors.


Chapter 1 - Data Warehousing and eBusiness

Making the Most of E-business

Everywhere you look today, you see e-business. In the trade journals. On TV. In the Wall Street Journal. Everywhere. And the message is that if your business is not e-business enabled, you will be behind the curve.

So what is all the fuss about? Behind the corporate push to get into e-business is a Web site. Or multiple Web sites. The Web site allows your corporation to have a reach into the marketplace that is direct and far reaching. Businesses that would never have entertained entry to foreign marketplaces and other marketplaces that are hard to access suddenly have easy and cheap presence. In a word, e-business opens up possibilities that previously were impractical or even impossible.

So the secret to e-business is a Web site. Right? Well, almost. Indeed, a Web site is a wonderful delivery mechanism. The Web site allows you to go where you might not have ever been able to go before. But after all is said and done, a Web site is merely a delivery mechanism. To be effective, the delivery mechanism must be allied with the application of strong business propositions. There is a way of expressing this: opportunity = delivery mechanism + business proposition.

Figure 1: The web site is at the heart of e-Business

To illustrate the limitations of a Web site, consider the personal Web sites that many people have created. If there were any inherent business advantage to having a Web site, then these personal sites would be achieving business results for their owners. But no one thinks that just putting up a Web site produces results. It is what you do with the Web site that counts.

To exploit the delivery mechanism that is the Web environment, applications are necessary. There are many kinds of applications that can be adapted to the Web environment. But the most potent, most promising applications are a class called Customer Relationship Management (CRM) applications. CRM applications have the capability of producing very important business results. Executed properly, CRM applications:

protect market share

gain new market share

increase revenues

increase profits

And there's not a business around that doesn't want to do these things.

So what kind of applications are we talking about here? There are many different flavors. Typical CRM applications include:

yield management

credit scoring, and so forth

In short, there are many different ways that applications can be created to absolutely maximize the effectiveness of the Web. Stated differently, without these applications, the Web environment is just another Web site.

And there are other related non-CRM applications that can improve the bottom line of business as well. These applications include:

quality control

profitability analysis

destination analysis (for airlines)

purchasing consolidation, and the like

In short, once the Web is enabled by supporting applications, very real business advantage occurs.

But applications do not just happen by themselves. Applications such as CRM and others are built on a foundation of data called a data warehouse. The data warehouse is at the center of an infrastructure called the "corporate information factory." Figure 2 shows the corporate information factory and the Web environment.

Figure 2: Sitting behind the web site is the infrastructure called the "corporate information factory"

Figure 2 shows that the Web environment serves as a conduit into the corporate information factory. The corporate information factory provides a variety of important functions for the Web environment:

the corporate information factory enables the Web environment to gather and manage an unlimited amount of data

the corporate information factory creates an environment where sweeping business patterns can be detected and analyzed

the corporate information factory provides a place where Web-based data can be integrated with other corporate data

the corporate information factory makes edited and integrated data quickly available to the Web environment, and so forth

In a word, the corporate information factory provides the background infrastructure that turns the Web from a delivery mechanism into a truly powerful tool. The different components of the corporate information factory are:

the data warehouse

the corporate ODS

integrated data

historical data

corporate data

A convenient way to think of the data warehouse is as a structure that contains very fine grains of sand. Different applications take those grains of sand and reshape them into the form and structure that is most familiar to the organization.
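The grains-of-sand picture can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (region, product, amount) and the reshape helper are hypothetical and do not come from the text. It shows one set of granular records being summarized in two different ways without going back to the source systems.

```python
from collections import defaultdict

# Hypothetical granular records: one row per sale, at the finest level of detail.
grains = [
    {"region": "Europe", "product": "cars", "amount": 120},
    {"region": "Europe", "product": "parts", "amount": 30},
    {"region": "Asia", "product": "cars", "amount": 90},
    {"region": "Asia", "product": "parts", "amount": 45},
]


def reshape(records, key):
    """Sum the amount of every granular record, grouped by the chosen attribute."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


by_region = reshape(grains, "region")    # one application's view of the data
by_product = reshape(grains, "product")  # another application's view of the same data

print(by_region)
print(by_product)
```

Because the detail is kept, a third view (by month, by customer, and so on) would need only another grouping key, not another extraction.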

One of the issues that frequently arises with applications for the Web is whether it is necessary to have a data warehouse in support of the applications. Strictly speaking, it is not necessary to have a data warehouse in support of the applications that run on the Web. Figure 3 shows that different applications have been built from the legacy foundation.

Figure 3: Building applications without a data warehouse

In Figure 3, multiple applications have been built from the same supporting applications. Looking at Figure 3, it becomes clear that the same processing (accessing data, gathering data, editing data, cleansing data, merging data and integrating data) is done for every application. Almost all of the processing shown is redundant. There is no need for every application to repeat what every other application has done. Figure 4 shows that by building a data warehouse, the repetitive activities are done just once.

Figure 4: Building a data warehouse for the different applications

In Figure 4, the infrastructure activities of accessing data, gathering data, editing data, cleansing data, merging data and integrating data are done once. The savings are obvious. But there are some other powerful reasons why building a data warehouse makes sense:

when it comes time to build a new application, with a data warehouse in place the application can be constructed quickly; with no data warehouse in place, the infrastructure has to be built again

if there is a discrepancy in values, with a data warehouse those values can be resolved easily and quickly

the resources required for access of legacy data are minimal when there is a data warehouse; when there is no data warehouse, the resources required for the access of legacy data grow with each new application, and so forth

In short, when an organization takes a long-term perspective, the data warehouse at the center of the corporate information factory is the only way to fly.
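The redundancy argument can be made concrete with a rough count. The sketch below is a simplification, not a cost model: it assumes each of the six infrastructure activities named above counts as one unit of work, and the function names are invented for illustration.

```python
# The six infrastructure activities named in the text.
INFRASTRUCTURE_STEPS = ["access", "gather", "edit", "cleanse", "merge", "integrate"]


def steps_without_warehouse(num_applications):
    """Every application repeats every infrastructure step against legacy data."""
    return num_applications * len(INFRASTRUCTURE_STEPS)


def steps_with_warehouse(num_applications):
    """The steps run once to load the warehouse; applications then only read it,
    so the count does not depend on the number of applications."""
    return len(INFRASTRUCTURE_STEPS)


for n in (1, 3, 10):
    print(n, steps_without_warehouse(n), steps_with_warehouse(n))
```

Under these assumptions the work without a warehouse grows with every new application, while the work with a warehouse stays flat.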

It is intuitively obvious that a foundation of integrated historical granular data is useful for competitive advantage. But one step beyond intuition, the question must be asked: exactly how can integrated historical data be turned into competitive advantage? It is the purpose of the articles to follow to explain how integrated historical data can be turned into competitive advantage and how that competitive advantage can be delivered through the Web.


Chapter 2 - The Benefits of Data Warehousing

The Data Warehouse Foundation

The Web-based e-business environment has tremendous potential. The Web is a tremendously powerful medium for delivery of information. But there is nothing intrinsically powerful about the Web other than its ability to deliver information. In order for the Web-based e-business environment to deliver its full potential, the Web-based environment requires an infrastructure in support of its information processing needs. The infrastructure that best supports the Web is called the corporate information factory.

At the center of the corporate information factory is a data warehouse.

Fig 1 shows the basic infrastructure supporting the Web-based e-business environment.

Figure 1: the web environment and the supporting infrastructure

The heart of the corporate information factory is the data warehouse. The data warehouse is the place where corporate granular integrated historical data resides.

The data warehouse serves many functions, but the most important function it serves is that of making information available cheaply and quickly. Stated differently, without a data warehouse the cost of information goes sky high and the length of time required to get information is exceedingly long. If the Web-based e-business environment is to be successful, it is necessary to have information that is cheap to access and immediately available.

How does the data warehouse lower the cost of getting information? And how does the data warehouse greatly accelerate the speed with which information is available? These issues are not immediately obvious when looking at the structure of the corporate information factory.

In order to explain how the data warehouse accomplishes its important functions, consider the seemingly innocent request for information in a manufacturing environment where there is no data warehouse. A financial analyst wants to find out what corporate sales were for the last quarter. Is this a reasonable request for information? Absolutely. Now, what is required to get that information?

Figure 2: getting information from applications

Fig 2 shows that many different sources have to be accessed to get the desired information. Some of the data is in IMS; some is in VSAM. Yet other files are in ADABAS. The key structure of the European file is different from the key structure of the Asian file. The parts data uses different closing dates than the truck data. The body design for cars is called one thing in the cars file and another thing in the parts file. To get the required information takes lots of analysis, access to 10 programs and the ability to integrate the data. Moreover, it takes six months to deliver the information at a cost of $250,000.

These numbers are typical for a mid-sized to large corporation. In some cases these numbers are very much understated. But the real issue isn't the costs and length of time required for accessing data. The real issue is how many resources are needed for accessing many units of information.

Fig 3 shows that seven different types of information have been requested.

Figure 3: getting information from applications for seven different reports

The costs that were described for Fig 2 now are multiplied by seven (or whatever number of units of data are required). As the analyst is developing the procedures for getting the unit of information required, no thought is given to getting information for other units of information. Therefore each time a new piece of information is required, the process described in Fig 2 begins all over again. As a result, the cost of information spikes dramatically.

But suppose, for example, that this organization had a data warehouse. And suppose the organization had a request for seven units of information. What would it cost to get that information and how long would it take?

Fig 4 illustrates this scenario.

Figure 4: making a report from a data warehouse

Once the data warehouse is built, it can serve multiple requests for information. The granular integrated data that resides in the data warehouse is ideal for being shaped and reshaped. One analyst can look at the data one way; another analyst can look at the same data in yet another way. And you only have to create the infrastructure once. The financial analyst may spend 30 minutes tracking down a unit of data, such as consolidated sales. Or if the data is difficult to calculate it may take a day to get the job done. Depending on the complexity and how costs are calculated, it may cost between $100 and $1,000 to access the data. Compare that price range to what it might cost at an organization with no data warehouse, and it becomes obvious why a data warehouse makes data available quickly and cheaply.
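As a rough worked comparison, the figures quoted above can be put side by side. The sketch below uses only the numbers given in the text ($250,000 per request without a warehouse, $100 to $1,000 per request with one) and assumes that costs scale linearly with the number of requests. It leaves out the one-time cost of building the warehouse, which the text does not quantify.

```python
COST_PER_REQUEST_NO_WAREHOUSE = 250_000   # dollars, from the Fig 2 example
COST_RANGE_WITH_WAREHOUSE = (100, 1_000)  # dollars per request, range quoted in the text


def cost_without_warehouse(requests):
    """Each request rebuilds the access infrastructure from scratch."""
    return requests * COST_PER_REQUEST_NO_WAREHOUSE


def cost_range_with_warehouse(requests):
    """Low and high totals for access only; excludes the one-time cost of
    building the warehouse, which the text does not quantify."""
    low, high = COST_RANGE_WITH_WAREHOUSE
    return requests * low, requests * high


requests = 7  # the seven units of information in the example
print(cost_without_warehouse(requests))     # 1750000
print(cost_range_with_warehouse(requests))  # (700, 7000)
```

Even at the top of the quoted range, seven requests against a warehouse cost a small fraction of a single request without one, which is the gap the warehouse build has to be weighed against.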

Of course the real difference between having a data warehouse and not having one lies in not having to build the infrastructure required for accessing the data. With a data warehouse, you build the infrastructure only once. With no data warehouse, you have to build at least part of the infrastructure every time you want new data.

In reality, however, no company goes looking for just one piece of data. In fact, it's quite the opposite - most companies require many forms of data. And the need for new forms and structures of data is recreated every day. When it comes to looking at the larger picture - not the cost of data for a single item, but the cost of data for all data - the data warehouse greatly eases the burden placed on the information systems organization. Fig 5 shows the difference between having a data warehouse and not having a data warehouse in the case of finding multiple types of data.

Figure 5: making seven reports from a data warehouse

Looking at Fig 5, it's obvious that a data warehouse really does lower the cost of getting information and greatly accelerates the rate at which data can be found.

But organizations have a habit of not looking at the big picture, preferring instead to focus on immediate needs. They look only up to next Tuesday and not an hour beyond it. What do short-sighted organizations see? The comparison between the data warehouse infrastructure and the need for a single unit of information. Fig 6 shows this comparison.

Figure 6: when all you are looking at is a single report it appears that it is less expensive to get it from applications directly and not build a data warehouse

When looking at the diagram in Fig 6, the short-term approach of not building a data warehouse is attractive. The organization thinks only of the quick fix. And in the very short term, it is less expensive just to dive in and get data from applications without building a data warehouse. There are a hundred excuses the corporation has for not looking to the long term:

The data warehouse is so big

We heard that data warehouses don't really work

All we need is some quick and dirty information

I don't have time to build a data warehouse

If I build a data warehouse and pay for it, one of my neighbors will use the data later on and they don't have to pay for it, and so forth

As long as a corporation insists on having nothing but a short-term focus, it will never build a data warehouse. But the minute the corporation takes a long-term look, the future becomes an entirely different picture. Fig 7 shows the long-term focus.

Figure 7: when you look at the larger picture you see that building a data warehouse saves huge amounts of resources

Fig 7 shows that when the long-term needs for information are considered, the data warehouse is far and away less expensive than the series of short-term efforts. And the length of time for access to information is an intangible whose worth is difficult to measure. No one disputes that information today, right now, is much more effective than information six months from now. In fact, six months from now I will have forgotten why I wanted the information in the first place. You simply cannot beat a data warehouse for speed and ease of access of information.

The Web environment, then, is a most promising environment. But in order to unlock the potential of the Web, information must be freely and cheaply available. The supporting infrastructure of the data warehouse provides that foundation and is at the heart of the effectiveness of the Web environment.

References

Inmon, W. H. - The Corporate Information Factory, 2nd edition, John Wiley, NY, NY, 2000

Inmon, W. H. - Building the Data Warehouse, 2nd edition, John Wiley, NY, NY, 1998

Inmon, W. H. - Building the Operational Data Store, 2nd edition, John Wiley, NY, NY, 1999

Inmon, W. H. - Exploration Warehousing, John Wiley, NY, NY, 2000

Website - www.BILLINMON.COM, a site containing useful information about architecture, data models, articles, presentations, white papers, near line storage, exploration warehousing, methodologies and other important topics


Chapter 3 - The Value of the Data Warehouse

The Foundations of E-Business

The basis for a long-term, sound e-business competitive advantage is the data warehouse.

Why the Internet?

Consider the Internet When you get down to it, what is the Internet good for? It is good for connectivity, and with connectivity comes opportunity - the opportunity to sell somebody something, to help someone, to get a message across But at the same time, connectivity is ALL the Internet provides In order to take advantage of that connectivity, the real competitive advantage is found in the content and presentation of the messages that are passed along the lines of connectivity

Consider the telephone Before the advent of the telephone, getting a message to someone was accomplished by mail or shouting Then when the telephone appeared, it was possible to have cheap and instant access to someone But merely making

a quick call becomes a trite act The important thing about making a telephone call quickly is what you say to the person, not the fact that you did it cheaply and quickly The message delivered over the phone becomes the essence, not the phone itself

With the phone you can:

ask your girlfriend out for Saturday night

tell the county you aren't available for jury duty

call in sick for work and go play golf

find out if it snowed in Aspen last night

call the doctor, and so forth

The real value of the phone is the communication of the message.

The same is true of the Internet. Today, people are enamored of the novelty of the ability to communicate instantaneously. But where commercial advantage is concerned, the real value of the Internet lies in the messages that are passed through cyberspace, not in the novelty of the passage itself.

Intelligent Messages

To give your messages sent via the Internet some punch, you need intelligence behind them. And the basis of that intelligence is the information that is buried in a data warehouse.

Why is the data warehouse the basis of business intelligence? Simple. With a data warehouse, you have two facets of information that have otherwise not been available: integration and history. In years past, application systems have been built in which each application considered only its own set of requirements. One application thought of a customer as one thing, another application thought of a customer as something else. There was no integration - no cohesive understanding of information - from one application to the next.

And the applications of yesterday paid no mind to history. The applications of yesterday looked only at what was happening right now. Ask a bank what your bank account balance is today and they can tell you. But ask them what your average balance has been over the past twelve months and they have no idea.

Integration, History and Versatility

The essence of data warehousing is integration and history. Integration is achieved by the messy task of going back into older legacy systems, pulling out data that was a by-product of transaction processing, and converting and integrating that data. Integrating old legacy data is a dirty, thankless task that nobody wants to undertake, but the rewards of integration are worth the time and effort. History is achieved by organizing and collecting the integrated data over time. Data is time-stamped and stored at the detailed level.
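As a minimal sketch of why time-stamped detail matters, the following Python fragment answers both the "balance today" question and the "average balance over the past twelve months" question from the same stored history. The snapshot dates, the balances, and the `average_balance` helper are hypothetical and chosen only for illustration; a real warehouse would hold far more detail.

```python
from datetime import date

# Hypothetical time-stamped balance snapshots for one account:
# (snapshot date, balance). All values are invented for illustration.
snapshots = [
    (date(2001, 12, 31), 500.0),
    (date(2002, 3, 31), 1000.0),
    (date(2002, 6, 30), 2000.0),
    (date(2002, 9, 30), 3000.0),
    (date(2002, 12, 31), 2500.0),
]


def average_balance(history, start, end):
    """Average of all snapshot balances dated within [start, end]."""
    in_range = [balance for day, balance in history if start <= day <= end]
    if not in_range:
        raise ValueError("no snapshots in the requested period")
    return sum(in_range) / len(in_range)


# An operational system can answer this: the most recent balance.
current_balance = max(snapshots)[1]

# Only stored history can answer this: the average over a past period.
average_2002 = average_balance(snapshots, date(2002, 1, 1), date(2002, 12, 31))

print(current_balance)  # 2500.0
print(average_2002)     # 2125.0
```

Without the older snapshots, only the first of the two questions could be answered.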

Once an organization has a carefully crafted collection of integrated detailed historical data, it is in a position of great strength. The first real value to the collection of data - a data warehouse - is the versatility of the data. The data can be organized a certain way on one day and another way the next. Marketing can look at customers by state or by month, Sales can look at sales transactions per day, and Accounting can look at closed business by country or by quarter - all from the same store of data. A top manager can walk in at 8:00 a.m. and decide that he or she wants to look at the world in a manner no one else has thought of, and the integrated, detailed historical data will allow that to happen. Done properly, the manager can have his or her report by 5:00 p.m. that same afternoon.
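The versatility described above can be sketched in a few lines of Python: one store of detailed, time-stamped records is summarized in several different ways simply by changing the grouping key. The records, the field names, and the `total_by` helper are assumptions made for this illustration and do not come from any particular product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detailed, time-stamped sales records (invented values).
sales = [
    {"day": date(2003, 1, 15), "state": "TX", "country": "US", "amount": 100.0},
    {"day": date(2003, 1, 15), "state": "NY", "country": "US", "amount": 250.0},
    {"day": date(2003, 2, 3), "state": "TX", "country": "US", "amount": 75.0},
    {"day": date(2003, 4, 20), "state": "ON", "country": "CA", "amount": 300.0},
]


def total_by(records, key_fn):
    """Sum the sale amounts, grouped by whatever key the caller chooses."""
    totals = defaultdict(float)
    for record in records:
        totals[key_fn(record)] += record["amount"]
    return dict(totals)


def quarter(day):
    """Calendar quarter (1 to 4) of a date."""
    return (day.month - 1) // 3 + 1


# The same detailed data, organized three different ways.
by_state = total_by(sales, lambda r: r["state"])
by_month = total_by(sales, lambda r: (r["day"].year, r["day"].month))
by_country_quarter = total_by(
    sales, lambda r: (r["country"], r["day"].year, quarter(r["day"]))
)

print(by_state)
print(by_month)
print(by_country_quarter)
```

Because the detail is kept, a new view requires only a new key function, not a new extract from the source systems.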

So the first tremendous business value that a data warehouse brings is the ability to look at data any way that is useful. But looking at data internally doesn't really have anything to do with e-business or the Internet. And the data warehouse has tremendous advantages there, too.

How do the Internet and the data warehouse work together to produce a business advantage? The Internet provides connectivity and the data warehouse produces continuity.

The Value of Historical Data

Consider the value of historical data when it comes to understanding a customer. When you have historical data about customers, you have the key to understanding their future behavior. Why? Because people are creatures of habit with predictable life patterns. The habits that we form early in our life stick with us throughout our life. The clothes we wear, the place we live, the food we eat, the cars we drive, how we pay our bills, how we invest, where we go on vacation - all of these features are set early in our adulthood. Understanding a customer's past history then becomes a tremendous predictor of the future.

Customers are subject to patterns. In our youth, most of us don't have much money to invest. But as we get older, we have more disposable income. At mid-life, our children start looking for colleges. At late mid-life, we start thinking about retirement.

In short, there are predictable patterns of behavior that practically everyone experiences. Knowing the history of your customer allows you to predict what the next pattern of behavior will be.

What happens when you can predict your customer's behavior? Basically, you're in a position to package products and tailor them to your customers. Having historical data that resides in a data warehouse lets you do exactly that. Through the Internet, you reach the customer. Then, the data warehouse tells you what to say to the customer to get his or her attention. The information in the data warehouse allows you to craft a message that your customer wants to hear.

Integrated Data

Integrated data has a related but different effect. Suppose you are a salesperson wanting to sell something (it really doesn't matter what). Your boss gives you a list and says go to it. Here's your list:

Now somebody suggests that you get a little integrated data. You don't know exactly what that is, but anything is better than beating your head against a wall. So now you have a list of very basic integrated data:

acct 123 - John Smith - male

acct 234 - Mary Jones - female

acct 345 - Tom Watson - male

acct 456 - Chris Ng - female

acct 567 - Pat Wilson - male

acct 678 - Sam Freed - female
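A list like the one above could be assembled by combining attributes held in separate applications. The Python sketch below assumes, purely for illustration, two hypothetical source extracts keyed by account number (`billing_names` and `profile_gender`) and a hypothetical `integrate` helper; the text does not say which systems the data actually came from.

```python
# Hypothetical extracts from two separate applications, both keyed by
# account number. The split into these two sources is an assumption.
billing_names = {123: "John Smith", 234: "Mary Jones", 345: "Tom Watson"}
profile_gender = {123: "male", 234: "female", 345: "male"}


def integrate(names, genders):
    """Combine per-account attributes into one record per account.

    An attribute missing from one source is left as None rather than guessed.
    """
    accounts = sorted(set(names) | set(genders))
    return [
        {"acct": acct, "name": names.get(acct), "gender": genders.get(acct)}
        for acct in accounts
    ]


integrated = integrate(billing_names, profile_gender)
for row in integrated:
    print(f"acct {row['acct']} - {row['name']} - {row['gender']}")
```

Each output line has the same shape as the entries in the list: one account, with what every source knows about it in one place.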

This simple integrated data makes your life as a salesperson a little simpler. You know not to sell bras to a male or cigars to a female (or at least not to most females). Your sales productivity

