The Data Warehousing eBusiness DBA Handbook
Donald K. Burleson, Joseph Hudicka
Craig Mullins, Fabian Pascal
By Donald K. Burleson, Joseph Hudicka, William H. Inmon, Craig Mullins, Fabian Pascal
Copyright © 2003 by BMC Software and DBAzine. Used with permission.
Printed in the United States of America
Series Editor: Donald K Burleson
Production Manager: John Lavender
Production Editor: Teri Wade
Cover Design: Bryan Hoff
Printing History:
August, 2003 for First Edition
Oracle, Oracle7, Oracle8, Oracle8i and Oracle9i are trademarks of Oracle Corporation.
Many of the designations used by computer vendors to distinguish their products are claimed as Trademarks. All names known to Rampant TechPress to be trademark names appear in this text as initial caps.
The information provided by the authors of this work is believed to be accurate and reliable, but because of the possibility of human error by our authors and staff, BMC Software, DBAZine and Rampant TechPress cannot guarantee the accuracy or completeness of any information included in this work and are not responsible for any errors, omissions or inaccurate results obtained from the use of information or scripts in this work.
Links to external sites are subject to change; DBAZine.com, BMC Software and Rampant TechPress do not control or endorse the content of these external web sites, and are not responsible for their content.
ISBN 0-9740716-2-5
Table of Contents
Conventions Used in this Book
About the Authors
Foreword
Chapter 1 - Data Warehousing and eBusiness
Making the Most of E-business by W. H. Inmon
Chapter 2 - The Benefits of Data Warehousing
The Data Warehouse Foundation by W. H. Inmon
References
Chapter 3 - The Value of the Data Warehouse
The Foundations of E-Business by W. H. Inmon
Why the Internet?
Intelligent Messages
Integration, History and Versatility
The Value of Historical Data
Integrated Data
Looking Smarter
Chapter 4 - The Role of the eDBA
Logic, e-Business, and the Procedural eDBA by Craig S. Mullins
The Classic Role of the DBA
The Trend of Storing Process With Data
Database Code Objects and e-Business
Database Code Object Programming Languages
The Duality of the DBA
The Role of the Procedural DBA
Synopsis
Chapter 5 - Building a Solid Information Architecture
How to Select the Optimal Information Exchange Architecture by Joseph Hudicka
Introduction
The Main Variables to Ponder
Data Volume
Available System Resources
Transformation Requirements
Frequency
Optimal Architecture Components
Conclusion
Chapter 6 - Data 101
Getting Down to Data Basics by Craig S. Mullins
Data Modeling and Database Design
Physical Database Design
The DBA Management Discipline
The 17 Skills Required of a DBA
Meeting the Demand
Chapter 7 - Designing Efficient Databases
Design and the eDBA by Craig S. Mullins
Living at Web Speed
Database Design Steps
Database Design Traps
Taming the Hostile Database
Chapter 8 - The eBusiness Infrastructure
E-Business and Infrastructure by W. H. Inmon
Chapter 9 - Conforming to Your Corporate Structure
Integrating Data in the Web-Based E-Business Environment by W. H. Inmon
Chapter 10 - Building Your Data Warehouse
The Issues of the E-Business Infrastructure by W. H. Inmon
Large Volumes of Data
Performance
Integration
Addressing the Issues
Chapter 11 - The Importance of Data Quality Strategy
Develop a Data Quality Strategy Before Implementing a Data Warehouse by Joseph Hudicka
Data Quality Problems in the Real World
Why Data Quality Problems Go Unresolved
Fraudulent Data Quality Problems
The Seriousness of Data Quality Problems
Data Collection
Solutions for Data Quality Issues
Option 1: Integrated Data Warehouse
Option 2: Value Rules
Option 3: Deferred Validation
Periodic sampling averts future disasters
Conclusion
Chapter 12 - Data Modeling and eBusiness
Data Modeling for the Data Warehouse by W. H. Inmon
"Just the Facts, Ma'am"
Modeling Atomic Data
Through Data Attributes, Many Classes of Subject Areas Are Accumulated
Other Possibilities - Generic Data Models
Design Continuity from One Iteration of Development to the Next
Chapter 13 - Don't Forget the Customer
Interacting with the Internet Viewer by W. H. Inmon
IN SUMMARY
Chapter 14 - Getting Smart
Elasticity and Pricing: Getting Smart by W. H. Inmon
Historically Speaking
At the Price Breaking Point
How Good Are the Numbers
How Elastic Is the Price
Conclusion
Chapter 15 - Tools of the Trade: Java
The eDBA and Java by Craig S. Mullins
What is Java?
Why is Java Important to an eDBA?
How can Java improve availability?
How Will Java Impact the Job of the eDBA?
Resistance is Futile
Conclusion
Chapter 16 - Tools of the Trade: XML
New Technologies of the eDBA: XML by Craig S. Mullins
What is XML?
Some Skepticism
Integrating XML
Defining the Future Web
Chapter 17 - Multivalue Database Technology Pros and Cons
MultiValue Lacks Value by Fabian Pascal
References
Chapter 18 - Securing your Data
Data Security Internals by Don Burleson
Traditional Oracle Security
Concerns About Role-based Security
Closing the Back Doors
Oracle Virtual Private Databases
Procedure Execution Security
Conclusion
Chapter 19 - Maintaining Efficiency
eDBA: Online Database Reorganization by Craig S. Mullins
Reorganizing Tablespaces
Online Reorganization
Synopsis
Chapter 20 - The Highly Available Database
The eDBA and Data Availability by Craig S. Mullins
The First Important Issue is Availability
What is Implied by e-vailability?
The Impact of Downtime on an e-business
Conclusion
Chapter 21 - eDatabase Recovery Strategy
The eDBA and Recovery by Craig S. Mullins
eDatabase Recovery Strategies
Recovery-To-Current
Point-in-Time Recovery
Transaction Recovery
Choosing the Optimum Recovery Strategy
Database Design
Reducing the Risk
Chapter 22 - Automating eDBA Tasks
Intelligent Automation of DBA Tasks by Craig S. Mullins
Duties of the DBA
A Lot of Effort
Intelligent Automation
Synopsis
Chapter 23 - Where to Turn for Help
Online Resources of the eDBA by Craig S. Mullins
Usenet Newsgroups
Mailing Lists
Websites and Portals
No eDBA Is an Island
Conventions Used in this Book
It is critical for any technical publication to follow rigorous standards and employ consistent punctuation conventions to make the text easy to read.
However, this is not an easy task. Within Oracle there are many types of notation that can confuse a reader. Some Oracle utilities such as STATSPACK and TKPROF are always spelled in CAPITAL letters, while Oracle parameters and procedures have varying naming conventions in the Oracle documentation.
It is also important to remember that many Oracle commands are case sensitive; they are always left in their original executable form and never altered with italics or capitalization.
Hence, all Rampant TechPress books follow these conventions:
Parameters - All Oracle parameters will be lowercase italics. Exceptions to this rule are parameter arguments that are commonly capitalized (KEEP pool, TKPROF); these will be left in ALL CAPS.
Variables - All PL/SQL program variables and arguments will also remain in lowercase italics (dbms_job, dbms_utility).
Tables & dictionary objects - All data dictionary objects are referenced in lowercase italics (dba_indexes, v$sql). This includes all v$ and x$ views (x$kcbcbh, v$parameter) and dictionary views (dba_tables, user_indexes).
SQL - All SQL is formatted for easy use in the code depot, and all SQL is displayed in lowercase. The main SQL terms (select, from, where, group by, order by, having) will always appear on a separate line.
Programs & Products - All products and programs that are known to the author are capitalized according to the vendor specifications (IBM, DBXray, etc). All names known by Rampant TechPress to be trademark names appear in this text as initial caps. References to UNIX are always made in uppercase.
About the Authors
Bill Inmon is universally recognized as the "father of the data warehouse." He has more than 26 years of database technology management experience and data warehouse design expertise, and has published 36 books and more than 350 articles in major computer journals. He is known globally for his seminars on developing data warehouses and has been a keynote speaker for many major computing associations. Inmon has consulted with a large number of Fortune 1000 clients, offering data warehouse design and database management services. For more information, visit www.BillInmon.com or call (303) 221-4000.
Joseph Hudicka is the founder of the Information Architecture Team, an organization that specializes in data quality, data migration, and ETL. Winner of the ODTUG Best Speaker award for the Spring 1999 conference, Joseph is an internationally recognized speaker at ODTUG, OOW, IOUG-A, TDWI and many local user groups. Joseph coauthored Oracle8 Design Using UML Object Modeling for Osborne/McGraw-Hill & Oracle Press, and has also written or contributed to several articles for publication in DMReview, Intelligent Enterprise and The Data Warehousing Institute (TDWI).
Craig S. Mullins is a director of technology planning for BMC Software. He has over 15 years of experience dealing with data and database technologies. He is the author of the book DB2 Developer's Guide (now available in a fourth edition that covers up to and includes the latest release of DB2 - Version 6) and is working on a book about database administration practices (to be published this year by Addison Wesley). Craig can be reached via his Website at www.craigsmullins.com or at craig_mullins@bmc.com.
Fabian Pascal has a national and international reputation as an independent technology analyst, consultant, author and lecturer specializing in data management. He was affiliated with Codd & Date and for 20 years held various analytical and management positions in the private and public sectors, has taught and lectured at the business and academic levels, and advised vendor and user organizations on data management technology, strategy and implementation. Clients include IBM, Census Bureau, CIA, Apple, Borland, Cognos, UCSF, and IRS. He is founder, editor and publisher of DATABASE DEBUNKINGS (http://www.dbdebunk.com/), a Web site dedicated to dispelling persistent fallacies, flaws, myths and misconceptions prevalent in the IT industry (Chris Date is a senior contributor). Author of three books, he has published extensively in most trade publications, including DM Review, Database Programming and Design, DBMS, Byte, Infoworld and Computerworld. He is author of the contrarian columns Against the Grain, Setting Matters Straight, and for The Journal of Conceptual Modeling. His third book, Practical Issues in Database Management, serves as text for his seminars.
Foreword
With the advent of cheap disk I/O subsystems, it is finally possible for database professionals to have databases store multiple billions and even multiple trillions of bytes of information. As the size of these databases increases to behemoth proportions, it is the challenge of the database professionals to understand the correct techniques for loading, maintaining, and extracting information from very large database management systems. The advent of cheap disks has also led to an explosion in business technology, where even the most modest financial investment can bring forth an online system with many billions of bytes. It is imperative that the business manager understand how to manage and control large volumes of information while at the same time providing the consumer with high-volume throughput and sub-second response time.
This book provides you with insight into how to build the foundation of your eBusiness application. You’ll learn the importance of the Data Warehouse in your daily operations. You’ll gain lots of insight into how to properly design and build your information architecture to handle the rapid growth that eCommerce business sees today. Once your system is up and running, it must be maintained. There is information in this text that goes through how to maintain online data systems to reduce downtime. Keeping your online data secure is another big issue with online business. To wrap things up, you’ll get links to some of the best online resources on Data Warehousing.
The purpose of this book is to give you significant insights into how you can manage and control large volumes of data. As the technology has expanded to support terabyte data capacity, the challenge to the database professionals is to understand effective techniques for the loading and maintaining of these very large database systems. This book brings together some of the world's foremost authors on data warehousing in order to provide you with the insights that you need to be successful in your data warehousing endeavors.
Chapter 1 - Data Warehousing and eBusiness
Making the Most of E-business
Everywhere you look today, you see e-business. In the trade journals. On TV. In the Wall Street Journal. Everywhere. And the message is that if your business is not e-business enabled, you will be behind the curve.
So what is all the fuss about? Behind the corporate push to get into e-business is a Web site. Or multiple Web sites. The Web site allows your corporation to have a reach into the marketplace that is direct and far reaching. Businesses that would never have entertained entry to foreign marketplaces and other marketplaces that are hard to access suddenly have easy and cheap presence. In a word, e-business opens up possibilities that previously were impractical or even impossible.
So the secret to e-business is a Web site. Right? Well, almost. Indeed, a Web site is a wonderful delivery mechanism. The Web site allows you to go where you might not have ever been able to go before. But after all is said and done, a Web site is merely a delivery mechanism. To be effective, the delivery mechanism must be allied with application of strong business propositions. There is a way of expressing this: opportunity = delivery mechanism + business proposition.
Figure 1: The web site is at the heart of e-Business
To illustrate the limitations of a Web site, consider the personal Web sites that many people have created. If there were any inherent business advantage to having a Web site, then these personal sites would be achieving business results for their owners. But no one thinks that just putting up a Web site produces results. It is what you do with the Web site that counts.
To exploit the delivery mechanism that is the Web environment, applications are necessary. There are many kinds of applications that can be adapted to the Web environment. But the most potent, most promising applications are a class called Customer Relationship Management (CRM) applications. CRM applications have the capability of producing very important business results. Executed properly, CRM applications:
protect market share
gain new market share
increase revenues
increase profits
And there's not a business around that doesn't want to do these things.
So what kind of applications are we talking about here? There are many different flavors. Typical CRM applications include:
yield management
credit scoring, and so forth
In short, there are many different ways that applications can be created to absolutely maximize the effectiveness of the Web. Stated differently, without these applications, the Web environment is just another Web site.
And there are other related non-CRM applications that can improve the bottom line of business as well. These applications include:
quality control
profitability analysis
destination analysis (for airlines)
purchasing consolidation, and the like
In short, once the Web is enabled by supporting applications, then very real business advantage occurs.
But applications do not just happen by themselves. Applications such as CRM and others are built on a foundation of data called a data warehouse. The data warehouse is at the center of an infrastructure called the "corporate information factory." Figure 2 shows the corporate information factory and the Web environment.
Figure 2: Sitting behind the web site is the infrastructure called the "corporate information factory"
Figure 2 shows that the Web environment serves as a conduit into the corporate information factory. The corporate information factory provides a variety of important functions for the Web environment:
the corporate information factory enables the Web environment to gather and manage an unlimited amount of data
the corporate information factory creates an environment where sweeping business patterns can be detected and analyzed
the corporate information factory provides a place where Web-based data can be integrated with other corporate data
the corporate information factory makes edited and integrated data quickly available to the Web environment, and so forth
In a word, the corporate information factory provides the background infrastructure that turns the Web from a delivery mechanism into a truly powerful tool. The different components of the corporate information factory are:
the data warehouse
the corporate ODS
integrated data
historical data
corporate data
A convenient way to think of the data warehouse is as a structure that contains very fine grains of sand. Different applications take those grains of sand and reshape them into the form and structure that is most familiar to the organization.
One of the issues that frequently arises with applications for the Web is whether it is necessary to have a data warehouse in support of the applications. Strictly speaking, it is not necessary to have a data warehouse in support of the applications that run on the Web. Figure 3 shows that different applications have been built from the legacy foundation.
Figure 3: Building applications without a data warehouse
In Figure 3, multiple applications have been built from the same supporting applications. Looking at Figure 3, it becomes clear that the same processing (accessing data, gathering data, editing data, cleansing data, merging data and integrating data) is done for every application. Almost all of the processing shown is redundant. There is no need for every application to repeat what every other application has done. Figure 4 shows that by building a data warehouse, the repetitive activities are done just once.
Figure 4: Building a data warehouse for the different applications
In Figure 4, the infrastructure activities of accessing data, gathering data, editing data, cleansing data, merging data and integrating data are done once. The savings are obvious. But there are some other powerful reasons why building a data warehouse makes sense:
when it comes time to build a new application, with a data warehouse in place the application can be constructed quickly; with no data warehouse in place, the infrastructure has to be built again
if there is a discrepancy in values, with a data warehouse those values can be resolved easily and quickly
the resources required for access of legacy data are minimal when there is a data warehouse; when there is no data warehouse, the resources required for the access of legacy data grow with each new application, and so forth
In short, when an organization takes a long-term perspective, the data warehouse at the center of the corporate information factory is the only way to fly.
It is intuitively obvious that a foundation of integrated historical granular data is useful for competitive advantage. But one step beyond intuition, the question must be asked: exactly how can integrated historical data be turned into competitive advantage? It is the purpose of the articles to follow to explain how integrated historical data can be turned into competitive advantage and how that competitive advantage can be delivered through the Web.
Chapter 2 - The Benefits of Data Warehousing
The Data Warehouse Foundation
The Web-based e-business environment has tremendous potential. The Web is a tremendously powerful medium for delivery of information. But there is nothing intrinsically powerful about the Web other than its ability to deliver information. In order for the Web-based e-business environment to deliver its full potential, the Web-based environment requires an infrastructure in support of its information processing needs. The infrastructure that best supports the Web is called the corporate information factory.
At the center of the corporate information factory is a data warehouse.
Fig 1 shows the basic infrastructure supporting the Web-based e-business environment.
Figure 1: the web environment and the supporting infrastructure
The heart of the corporate information factory is the data warehouse. The data warehouse is the place where corporate granular integrated historical data resides.
The data warehouse serves many functions, but the most important function it serves is that of making information available cheaply and quickly. Stated differently, without a data warehouse the cost of information goes sky high and the length of time required to get information is exceedingly long. If the Web-based e-business environment is to be successful, it is necessary to have information that is cheap to access and immediately available.
How does the data warehouse lower the cost of getting information? And how does the data warehouse greatly accelerate the speed with which information is available? These issues are not immediately obvious when looking at the structure of the corporate information factory.
In order to explain how the data warehouse accomplishes its important functions, consider the seemingly innocent request for information in a manufacturing environment where there is no data warehouse. A financial analyst wants to find out what corporate sales were for the last quarter. Is this a reasonable request for information? Absolutely. Now, what is required to get that information?
Figure 2: getting information from applications
Fig 2 shows that many different sources have to be accessed to get the desired information. Some of the data is in IMS; some is in VSAM. Yet other files are in ADABAS. The key structure of the European file is different from the key structure of the Asian file. The parts data uses different closing dates than the truck data. The body design for cars is called one thing in the cars file and another thing in the parts file. To get the required information takes lots of analysis, access to 10 programs and the ability to integrate the data. Moreover, it takes six months to deliver the information at a cost of $250,000.
These numbers are typical for a mid-sized to large corporation. In some cases these numbers are very much understated. But the real issue isn't the costs and length of time required for accessing data. The real issue is how many resources are needed for accessing many units of information.
Fig 3 shows that seven different types of information have been requested.
Figure 3: getting information from applications for seven different reports
The costs that were described for Fig 2 now are multiplied by seven (or whatever number of units of data are required). As the analyst is developing the procedures for getting the unit of information required, no thought is given to getting information for other units of information. Therefore each time a new piece of information is required, the process described in Fig 2 begins all over again. As a result, the cost of information spikes dramatically.
But suppose, for example, that this organization had a data warehouse. And suppose the organization had a request for seven units of information. What would it cost to get that information and how long would it take?
Fig 4 illustrates this scenario.
Figure 4: making a report from a data warehouse
Once the data warehouse is built, it can serve multiple requests for information. The granular integrated data that resides in the data warehouse is ideal for being shaped and reshaped. One analyst can look at the data one way; another analyst can look at the same data in yet another way. And you only have to create the infrastructure once. The financial analyst may spend 30 minutes tracking down a unit of data, such as consolidated sales. Or if the data is difficult to calculate it may take a day to get the job done. Depending on the complexity and how costs are calculated, it may cost from $100 to $1,000 to access the data. Compare that price range to what it might cost at an organization with no data warehouse, and it becomes obvious why a data warehouse makes data available quickly and cheaply.
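The arithmetic behind this comparison can be sketched in a few lines of Python. The per-request figures are the ones quoted in the text ($250,000 per unit of information without a data warehouse, $100 to $1,000 with one). The chapter gives no figure for the one-time cost of building the warehouse, so the sketch takes it as a parameter; the function names are illustrative only.

```python
# Figures quoted in the chapter.
COST_PER_REQUEST_NO_DW = 250_000            # dollars per unit of information, no warehouse
COST_PER_REQUEST_DW_LOW = 100               # dollars per unit once the warehouse exists
COST_PER_REQUEST_DW_HIGH = 1_000


def cost_without_warehouse(requests):
    """Every request repeats the access and integration work from scratch."""
    return requests * COST_PER_REQUEST_NO_DW


def cost_with_warehouse(requests, build_cost):
    """Return (low, high) totals: one-time build cost plus a small cost per request.

    build_cost is an assumption supplied by the caller; the text states no figure.
    """
    return (
        build_cost + requests * COST_PER_REQUEST_DW_LOW,
        build_cost + requests * COST_PER_REQUEST_DW_HIGH,
    )


def break_even_requests(build_cost):
    """Smallest number of requests at which the warehouse is strictly cheaper,
    using the pessimistic (high) per-request cost."""
    saving_per_request = COST_PER_REQUEST_NO_DW - COST_PER_REQUEST_DW_HIGH
    return build_cost // saving_per_request + 1


if __name__ == "__main__":
    print(cost_without_warehouse(7))              # the seven reports of Fig 3
    print(cost_with_warehouse(7, build_cost=0))   # per-request cost only
```

Under these figures, seven requests cost $1,750,000 without a warehouse, against $700 to $7,000 in per-request cost once a warehouse exists. For any assumed build cost, `break_even_requests` shows how few requests it takes before the one-time investment pays for itself.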
Of course the real difference between having a data warehouse and not having one lies in not having to build the infrastructure required for accessing the data. With a data warehouse, you build the infrastructure only once. With no data warehouse, you have to build at least part of the infrastructure every time you want new data.
In reality, however, no company goes looking for just one piece of data. In fact, it's quite the opposite - most companies require many forms of data. And the need for new forms and structures of data is recreated every day. When it comes to looking at the larger picture - not the cost of data for a single item, but the cost of data for all data - the data warehouse greatly eases the burden placed on the information systems organization. Fig 5 shows the difference between having a data warehouse and not having a data warehouse in the case of finding multiple types of data.
Figure 5: making seven reports from a data warehouse
Looking at Fig 5, it's obvious that a data warehouse really does lower the cost of getting information and greatly accelerates the rate at which data can be found.
But organizations have a habit of not looking at the big picture, preferring instead to focus on immediate needs. They look only up to next Tuesday and not an hour beyond it. What do short-sighted organizations see? The comparison between the data warehouse infrastructure and the need for a single unit of information. Fig 6 shows this comparison.
Figure 6: when all you are looking at is a single report it appears that it is less expensive to get it from applications directly and not build a data warehouse
When looking at the diagram in Fig 6, the short-term approach of not building a data warehouse is attractive. The organization thinks only of the quick fix. And in the very short term, it is less expensive just to dive in and get data from applications without building a data warehouse. There are a hundred excuses the corporation has for not looking to the long term:
The data warehouse is so big
We heard that data warehouses don't really work
All we need is some quick and dirty information
I don't have time to build a data warehouse
If I build a data warehouse and pay for it, one of my neighbors will use the data later on and they don't have to pay for it, and so forth
As long as a corporation insists on having nothing but a short-term focus, it will never build a data warehouse. But the minute the corporation takes a long-term look, the future becomes an entirely different picture. Fig 7 shows the long-term focus.
Figure 7: when you look at the larger picture you see that building a data warehouse saves huge amounts of resources
Fig 7 shows that when the long-term needs for information are considered, the data warehouse is far and away less expensive than the series of short-term efforts. And the length of time for access to information is an intangible whose worth is difficult to measure. No one disputes that information today, right now, is much more effective than information six months from now. In fact, six months from now I will have forgotten why I wanted the information in the first place. You simply cannot beat a data warehouse for speed and ease of access of information.
The Web environment, then, is a most promising environment. But in order to unlock the potential of the Web, information must be freely and cheaply available. The supporting infrastructure of the data warehouse provides that foundation and is at the heart of the effectiveness of the Web environment.
References
Inmon, W. H. - The Corporate Information Factory, 2nd edition, John Wiley, NY, NY 2000
Inmon, W. H. - Building the Data Warehouse, 2nd edition, John Wiley, NY, NY 1998
Inmon, W. H. - Building the Operational Data Store, 2nd edition, John Wiley, NY, NY 1999
Inmon, W. H. - Exploration Warehousing, John Wiley, NY, NY 2000
Website - www.BILLINMON.COM, a site containing useful information about architecture, data models, articles, presentations, white papers, near line storage, exploration warehousing, methodologies and other important topics
Chapter 3 - The Value of the Data Warehouse
The Foundations of E-Business
The basis for a long-term, sound e-business competitive advantage is the data warehouse.
Why the Internet?
Consider the Internet. When you get down to it, what is the Internet good for? It is good for connectivity, and with connectivity comes opportunity - the opportunity to sell somebody something, to help someone, to get a message across. But at the same time, connectivity is ALL the Internet provides. In order to take advantage of that connectivity, the real competitive advantage is found in the content and presentation of the messages that are passed along the lines of connectivity.
Consider the telephone. Before the advent of the telephone, getting a message to someone was accomplished by mail or shouting. Then when the telephone appeared, it was possible to have cheap and instant access to someone. But merely making a quick call becomes a trite act. The important thing about making a telephone call quickly is what you say to the person, not the fact that you did it cheaply and quickly. The message delivered over the phone becomes the essence, not the phone itself.
With the phone you can:
ask your girlfriend out for Saturday night
tell the county you aren't available for jury duty
call in sick for work and go play golf
find out if it had snowed in Aspen last night
call the doctor, and so forth
The real value of the phone is the communication of the message.
The same is true of the Internet. Today, people are enamored of the novelty of the ability to communicate instantaneously. But where commercial advantage is concerned, the real value of the Internet lies in the messages that are passed through cyberspace, not in the novelty of the passage itself.
Intelligent Messages
To give your messages sent via the Internet some punch, you need intelligence behind them And the basis of that intelligence is the information that is buried in a data warehouse
Why is the data warehouse the basis of business intelligence? Simple. With a data warehouse, you have two facets of information that have otherwise not been available: integration and history. In years past, application systems have been built in which each application considered only its own set of requirements. One application thought of a customer as one thing, another application thought of a customer as something else. There was no integration - no cohesive understanding of information - from one application to the next.
And the applications of yesterday paid no mind to history. The applications of yesterday looked only at what was happening right now. Ask a bank what your bank account balance is today and they can tell you. But ask them what your average balance has been over the past twelve months and they have no idea.
Integration, History and Versatility
The essence of data warehousing is integration and history. Integration is achieved by the messy task of going back into older legacy systems, pulling out data that was a by-product of transaction processing, and converting and integrating that data. Integrating old legacy data is a dirty, thankless task that nobody wants to undertake, but the rewards of integration are worth the time and effort. Historical data is accumulated by collecting and organizing the integrated data over time. Data is time-stamped and stored at the detailed level.
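As a minimal sketch of why time-stamped detail matters, consider the bank-balance question raised earlier: it can be answered only if each snapshot is kept rather than overwritten. The sample figures, the month-end snapshot granularity, and the names `balance_history`, `current_balance`, and `average_balance` are all assumptions made for illustration and do not come from any particular warehouse product.

```python
from datetime import date

# Hypothetical month-end balance snapshots for one account.
# Each row keeps its own time stamp instead of overwriting the previous one.
balance_history = [
    (date(2002, 1, 31), 1000.0),
    (date(2002, 2, 28), 1100.0),
    (date(2002, 3, 31), 1200.0),
    (date(2002, 4, 30), 1300.0),
    (date(2002, 5, 31), 1400.0),
    (date(2002, 6, 30), 1500.0),
    (date(2002, 7, 31), 1600.0),
    (date(2002, 8, 31), 1700.0),
    (date(2002, 9, 30), 1800.0),
    (date(2002, 10, 31), 1900.0),
    (date(2002, 11, 30), 2000.0),
    (date(2002, 12, 31), 2100.0),
]


def current_balance(history):
    """What an operational application can answer: the latest balance only."""
    latest_stamp, latest_balance = max(history, key=lambda row: row[0])
    return latest_balance


def average_balance(history, start, end):
    """What needs history: the mean of all snapshots stamped within [start, end]."""
    in_range = [balance for stamp, balance in history if start <= stamp <= end]
    if not in_range:
        raise ValueError("no snapshots in the requested period")
    return sum(in_range) / len(in_range)


today = current_balance(balance_history)
twelve_month_average = average_balance(
    balance_history, date(2002, 1, 1), date(2002, 12, 31)
)
```

An application that stores only the latest row can compute `today` but has nothing from which to compute `twelve_month_average`; that second number exists only because the earlier snapshots were retained.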
Once an organization has a carefully crafted collection of integrated detailed historical data, it is in a position of great strength. The first real value to the collection of data - a data warehouse - is the versatility of the data. The data can be organized a certain way on one day and another way the next. Marketing can look at customers by state or by month, Sales can look at sales transactions per day, and Accounting can look at closed business by country or by quarter - all from the same store of data. A top manager can walk in at 8:00 a.m. and decide that he or she wants to look at the world in a manner no one else has thought of, and the integrated, detailed historical data will allow that to happen. If the warehouse has been built properly, the manager can have his or her report by 5:00 p.m. that same afternoon.
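The versatility described above can be sketched in a few lines: one store of detailed, time-stamped rows is regrouped along whatever dimension a department asks for. The rows, field names, and helper functions below (`total_by`, `by_state`, `by_month`, `by_day`, `by_country_and_quarter`) are hypothetical and serve only as an illustration under those assumptions.

```python
from collections import defaultdict

# Hypothetical detail rows; a real warehouse would hold many more attributes.
sales = [
    {"date": "2003-01-15", "state": "CO", "country": "US", "amount": 250.0},
    {"date": "2003-01-15", "state": "TX", "country": "US", "amount": 100.0},
    {"date": "2003-02-03", "state": "CO", "country": "US", "amount": 400.0},
    {"date": "2003-04-20", "state": "ON", "country": "CA", "amount": 300.0},
]


def total_by(rows, key):
    """Sum the amount of every row, grouped by whatever key(row) returns."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row["amount"]
    return dict(totals)


def by_state(row):
    return row["state"]


def by_month(row):
    return row["date"][:7]  # "YYYY-MM"


def by_day(row):
    return row["date"]


def by_country_and_quarter(row):
    year, month = row["date"][:4], int(row["date"][5:7])
    return (row["country"], f"{year}-Q{(month - 1) // 3 + 1}")


# Three departments, three views, one store of detail.
marketing_view = total_by(sales, by_state)
sales_view = total_by(sales, by_day)
accounting_view = total_by(sales, by_country_and_quarter)
```

Because the rows are kept at the detailed level, a view nobody anticipated only requires a new key function; data that had already been summarized by state could not be regrouped by quarter after the fact.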
So the first tremendous business value that a data warehouse brings is the ability to look at data any way that is useful. But looking at data internally doesn't really have anything to do with e-business or the Internet. Yet the data warehouse has tremendous advantages there as well.
How do the Internet and the data warehouse work together to produce a business advantage? The Internet provides connectivity and the data warehouse produces continuity.
The Value of Historical Data
Consider the value of historical data when it comes to understanding a customer. When you have historical data about customers, you have the key to understanding their future behavior. Why? Because people are creatures of habit with predictable life patterns. The habits that we form early in our life stick with us throughout our life. The clothes we wear, the place we live, the food we eat, the cars we drive, how we pay our bills, how we invest, where we go on vacation - all of these features are set early in our adulthood. Understanding a customer's past history then becomes a tremendous predictor of the future.
Customers are subject to patterns. In our youth, most of us don't have much money to invest. But as we get older, we have more disposable income. At mid-life, our children start looking for colleges. At late mid-life, we start thinking about retirement. In short, there are predictable patterns of behavior that practically everyone experiences. Knowing the history of your customer allows you to predict what the next pattern of behavior will be.
What happens when you can predict your customer's behavior? Basically, you're in a position to package products and tailor them to your customers. Having historical data that resides in a data warehouse lets you do exactly that. Through the Internet, you reach the customer. Then, the data warehouse tells you what to say to the customer to get his or her attention. The information in the data warehouse allows you to craft a message that your customer wants to hear.
Integrated Data
Integrated data has a related but different effect. Suppose you are a salesperson wanting to sell something (it really doesn't matter what). Your boss gives you a list and says go to it. Here's your list:
Now somebody suggests that you get a little integrated data. You don't know exactly what that is, but anything is better than beating your head against a wall. So now you have a list of very basic integrated data:
acct 123 - John Smith - male
acct 234 - Mary Jones - female
acct 345 - Tom Watson - male
acct 456 - Chris Ng - female
acct 567 - Pat Wilson - male
acct 678 - Sam Freed - female
This simple integrated data makes your life as a salesperson a little simpler. You know not to sell bras to a male or cigars to a female (or at least not to most females). Your sales productivity