Data Warehouse Architecture and Models
Page 1: Lesson 4
Page 2
Explain the features of each type of data by examining where and why it is used
List the data models that may already exist in a company and describe where they may be useful to the warehouse model
Explain the two common data warehouse models
Page 3: Warehouse Architectures
Enterprise-wide solution
Data mart solution
Combined solution
Page 4: Enterprise Data Warehouse Solution
Page 5: Data Mart Solution
Independent
Use a consistent approach
Avoid disjointed development
Consider the big picture
Dependent
Localization
Subsets of summary data
Oracle Data Mart Suite for a pre-configured solution
Funded departmentally
Page 6: Create a Project Team
Staff with experts
Page 7: Identify the Data Requirements
Successful warehouses provide the right information
Analyze business users’ needs
Interview users
Examine data needed
Ascertain data availability
Determine data frequency
Decide the refresh cycle
Page 8: Types of Warehouse Data
Fact data - Measures
Dimension data - Query drivers
Summary data - Pre-calculated data
Page 9: Fact Data and Tables
Many fact tables in the warehouse
The bulk of the warehouse data
Measures (units sold, sales figures, calls)
Millions of rows
Multi-part primary keys
Summaries
Normalized data
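As a minimal sketch of the "multi-part primary keys" point above, the following uses SQLite through Python's standard sqlite3 module rather than an Oracle server. The table name, column names, and sample values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical fact table: names and values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales_fact (
        item_id       INTEGER NOT NULL,
        store_id      INTEGER NOT NULL,
        week_id       INTEGER NOT NULL,
        sales_dollars REAL    NOT NULL,   -- measure
        sales_units   INTEGER NOT NULL,   -- measure
        PRIMARY KEY (item_id, store_id, week_id)  -- multi-part key
    )
    """
)

conn.execute("INSERT INTO sales_fact VALUES (1, 10, 199701, 250.0, 5)")
conn.execute("INSERT INTO sales_fact VALUES (1, 11, 199701, 100.0, 2)")

# Inserting the same key combination a second time violates the primary key.
try:
    conn.execute("INSERT INTO sales_fact VALUES (1, 10, 199701, 99.0, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
```

Each part of the key would normally point at one dimension, so a row is identified by the combination of dimension values it measures.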
Page 12: Vertical Partitioning
Vertical partitioning by column
[Figure: table columns Col A, Col B, Col C, Col D]
• Oracle servers support both types of partitioning
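Vertical partitioning by column can be sketched in plain Python as follows. This is only an illustration of the idea, not how an Oracle server implements it. The row contents, the "id" key, and the helper names partition_by_column and rejoin are assumptions made for the example.

```python
# Hypothetical rows with a key column and four data columns.
rows = [
    {"id": 1, "col_a": "a1", "col_b": "b1", "col_c": "c1", "col_d": "d1"},
    {"id": 2, "col_a": "a2", "col_b": "b2", "col_c": "c2", "col_d": "d2"},
]


def partition_by_column(rows, key, column_groups):
    """Split every row into one narrower row per column group.

    The key column is copied into each part so the parts can be rejoined.
    """
    parts = [[] for _ in column_groups]
    for row in rows:
        for part, columns in zip(parts, column_groups):
            part.append({key: row[key], **{c: row[c] for c in columns}})
    return parts


def rejoin(left, right, key):
    """Rebuild the full rows by matching the two parts on the key column."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left]


# Columns A and B go in one partition, columns C and D in the other.
part_ab, part_cd = partition_by_column(
    rows, "id", [["col_a", "col_b"], ["col_c", "col_d"]]
)
restored = rejoin(part_ab, part_cd, "id")
```

A query that needs only Col A and Col B can then read the narrower partition and skip the remaining columns entirely.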
Page 13: Dimension Data
[Figure: query path across Customer, Location, Product, Time, and Sales]
• Provides query criteria
• Links to fact tables with keys
Page 14: Dimension Tables
Are determined by user requirements
Vary in number
May contain hierarchies
May share fact tables
Page 15: Time Dimension
Important required dimension
Contains special dates
Provides flexible and accurate analysis by time
[Figure labels: Customer, Office, Product, Time]
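One way a time dimension with special dates might be generated is sketched below. The column names (date_id, quarter, is_weekend, is_holiday) and the single sample holiday are assumptions for illustration; a real warehouse would choose attributes from its own business calendar.

```python
from datetime import date, timedelta


def build_time_dimension(start, end, holidays=frozenset()):
    """Return one row per calendar day between start and end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append(
            {
                "date_id": day.isoformat(),
                "year": day.year,
                "quarter": (day.month - 1) // 3 + 1,
                "month": day.month,
                "iso_week": day.isocalendar()[1],
                "is_weekend": day.weekday() >= 5,  # Saturday or Sunday
                "is_holiday": day in holidays,     # special dates
            }
        )
        day += timedelta(days=1)
    return rows


# Hypothetical range and holiday list.
time_dim = build_time_dimension(
    date(1997, 1, 1), date(1997, 3, 31), holidays={date(1997, 1, 1)}
)
```

Storing these attributes once per day lets queries filter or group by week, quarter, or holiday without recomputing calendar logic against the fact rows.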
Page 17: Summary Tables
Provide immediate answers to a query
Improve query performance
Requirements change and need managing
SALES FACTS
Month    Region   Sales$
Jan 97   East     10,000
Feb 97   South    40,000
Mar 97   West     17,000

SALES BY MONTH
Month    Tot_Sales
Jan 97   51,000
Feb 97   40,000
Mar 97   17,000
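The relationship between a fact table and a monthly summary table can be sketched as a pre-calculated aggregation. The rows and figures below are hypothetical and are not the values from the slide; summarize_by_month is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, region, sales).
sales_facts = [
    ("Jan 97", "East", 100),
    ("Jan 97", "West", 250),
    ("Feb 97", "South", 400),
    ("Mar 97", "West", 170),
]


def summarize_by_month(facts):
    """Pre-calculate total sales per month, dropping the region detail."""
    totals = defaultdict(int)
    for month, _region, sales in facts:
        totals[month] += sales
    return dict(totals)


sales_by_month = summarize_by_month(sales_facts)
```

A query for monthly totals can then read the small summary instead of scanning every fact row, at the cost of refreshing the summary whenever the facts or the reporting requirements change.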
Page 18
Define the model
Use graphical modeling tools
Use a tool capable of prototyping
Proof of concept essential
New design techniques
Page 19: The Enterprise Model
Defines an overall scope
A good starting point for analysis
An information framework for the warehouse
Page 20: The Corporate Data Model
Current operational data structures
Source of mapping rules for the warehouse
Marketing
Page 21: A Typical Modeling Approach
1. Analyze the subject area.
2. Gather the requirements.
3. Develop the models.
4. Map the entities, attributes, and relationships.
5. Re-engineer the source data.
6. Design the database.
7. Integrate the model into the warehouse architecture repository.
8. Review with client and revise.
iterative approach.
Page 22
Sales Fact Table: Item_id, Store_id, Sales_dollars, Sales_units
Store Table: Store_id, District_id
Item Table: Item_id, Dept_id
Time Table: Week_id, Period_id
Totsales Summary: Month_id, Store_id, Item_id, Total_dollars
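Reading the tables above as a fact table surrounded by dimension tables (an assumption, since the slide title did not survive), a query that joins the fact table directly to each dimension might look like the sketch below. It uses SQLite through Python's sqlite3 module rather than an Oracle server, and every data value is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE store (store_id INTEGER PRIMARY KEY, district_id INTEGER);
    CREATE TABLE item  (item_id  INTEGER PRIMARY KEY, dept_id     INTEGER);
    CREATE TABLE sales_fact (
        item_id       INTEGER REFERENCES item(item_id),
        store_id      INTEGER REFERENCES store(store_id),
        sales_dollars REAL,
        sales_units   INTEGER
    );
    -- Hypothetical sample data.
    INSERT INTO store VALUES (1, 100), (2, 100), (3, 200);
    INSERT INTO item  VALUES (10, 7), (11, 8);
    INSERT INTO sales_fact VALUES
        (10, 1, 50.0, 5), (10, 2, 30.0, 3), (11, 3, 20.0, 1), (11, 1, 40.0, 2);
    """
)

# Each dimension is one join away from the fact table.
star_query = """
    SELECT s.district_id, i.dept_id, SUM(f.sales_dollars) AS total_dollars
    FROM sales_fact AS f
    JOIN store AS s ON s.store_id = f.store_id
    JOIN item  AS i ON i.item_id  = f.item_id
    GROUP BY s.district_id, i.dept_id
    ORDER BY s.district_id, i.dept_id
"""
totals = conn.execute(star_query).fetchall()
```

The dimension columns (district, department) act as the query criteria, while the measure comes only from the fact table.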
Page 23: Snowflake Model
Sales Fact Table: Item_id, Store_id, Sales_dollars, Sales_units
Store Table: Store_id, Store_desc, District_id
Item Table: Item_id, Item_desc, Dept_id
Totsales: Month_id, Store_id, Item_id, Total_dollars
District Table: District_id, District_desc
Dept Table: Dept_id, Dept_desc
Mgr Table: Dept_id, Mgr_id
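In the snowflake layout above, the district description lives in its own table, so a query that reports by district name needs one extra join through the Store table. The following is a minimal sketch using SQLite through Python's sqlite3 module. The table and column names follow the slide, but all data values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE district (
        district_id   INTEGER PRIMARY KEY,
        district_desc TEXT
    );
    CREATE TABLE store (
        store_id    INTEGER PRIMARY KEY,
        store_desc  TEXT,
        district_id INTEGER REFERENCES district(district_id)
    );
    CREATE TABLE sales_fact (
        item_id       INTEGER,
        store_id      INTEGER REFERENCES store(store_id),
        sales_dollars REAL,
        sales_units   INTEGER
    );
    -- Hypothetical sample data.
    INSERT INTO district VALUES (100, 'North'), (200, 'South');
    INSERT INTO store VALUES
        (1, 'Store A', 100), (2, 'Store B', 100), (3, 'Store C', 200);
    INSERT INTO sales_fact VALUES
        (10, 1, 50.0, 5), (10, 2, 30.0, 3), (11, 3, 20.0, 1);
    """
)

# Fact -> Store -> District: the normalized dimension adds a second hop.
snowflake_query = """
    SELECT d.district_desc, SUM(f.sales_units) AS total_units
    FROM sales_fact AS f
    JOIN store    AS s ON s.store_id    = f.store_id
    JOIN district AS d ON d.district_id = s.district_id
    GROUP BY d.district_desc
    ORDER BY d.district_desc
"""
units_by_district = conn.execute(snowflake_query).fetchall()
```

The extra hop is the trade-off of normalizing a dimension: descriptions are stored once per district, but queries join through more tables.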
Page 24: Snowflake Model
A normalized star schema
Easier to model requirements
Flexible dimension structures
Readily maps to existing data
Used directly by tools
Database servers
Star queries
Star joins
VLDB support
Page 25: Which Model Do You Use?
Design for simplicity
Design for relevance
Simple star model
Flexible snowflake model