Contents
Overview
Lab A: Creating a Partition in the Sales Cube
purpose, without the express written permission of Microsoft Corporation. If, however, your only means of access is electronic, permission to print one copy is hereby granted.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
© 2000 Microsoft Corporation. All rights reserved.
Microsoft, BackOffice, MS-DOS, Windows, and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the U.S.A. and/or other countries.
The names of companies, products, people, characters, and/or data mentioned herein are fictitious and are in no way intended to represent any real individual, company, product, or event, unless otherwise noted.
Other product and company names mentioned herein may be the trademarks of their respective owners.
BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
Instructor Notes
For enterprise-scale online analytical processing (OLAP) cubes developed in Microsoft® SQL Server™ 2000 Analysis Services, partitioning can improve both processing and query performance. In this module, students learn how to create partitions, how to define slices and filters, and how partitions improve the scalability of cubes.
After completing this module, students will be able to:
• Explain the benefits of partitioning.
• Describe the mechanics of the Partition Wizard.
• Explain when to define slices and when to define filters.
• Describe the purpose and mechanics of merging partitions.
Materials and Preparation
This section lists the required materials and preparation tasks that you need to teach this module.
Required Materials
To teach this module, you need the following materials:
• Microsoft PowerPoint® file 2074A_10.ppt
Preparation Tasks
To prepare for this module, you should:
• Read all the student materials.
• Read the instructor notes and margin notes.
• Practice the lecture presentation and demonstration.
• Complete the labs.
• Review the Trainer Preparation presentation for this module on the Trainer Materials compact disc.
• Review any relevant white papers that are located on the Trainer Materials compact disc.
Presentation: 30 Minutes
Labs: 30 Minutes
Other Activities
Difficult Questions
Below are difficult questions that students may ask you during the delivery of this module, along with answers to the questions. These materials delve into subjects that are within the scope of the module but are not specifically addressed in the content of the student notes.
1. After defining multiple partitions, there are empty spaces in the cube even though there is data in the fact tables to support the cells. What causes this?
   Incomplete partitions can cause missing data. Incomplete partitions result when a partition is misdefined (perhaps by using a member from a level that is too low for a slice) or not defined at all. Be careful: if a partition is misdefined and another partition is added to fix the problem, duplicate data can result.
2. What is the best way to split a partition into two or more different partitions?
   There is no direct way to do this. To split up an existing partition, you must redefine the partition on a smaller slice or modify the WHERE clause, and then define new partitions for the remaining data.
3. If a different fact table than the one defined for the partition is used for an incremental load of the partition, what happens when the cube is refreshed or processed in the future?
   The incremental data will not be included. Analysis Services refers to only one fact table per cube partition.
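The missing-data and duplicate-data problems described in question 1 can be illustrated with a small coverage check. The following is a minimal sketch and not part of Analysis Services: it assumes, purely for illustration, that each partition is described by an inclusive range of years, and the names `PartitionRange` and `check_coverage` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionRange:
    """Hypothetical description of a partition by the years it covers (inclusive)."""
    name: str
    first_year: int
    last_year: int


def check_coverage(partitions, fact_years):
    """Return (missing, duplicated) years relative to the years present in the fact data."""
    counts = {year: 0 for year in fact_years}
    for p in partitions:
        for year in range(p.first_year, p.last_year + 1):
            if year in counts:
                counts[year] += 1
    missing = sorted(y for y, n in counts.items() if n == 0)
    duplicated = sorted(y for y, n in counts.items() if n > 1)
    return missing, duplicated


# Assumed example: 1998 falls in no partition, and 1999 falls in two.
partitions = [
    PartitionRange("History", 1995, 1997),
    PartitionRange("Prior Year", 1999, 1999),
    PartitionRange("Current Year", 1999, 2000),
]
missing, duplicated = check_coverage(partitions, range(1995, 2001))
print("missing:", missing)
print("duplicated:", duplicated)
```

A year reported as missing corresponds to empty cells in the cube, and a year reported as duplicated corresponds to values that are counted twice.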
Module Strategy
Use the following strategy to present this module:
• Introducing Partitions
  Start with an explanation of why partitions are used. Emphasize the fact that partitions are transparent: users and front-end applications see only cubes. Explain that the partitions in a cube may have different storage modes, aggregation designs, and physical locations. Define remote partitions.
• Creating Partitions
  Explain each step involved in creating a partition: choosing the fact table, defining a data slice, assigning the partition location, and completing the partition. Finish by describing how to access commands in Analysis Manager.
• Using Advanced Settings
  Introduce students to the Advanced settings dialog box. Describe the settings available (specifying filters, enabling drillthrough options, and setting the aggregation prefix) and explain when to use each.
• Merging Partitions
  Explain to students why merging partitions can be beneficial. Use the Current Year/Prior Year/History partition example. Describe the steps involved in merging partitions. Emphasize the fact that, to be merged, two partitions must have the same storage mode and aggregation design.
After completing this module, you will be able to:
• Explain the benefits of partitioning.
• Describe the mechanics of the Partition Wizard.
• Explain when to define slices and when to define filters.
• Describe the purpose and mechanics of merging partitions.
In this module, you will learn about partitions and their use in OLAP cubes.
Topic Objective
To introduce the concept of partitions.
Lead-in
In this section, you will learn about partition architecture and design and the use of partitions in OLAP cubes.
Basic Architecture
• Act As Physical Storage Mediums for Cube Data
• Improve Cube Performance
  ◦ Cube processing performance
  ◦ Query performance
• Are Transparent to Users
• Require SQL Server 2000 Enterprise Edition
  ◦ Analysis Server and Analysis Manager Computers
Partitions are the physical storage mediums for cube data. A cube may have one or more partitions. Each partition may have a different storage mode and a different aggregation design. In addition, each partition may be located on a different server. All cubes initially have a single default partition.
When you design aggregations for a one-partition cube, you are actually designing aggregations for the partition, not for the cube. When you process a single-partition cube, you are likewise processing the partition, not the cube. If a cube contains more than one partition, attempting to design storage for the cube opens a dialog box that requires you to select a single partition for which to design aggregations.
You create partitions to improve cube processing and query performance, which increases the scalability of a cube. Partitions are processed either as part of a full cube process or independently of other cube partitions. By processing partitions independently, you isolate processing to a subset of cube data and therefore reduce the processing time. In addition, queries can focus on a single partition and retrieve data faster because they access a smaller data set.
Partitions are transparent. Users and front-end applications see only cubes; that is, they query a cube and not a partition. The cube reflects the combined data contained in all its partitions.
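The ideas of transparency and independent processing can be sketched in a few lines. This is a conceptual illustration only, not how Analysis Services stores data: the dictionary of rows, the sales figures, and the functions `process_partition` and `query_cube` are all hypothetical.

```python
# Hypothetical in-memory stand-ins for the partitions of one Sales cube.
# Each partition holds its own rows; the cube is the union of all of them.
partitions = {
    "History": [{"year": 1997, "sales": 80.0}, {"year": 1998, "sales": 120.0}],
    "Prior Year": [{"year": 1999, "sales": 200.0}],
    "Current Year": [{"year": 2000, "sales": 250.0}],
}


def process_partition(partitions, name, new_rows):
    """Reload one partition without touching the others."""
    partitions[name] = list(new_rows)


def query_cube(partitions, year=None):
    """Total sales for the cube; the caller never names a partition."""
    return sum(
        row["sales"]
        for rows in partitions.values()
        for row in rows
        if year is None or row["year"] == year
    )


total_before = query_cube(partitions)
# Only the Current Year partition is reprocessed; History and Prior Year are untouched.
process_partition(partitions, "Current Year", [{"year": 2000, "sales": 300.0}])
total_after = query_cube(partitions)
print(total_before, total_after)
```

The caller of `query_cube` sees one combined result, which mirrors the statement that users query a cube and not a partition.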
To create multiple partitions in a cube, you must have the Enterprise Edition of SQL Server 2000 installed on the Analysis Server and on any computers administering the server. To install the Enterprise Edition of SQL Server 2000, the computer requires one of the following operating systems:
• Microsoft Windows NT® Server 4.0 with Service Pack 5
• Windows NT Server Enterprise Edition 4.0 with Service Pack 5
• Microsoft Windows 2000 Advanced Server
• Windows 2000 Datacenter Server
Partitions are the physical storage mediums for cube data.
Partitioning Design
[Illustration: one cube with three partitions. Current Year: MOLAP, 35% agg. Prior Year: MOLAP, 10% agg. History: ROLAP, 0% agg.]
Partitions separate cube data into discrete storage areas. Each partition in a cube may have different:
• Storage modes: multidimensional OLAP (MOLAP), relational OLAP (ROLAP), or hybrid OLAP (HOLAP).
• Source fact tables: Salesfact2000 for one partition and Salesfact2001 for another.
• Aggregation designs: 35 percent aggregation in one MOLAP partition versus 10 percent aggregation in a second MOLAP partition versus 0 percent aggregation in a third ROLAP partition.
• Storage locations: Server 1 in Pittsburgh, Server 2 in San Francisco, and so forth.
Because of these factors, partitioning is the principal feature in Analysis Services for increasing cube scalability and designing storage to reflect user access and response time needs.
In the preceding illustration, three partitions exist in one cube. The data in the partitions, representing the entire accounting history for a company, is organized by current year, prior year, and historical data.
The aggregation design for each of the partitions reflects specific user access needs and storage considerations. The partitions are designed as follows:
• Current Year
  The current year partition design contains the highest aggregation percentage and uses the MOLAP storage mode, the fastest storage method. The design reflects the high number of users, the high frequency of access by each user, and the presumed performance requirements for the reporting and analysis of current year data. To further enhance response times, the current year partition might also reside on a more robust server with large amounts of storage and memory and a fast processor.
• Prior Year
  The prior year partition design also uses MOLAP storage, but with a lower aggregation level, thus conserving storage while sacrificing reporting performance. Users access prior year data less frequently than they access current year data, and data extraction is typically used for printed reports rather than ad hoc analysis, so response times are less important.
• History
  The history design uses ROLAP with zero aggregations and is the slowest of the three partitions in reporting performance, which reflects the low frequency of access by users for this type of data. While reporting is slow, ROLAP with no aggregations is the most storage-efficient design, which reflects the need for economical storage of large amounts of historical data.
In summary, partitioning is an important tool for enhancing overall system performance. Partitioning allows you to balance data load performance, run-time reporting performance, and storage needs against user needs, production cycle times, and hardware availability.
Topic Objective
To describe the use of different aggregation and storage designs in partitions.
Lead-in
Partitions separate cube data into discrete storage areas. Each partition in a cube may have different storage modes, aggregation designs, and physical locations.
Delivery Tip
Point out that agg stands for aggregation in the preceding illustration.
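The three-partition design discussed in this topic can be written down as plain data, which makes the trade-offs easy to compare side by side. The following is only an illustrative sketch: the class names and the server names are hypothetical, the percentages are the ones shown in the illustration labels, and sorting by aggregation percentage is used here as a rough proxy for expected reporting speed rather than a measured result.

```python
from dataclasses import dataclass
from enum import Enum


class StorageMode(Enum):
    MOLAP = "MOLAP"
    ROLAP = "ROLAP"
    HOLAP = "HOLAP"


@dataclass(frozen=True)
class PartitionDesign:
    name: str
    storage_mode: StorageMode
    aggregation_percent: int
    server: str  # hypothetical server name, for illustration only


design = [
    PartitionDesign("Current Year", StorageMode.MOLAP, 35, "server1"),
    PartitionDesign("Prior Year", StorageMode.MOLAP, 10, "server1"),
    PartitionDesign("History", StorageMode.ROLAP, 0, "server2"),
]

# Order the partitions from most to least aggregated (a rough proxy for query speed).
by_speed = sorted(design, key=lambda p: p.aggregation_percent, reverse=True)
for p in by_speed:
    print(f"{p.name}: {p.storage_mode.value}, {p.aggregation_percent}% aggregation on {p.server}")
```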
Remote Partitions
• Remote Partitions Are Stored on a Separate Server
  ◦ Data stored separately
  ◦ Processing performed separately
  ◦ Querying performed separately, but funneled through local Analysis Server
  ◦ Metadata stored and maintained on local Analysis Server
  ◦ Administration performed on the local Analysis Server
• Transparent Setup and Maintenance
A partition assigned to a server that is physically separate from the main Analysis Server is called a remote partition. The separate server where the remote partition is stored is called the remote server, and the main server where the cube definition is stored and administered is called the local server.
The following are the general parameters for the organization and processing of remote partitions:
• The data associated with the remote partition is stored on the separate, or remote, server.
• All processing of the partition by its own aggregation rules is done on the remote server.
• Querying of the partition occurs on the remote server, but is funneled through the local Analysis Server.
• Metadata for the partition is stored and maintained on the local Analysis Server, not on the remote server.
• Administration of a cube and its associated partitions is performed on the local Analysis Server as part of the remote partition definition. In other words, no administration occurs on the remote server.
You create remote partitions through the Partition Wizard. After you name the remote server for a remote partition in the wizard, all other configuration is automatic and transparent; that is, setup and administration are the same as for a non-remote partition.
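The division of labor between the local and remote servers can be modeled as follows. This is a conceptual sketch under stated assumptions, not the Analysis Services protocol: the classes `LocalServer` and `RemoteServer`, their methods, and the sample values are hypothetical.

```python
class RemoteServer:
    """Hypothetical stand-in for a remote server that stores and queries partition data."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # partition name -> list of sales values

    def query(self, partition_name):
        return sum(self.data[partition_name])


class LocalServer:
    """Hypothetical stand-in for the local Analysis Server.

    It keeps the metadata (which server holds which partition) and funnels queries.
    """

    def __init__(self):
        self.metadata = {}    # partition name -> RemoteServer, or None if stored locally
        self.local_data = {}  # partition name -> list of sales values

    def add_partition(self, name, rows, remote=None):
        # Administration happens here, on the local server, even for remote partitions.
        self.metadata[name] = remote
        if remote is None:
            self.local_data[name] = list(rows)
        else:
            remote.data[name] = list(rows)

    def query_cube(self):
        total = 0
        for name, remote in self.metadata.items():
            if remote is None:
                total += sum(self.local_data[name])
            else:
                total += remote.query(name)  # the work is done on the remote server
        return total


local = LocalServer()
remote = RemoteServer("remote1")
local.add_partition("Current Year", [250, 50])
local.add_partition("History", [80, 120], remote=remote)
cube_total = local.query_cube()
print(cube_total)
```

In this model the client only ever talks to `local`, which is the sense in which queries are funneled through the local server.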
Topic Objective
To introduce the concept of remote partitions.
Lead-in
You have the ability to define cube partitions as remote in Analysis Services.
Creating Partitions
• Choosing the Fact Table
• Defining a Data Slice
• Assigning the Partition Location
• Completing the Partition
• Accessing Commands
When you first create a cube, the cube contains one partition by default. You add partitions by using the Partition Wizard. This section outlines the systematic procedure for creating additional partitions, which includes the following steps:
• Choosing the data source and the source fact table, if different from the default source and fact table.
• Optionally defining a data slice to focus the data included in the partition.
• Assigning the location of the partition.
• Completing the partition design by naming it and specifying an aggregation design.
• Accessing partitioning commands to administer a partition after you have created it.
Topic Objective
To introduce the mechanics of creating partitions.
Lead-in
This section outlines the systematic procedures for ...
Delivery Tip
... demonstration. Switch back and forth between the slides and the Partition Wizard, showing students the actual interfaces as you discuss the issues.
Choosing the Fact Table
When you create a new partition in a cube, the partition does not have to use the same data source or fact table that is defined in the Cube Editor. The first step in creating a partition is to choose a data source and fact table for the partition. The fact table must contain the measures and dimension keys found in the fact table defined in the Cube Editor.
To add a new partition to a cube, perform the following steps:
1. In Analysis Manager, expand the folder for the cube to which you want to add a partition.
2. Right-click the Partitions folder, and then click New Partition.
   The Partition Wizard opens.
3. Click Next to bypass the Welcome step of the Partition Wizard.
Topic Objective
To describe the action of choosing a fact table for a new partition.
Lead-in
The first step in creating a partition is choosing a data source and fact table for the partition.
To confirm or change the fact table and data source, perform the following steps:
1. In the Specify data source and the fact table step, note the default data source and fact table entries.
   The Partition Wizard defaults to the same data source and fact table that are defined in the Cube Editor.
2. Click Change to specify a different fact table.
   If you want to use the same data source and fact table as the default partition, click Next to proceed to the Select the data slice (optional) step.
3. In the Choose a fact table step, select a fact table for the partition from the Tables list, and then click OK.
   The Tables list includes all the fact tables associated with available data sources. The chosen fact table must have the same structure as the fact table of the default partition. If it does not, Analysis Manager displays an alert and does not allow you to proceed with the invalid choice.
   If you want to define a different data source, in the Choose a fact table step, click New Data Source.
4. Click Next.
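The structure requirement in step 3 amounts to checking that a candidate fact table provides every measure and dimension key column of the default fact table. The following minimal sketch shows that kind of check. It does not reproduce the validation that Analysis Manager performs: the function name, the column names, and the type names are assumptions made for illustration.

```python
def missing_columns(required, candidate):
    """Return the required columns that the candidate table lacks or types differently."""
    problems = []
    for column, col_type in required.items():
        if candidate.get(column) != col_type:
            problems.append(column)
    return problems


# Hypothetical column layouts (column name -> data type).
default_fact = {"time_id": "int", "product_id": "int", "store_sales": "money"}
sales_fact_2001 = {"time_id": "int", "product_id": "int", "store_sales": "money"}
budget_fact = {"time_id": "int", "budget_amount": "money"}

ok_result = missing_columns(default_fact, sales_fact_2001)
bad_result = missing_columns(default_fact, budget_fact)
print(ok_result)
print(bad_result)
```

An empty result means the candidate table could stand in for the default fact table in this simplified model; a non-empty result lists the columns that would make the choice invalid.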
Defining a Data Slice
The next step in designing a partition is determining the partition’s data slice. You define a data slice for a partition to specify which data to include in the partition. In addition, queries use data slices to determine which partitions to access when retrieving data. Use data slices to prevent duplication of data and to optimize query performance.
Choosing the data slice on which to base a partition is an important design decision, which must take into consideration user reporting and analysis needs, load and processing cycle times, and server hardware availability.
While the step of defining a data slice is optional in the Partition Wizard, it is important to specify a data slice if the partition derives from the same data source and fact table as the default partition. If you do not specify a data slice (or a filter, which is discussed in the next section), the default partition and the new partition will contain duplicate data, which defeats the purpose of creating the additional partition.
If the partitions are derived from different fact tables, and the fact tables are partitioned the same as the cube partitions, specifying a data slice is not necessary. However, if you define a data slice, queries accessing partition data do not waste time by searching through partitions that do not contain the requested data.
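The query benefit of a data slice can be sketched as a simple pruning rule: a partition with a slice is read only when the slice matches the request, while a partition without a slice must always be read. This is a simplified illustration, not the Analysis Services query engine, and the `Partition` class and `partitions_to_scan` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Partition:
    name: str
    slice_year: Optional[int]  # None means that no data slice was defined


def partitions_to_scan(partitions, year):
    """Pick the partitions that a query for one year has to read."""
    return [p.name for p in partitions if p.slice_year is None or p.slice_year == year]


with_slices = [Partition("Sales 1999", 1999), Partition("Sales 2000", 2000)]
without_slices = [Partition("Sales 1999", None), Partition("Sales 2000", None)]

scan_with = partitions_to_scan(with_slices, 2000)
scan_without = partitions_to_scan(without_slices, 2000)
print(scan_with)
print(scan_without)
```

With slices defined, the query for 2000 reads one partition; without them, it reads both, even if each partition draws from its own fact table.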
Topic Objective
To describe the process of defining a data slice in a cube partition.
Lead-in
The next step in designing a partition is determining the partition’s data slice.
Delivery Tip
Point out that defining a data slice for a partition ensures that queries achieve the full benefit of multiple partitions.
To define a data slice, perform the following steps:
1. In the Select the data slice (optional) step, select a dimension for the data slice from the Dimensions list.
   A hierarchical list of members for the selected dimension appears in the Members list. You can drill down through this list to see various members in the hierarchy.
2. In the Members list, click a member to define the specific data slice.
   The selected member appears under the Data slice column of the Dimensions list, to the right of the selected dimension. Note that in the interface, the data slice maps to a single member definition. To define a data slice that requires additional complexity, use a filter expression, which is reviewed later in this module.
3. Click Next.
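Because a data slice maps to a single member of a dimension hierarchy, membership in the slice can be pictured as a prefix test on a member's path. The following sketch assumes a hypothetical Time hierarchy of year and quarter; the tuple representation and the function `in_slice` are illustrative only.

```python
def in_slice(member_path, slice_path):
    """True when member_path is the slice member itself or one of its descendants."""
    return member_path[:len(slice_path)] == slice_path


# Hypothetical slice on the member 1998 of the Time dimension.
slice_member = ("Time", "1998")

q1_1998 = in_slice(("Time", "1998", "Q1"), slice_member)
q4_1997 = in_slice(("Time", "1997", "Q4"), slice_member)
print(q1_1998, q4_1997)
```

A condition that cannot be expressed as one member and its descendants, such as two separate years, does not fit this model, which is where a filter expression comes in.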
Assigning the Partition Location
Each partition can reside on a different server. The wizard allows you to specify whether the partition should remain on the local server or be distributed to another server.
If you want to define a remote partition for a cube, you must define the remote partition on a computer running Analysis Services. In addition, you must have a user name in the OLAP Administrators group on both computers. Lastly, the logon account for the Analysis Server service must be a domain user account before you create a remote partition.
Topic Objective
To describe how you can specify the location for storing the partition.
Lead-in
The wizard allows you to specify whether the partition should remain on the local server or be distributed to another server.