Tài liệu Module 10: Managing Partitions doc

Creating Partitions Explain each step involved in creating a partition—choosing the fact table, defining a data slice, assigning the partition location, and completing the partition.. Pa

Trang 1

Contents

Overview 1

Lab A: Creating a Partition in the Sales Cube15

Trang 2

purpose, without the express written permission of Microsoft Corporation If, however, your only means of access is electronic, permission to print one copy is hereby granted

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property

Microsoft, BackOffice, MS-DOS, Windows, Windows NT, <plus other appropriate product names or titles Replace this example list with list of trademarks provided by copy editor Microsoft is listed first, followed by all other Microsoft trademarks in alphabetical order > are either registered trademarks or trademarks of Microsoft Corporation in the U.S.A and/or other countries

<This is where mention of specific, contractually obligated to, third party trademarks, which are added by the Copy Editor>

The names of companies, products, people, characters, and/or data mentioned herein are fictitious and are in no way intended to represent any real individual, company, product, or event, unless otherwise noted

Other product and company names mentioned herein may be the trademarks of their respective owners

Trang 3

BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY

Instructor Notes

For enterprise-scale online analytical processing (OLAP) cubes developed in Microsoft® SQL Server™ 2000 Analysis Services, partitioning can improve both processing and query performance In this module, students learn how to create partitions, how to define slices and filters, and the benefits of using partitions in cubes to improve scalability

After completing this module, students will be able to:

! Explain the benefits of partitioning

! Describe the mechanics of the Partition Wizard

! Explain when to define slices and when to define filters

! Describe the purpose and mechanics of merging partitions

Materials and Preparation

This section lists the required materials and preparation tasks that you need to teach this module

Required Materials

To teach this module, you need the following materials:

! Microsoft PowerPoint® file 2074A_10.ppt

Preparation Tasks

To prepare for this module, you should:

! Read all the student materials

! Read the instructor notes and margin notes

! Practice the lecture presentation and demonstration

! Complete the labs

! Review the Trainer Preparation presentation for this module on the Trainer Materials compact disc

! Review any relevant white papers that are located on the Trainer Materials compact disc

Presentation:

30 Minutes

Labs:

30 Minutes

Trang 4

Other Activities

Difficult Questions

Below are difficult questions that students may ask you during the delivery of this module and answers to the questions These materials delve into subjects that are within the scope of the module but are not specifically addressed in the content of the student notes

1 After defining multiple partitions, there are empty spaces in the cube even though there is data in the fact tables to support the cells What causes this?

Incomplete partitions can cause missing data Incomplete partitions result when a partition is misdefined (perhaps using a member from a level that is too low for a slice) or not defined at all Be careful—if a partition is misdefined and another partition is added to fix the problem, duplicate data can result

2 What is the best way to split a partition into one or more different partitions?

There is no direct way to do this To split up an existing partition, you must redefine the partition on a smaller slice or modify the WHERE clause and then define new partitions

3 If a different fact table than the one defined for the partition is used for an incremental load of the partition, what happens when the cube is refreshed

or processed in the future?

The incremental data will not be included Analysis Services refers to only one fact table per cube partition

Trang 5

Module Strategy

Use the following strategy to present this module:

! Introducing Partitions Start with an explanation of why partitions are used Emphasize the fact that partitions are transparent—users and front-end applications see only cubes Explain that partitions in a cube may have different storage modes,

aggregation designs, and physical locations Define remote partitions

! Creating Partitions Explain each step involved in creating a partition—choosing the fact table, defining a data slice, assigning the partition location, and completing the partition Finish by describing how to access commands in Analysis Manager

! Using Advanced Settings

Introduce students to the Advanced settings dialog box Describe the

settings available—specifying filters, enabling drillthrough options, and setting the aggregation prefix—and explain when to use each

! Merging Partitions Explain to students why merging partitions can be beneficial Use the Current Year/Prior Year/History partition example Describe the steps involved in merging partitions Emphasize the fact that, to be merged, two partitions must have the same storage mode and aggregation design

Trang 7

After completing this module, you will be able to:

! Explain the benefits of partitioning

! Describe the mechanics of the Partition Wizard

! Explain when to define slices and when to define filters

! Describe the purpose and mechanics of merging partitions

In this module, you will learn

aboutpartitions and their

use in OLAP cubes

Trang 8

Topic Objective

To introduce the concept of

partitions

Lead-in

In this section, you will learn

about partition architecture

and design and the use of

partitions in OLAP cubes

Trang 9

Basic Architecture

! Act As Physical Storage Mediums for Cube Data

! Improve Cube Performance

$ Cube processing performance

$ Query performance

! Are Transparent to Users

! Require SQL Server 2000 Enterprise Edition

$ Analysis Server and Analysis Manager Computers

Partitions are the physical storage mediums for cube data A cube may have one

or more partitions Each partition may have a different storage mode with a different aggregation design In addition, each partition may be located on a different server All cubes initially have a single default partition

When you design aggregations for a one-partition cube, you are actually designing aggregations for the partition, not for the cube When you process a single-partition cube, you are also processing the partition, not the cube If a cube contains more than one partition, attempting to design storage for a cube opens up a dialog box that requires you to select a single partition for designing aggregations

You create partitions to improve cube processing and query performance, increasing the scalability of a cube Partitions are processed either as part of a full cube process or independent from other cube partitions By processing partitions independently, you isolate processing to a subset of cube data, and therefore reduce the processing time In addition, queries can focus on a single partition and can perform faster data retrievals due to the smaller data set being accessed

Partitions are transparent Users and application front-ends see only cubes—that

is, they query a cube and not a partition The cube reflects the combined data contained in all its partitions

To create multiple partitions in a cube, you must have the Enterprise Edition of SQL Server 2000 installed on the Analysis Server and on any computers administering the server To install the Enterprise Edition of SQL Server 2000, the computer requires one of the following operating systems:

! Microsoft Windows NT® Server 4.0 with Service Pack 5

! Windows NT Server Enterprise Edition 4.0 with Service Pack 5

! Microsoft Windows 2000 Advanced Server

! Windows 2000 Data Center Server

Partitions are the physical

storage mediums for cube

data

Trang 10

Partitioning Design

History Prior Year

MOLAP 35% agg

ROLAP 0% agg

Current Year

MOLAP 10% agg

Partitions separate cube data into discrete storage areas Each partition in a cube may have different:

! Storage modes—multidimensional OLAP (MOLAP), relational OLAP (ROLAP), or hybrid OLAP (HOLAP)

! Source fact tables—Salesfact2000 for one partition and Salesfact2001 for

another

! Aggregation designs—35 percent aggregation in one MOLAP partition versus 10 percent aggregation in a second MOLAP partition versus 0 percent aggregation in a third ROLAP partition

! Storage locations—Server 1 in Pittsburgh, Server 2 in San Francisco, and so forth

Because of these factors, partitioning is the principal feature in Analysis Services for increasing cube scalability and designing storage to reflect user access and response time needs

In the preceding illustration, three partitions exist in one cube The data in the partitions, representing the entire accounting history for a company, is organized by current year, prior year, and historical data

The aggregation design for each of the partitions reflects specific user access needs and storage considerations The partitions are designed as follows:

! Current Year The current year partition design contains the highest aggregation percentage using the MOLAP storage mode—the fastest storage method The design reflects the high number of users, the high frequency of the access of each user, and the presumed performance requirements for the reporting and analysis of current year data

Topic Objective

To describe the use of

different aggregation and

storage designs in

partitions

Lead-in

Partitions separate cube

data into discrete storage

areas Each partition in a

cube may have different

storage modes, aggregation

designs, and physical

locations

Delivery Tip

Point out that agg stands for

aggregation in the preceding

illustration

Trang 11

To further enhance response times, the current year partition might also reside on a more robust server with large amounts of storage and memory and a fast processor

! Prior Year The prior year partition design also uses MOLAP storage, but with a lower aggregation level, thus conserving storage while sacrificing reporting performance Users access prior year data less frequently than they access current year data, and data extraction is typically used for printed reports rather than ad hoc analysis, so response times are less important

! History The history design uses ROLAP with zero aggregations, and is the slowest

of the three partitions in reporting performance—reflecting the low frequency of access by users for this type of data While reporting is slow, ROLAP with no aggregations is the most storage efficient design—

reflecting the need for economical storage of large amounts of historical data

In summary, partitioning is an important tool for enhancing overall system performance Partitioning allows you to balance data load performance, run-time reporting performance, and storage needs against user needs, production cycle times, and hardware availability

Note

Trang 12

Remote Partitions

! Remote Partitions Are Stored on a Separate Server

$ Data stored separately

$ Processing performed separately

$ Querying performed separately, but funneled through local Analysis Server

$ Metadata stored and maintained on local Analysis Server

$ Administration performed on the local Analysis Server

! Transparent Setup and Maintenance

A partition assigned to a server that is physically separate from the main

Analysis Server is called a remote partition The separate server where the remote partition is stored is called the remote server, and the main server where the cube definition is stored and administered is called the local server

The following are the general parameters for the organization and processing of remote partitions:

! The data associated with the remote partition is stored on the separate, or remote, server

! All processing of the partition by its own aggregation rules is done on the remote server

! Querying of the partition occurs on the remote server, but is funneled from the local Analysis Server

! Metadata for the partition is stored and maintained on the local Analysis server—not on the remote server

! Administration of a cube and its associated partitions is performed on the local Analysis Server, and not on the remote server as part of the remote partition definition In other words, no administration occurs on the remote server

You create remote partitions through the Partition Wizard In the process of creating a partition by using the wizard, after you name the remote server for a remote partition, all other configurations are automatic and transparent—that is, setup and administration is the same as for a non-remote partition

Topic Objective

To introduce the concept of

remote partitions

Lead-in

You have the ability to

define cube partitions as

remote in Analysis Services

Trang 13

# Creating Partitions

! Choosing the Fact Table

! Defining a Data Slice

! Assigning the Partition Location

! Completing the Partition

! Accessing Commands

When you first create a cube, by default the cube contains one partition You add additional partitions by using the Partition Wizard This section outlines the systematic procedure for creating additional partitions, which includes the following steps:

! Choosing the data source and the source fact table, if different from the default source and fact table

! Optionally defining a data slice to focus the data included in the partition

! Assigning the location of the partition

! Completing the partition design by naming it and specifying an aggregation design

! Accessing partitioning commands to administer a partition after you have created it

Topic Objective

To introduce the mechanics

of creating partitions

Lead-in

This section outlines the

systematic procedures for

demonstration Switch back

and forth between the slides

and the Partition Wizard,

showing students the actual

interfaces as you discuss

the issues

Trang 14

Choosing the Fact Table

When you create a new partition in a cube, the partition fact table is not required to be from the same data source or fact table defined in the Cube Editor The first step in creating a partition is to choose a data source and fact table for the partition The fact table must contain the measures and dimension keys found in the fact table defined in the Cube Editor

To add a new partition to a cube, perform the following steps:

1 In Analysis Manager, expand the folder for the cube to which you want to add a partition

2 Right-click the Partitions folder, and then click New Partition

The Partition Wizard opens

3 Click Next to bypass the Welcome step of the Partition Wizard

Topic Objective

To describe the action of

choosing a fact table for a

new partition

Lead-in

The first step in creating a

partition is choosing a data

source and fact table for the

partition

Trang 15

To confirm or change the fact table and database, perform the following steps:

1 From the Specify data source and the fact table step, note the data source

and fact table default entries

The Partition Wizard defaults to the same data source and fact table that are defined in the Cube Editor

2 Click Change to specify a different fact table

If you want to use the same data source and fact table as the default

partition, click Next to proceed to the Select the data slice (optional) step

3 From the Choose a fact table step, select a fact table for the partition to use from the Tables list, and then click OK

The Tables list includes all the fact tables associated with available data

sources The chosen fact table must have the same structure as the fact table

of the default partition If it does not, Analysis Manager will display an alert and will not allow you to proceed with the invalid choice

If you want to define a different data source, in the Choose a fact table step, click New Data Source

4 Click Next

Trang 16

Defining a Data Slice

The next step in designing a partition is determining the partition’s data slice You define a data slice for a partition to define which data to include in the partition In addition, queries use data slices to determine which partitions to access when retrieving data Use data slices to prevent duplication of data and

to optimize query performance

Choosing the data slice on which to base a partition is an important design decision, which must take into consideration user reporting and analysis needs, load and processing cycle times, and server hardware availability

While the step of defining a data slice is optional in the Partition Wizard, it is important to specify a data slice if the partition derives from the same data source and fact table as the default partition If you do not otherwise specify a data slice—or a filter, which is discussed in the next section—then the default partition and the new partition will contain duplicate data, which defeats the purpose of creating the additional partition

If the partitions are derived from different fact tables, and the fact tables are partitioned the same as the cube partitions, specifying a data slice is not necessary However, if you define a data slice, queries accessing partition data

do not waste time by searching through partitions that do not contain the requested data

Topic Objective

To describe the process of

defining a data slice in a

cube partition

Lead-in

The next step in designing a

partition is determining the

partition’s data slice

Delivery Tip

Point out that defining a

data slice for a partition

ensures that queries

achieve the full benefit of

multiple partitions

Trang 17

To define a data slice, perform the following steps:

1 In the Select the data slice (optional) step, select a dimension for the data slice from the Dimensions list

A hierarchical list of members for the selected dimension appears in the

Members list You can drill down through this list to see various members

in the hierarchy

2 In the Members list, click a member to define the specific data slice

The member selected appears under the Data slice column of the Dimensions list to the right of the selected dimension Note in the interface

that the data slice maps to a single member definition You define a data slice that requires additional complexity by using a filter expression, which

is reviewed later in this module

3 Click Next

Trang 18

Assigning the Partition Location

Each partition can reside on a different server The wizard allows you to specify whether the partition should remain on the local server or be distributed to another server

If you want to define a remote partition for a cube, you must define the remote partition on a computer running Analysis Services In addition, you must have a user name in the OLAP Administrator group on both computers Lastly, the logon account for the Analysis Server service must be a domain user account before creating a remote partition

Topic Objective

To describe how you can

specify the location for

storing the partition

Lead-in

The wizard allows you to

specify whether the partition

should remain on the local

server or be distributed to

another server

Note

Tiêu đề	Managing Partitions
Trường học	Microsoft Corporation
Chuyên ngành	Online Analytical Processing (OLAP)
Thể loại	module
Năm xuất bản	2000
Thành phố	Redmond

Định dạng
Số trang	36
Dung lượng	863,12 KB