FIGURE 68-3
The partition function is used by the partition scheme to place the data in separate filegroups
[Diagram: boundaries at 1/1/2002, 1/1/2003, 1/1/2004, and 1/1/2005 are defined by the partition function; partition locations Part01 through Part05 are defined by the partition scheme; the table is created with CREATE TABLE ( ) ON the partition scheme.]
The boundary values only specify the dividing points between ranges; they don't define the upper or lower values for the whole table.
A boundary value can only exist in one partition. The ranges are defined as left or right. If a row has a partition column value that is the same as a boundary value, then SQL Server needs to know in which partition to put the row.
Left ranges mean that data equal to the boundary is included in the partition to the left of the boundary. A boundary of '12/31/2004' would create two partitions. The lower partition would include all data up to and including '12/31/2004', and the right partition would include any data greater than '12/31/2004'.
Right ranges mean that data equal to the boundary goes into the partition on the right of the boundary value. To separate at the new year starting 2008, a right range would set the boundary at '1/1/2008'. Any value less than the boundary goes into the left, or lower, partition. Any data with a date equal to or later than the boundary goes into the next partition. These two functions use left and right ranges to create the same result:
CREATE PARTITION FUNCTION pfYears(DateTime)
AS RANGE LEFT FOR VALUES ('12/31/2001', '12/31/2002', '12/31/2003', '12/31/2004');
or
CREATE PARTITION FUNCTION pfYearsRT(DateTime)
AS RANGE RIGHT FOR VALUES ('1/1/2002', '1/1/2003', '1/1/2004', '1/1/2005');
These functions both create four defined boundaries, and thus five partitions.
SQL Server 2008's table partitions are declarative, meaning the table is segmented by data values. A hash partition segments the data randomly; SQL Server does not have hash partitioning. You can create a hash function on a computed column, but your client application needs to understand this computation to allow for partition elimination. Another option to randomly spread the data across multiple disk subsystems is to define the table using a filegroup and then add multiple files to the filegroup. See Figure 68-4.
FIGURE 68-4
The partition configuration can be viewed in Object Explorer
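For example, a rough sketch of the computed-column approach might look like the following; the pfHash4 function, psHash4 scheme, and modulo-four CHECKSUM bucket are illustrative choices, not a built-in hash feature:
CREATE PARTITION FUNCTION pfHash4 (INT)
AS RANGE LEFT FOR VALUES (0, 1, 2);   -- four buckets: 0, 1, 2, and 3

CREATE PARTITION SCHEME psHash4
AS PARTITION pfHash4 ALL TO ([Primary]);

CREATE TABLE dbo.AccountHashed (
  AccountID INT NOT NULL,
  AccountName VARCHAR(50) NOT NULL,
  -- The computed bucket must be PERSISTED to serve as the partition column
  HashBucket AS (ABS(CHECKSUM(AccountID)) % 4) PERSISTED NOT NULL
) ON psHash4 (HashBucket);
Queries must filter on HashBucket (or repeat the same expression) for the optimizer to eliminate partitions, which is exactly the computation the client application would have to understand.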
Three catalog views expose information about partition functions: sys.partition_functions, sys.partition_range_values, and sys.partition_parameters.
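For example, a query along these lines lists each partition function, its parameter data type, and its boundary values:
SELECT pf.name AS FunctionName,
    t.name AS ParameterType,
    prv.boundary_id,
    prv.value AS BoundaryValue
  FROM sys.partition_functions AS pf
    JOIN sys.partition_parameters AS pp
      ON pp.function_id = pf.function_id
    JOIN sys.types AS t
      ON t.user_type_id = pp.user_type_id
    LEFT JOIN sys.partition_range_values AS prv
      ON prv.function_id = pf.function_id
  ORDER BY pf.name, prv.boundary_id;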
Creating partition schemes
The partition scheme builds on the partition function to specify the physical locations for the partitions. The physical partitions may all be located in the same filegroup or spread over several filegroups.
The first example partition scheme, named psYearsAll, uses the pfYearsRT partition function and places all the partitions in the Primary filegroup:
CREATE PARTITION SCHEME psYearsAll
AS PARTITION pfYearsRT ALL TO ([Primary]);
To place the table partitions in their own filegroups, omit the ALL keyword and list the filegroups individually. This creates five partitions to match the four boundary values specified in the function:
CREATE PARTITION SCHEME psYearsFiles
AS PARTITION pfYearsRT
TO (Part01, Part02, Part03, Part04, Part05);
The partition functions and schemes must be created using T-SQL code, but once they've been created you can view them in Management Studio's Object Explorer under the database Storage node.
To examine information about partition schemes programmatically, query sys.partition_schemes.
Creating the partition table
Once the partition function and partition scheme are in place, actually creating the table is a piece of cake (pun intended). I recommend creating a partition table with a non-clustered primary key. Adding a clustered index to the table will partition the table based on the partition scheme. The WorkOrder table's Table Properties page also displays the partition scheme being used by the table.
Partition functions and partition schemes don't have owners, so when referring to partition schemes or partition functions, you don't need to use the four-part name or the schema owner in the name.
The following table is similar to the AdventureWorks WorkOrder table in the Production schema:
CREATE TABLE dbo.WorkOrder (
  WorkOrderID INT IDENTITY NOT NULL
    CONSTRAINT WorkOrderPK PRIMARY KEY NONCLUSTERED,
  ProductID INT NOT NULL,
  OrderQty INT NOT NULL,
  StockedQty INT NOT NULL,
  ScrappedQty INT NOT NULL,
  StartDate DATETIME NOT NULL,
  EndDate DATETIME NOT NULL,
  DueDate DATETIME NOT NULL,
  ScrapReasonID INT NULL,
  ModifiedDate DATETIME NOT NULL
  );
CREATE CLUSTERED INDEX ix_WorkOrder_DueDate
ON dbo.WorkOrder (DueDate)
ON psYearsAll(DueDate);
The next script inserts 7,259,100 rows into the WorkOrder table in 2 minutes and 42 seconds, as confirmed by the database Summary page:
DECLARE @Counter INT;
SET @Counter = 0;
WHILE @Counter < 100
BEGIN
SET @Counter = @Counter + 1;
INSERT dbo.WorkOrder (ProductID, OrderQty, StockedQty, ScrappedQty,
StartDate, EndDate, DueDate, ScrapReasonID, ModifiedDate)
SELECT ProductID, OrderQty, StockedQty, ScrappedQty,
StartDate, EndDate, DueDate, ScrapReasonID, ModifiedDate
FROM AdventureWorks.Production.WorkOrder;
END;
It's possible for multiple partition schemes to share a single partition function. Architecturally, this might make sense if several tables should be partitioned using the same boundaries, because this improves the consistency of the partitions. To verify which tables use which partition schemes, based on which partition functions, use the Object Dependencies dialog for the partition function or partition scheme. You can find it using the partition function's context menu.
To see information about how the partitions are being used, look at sys.partitions and sys.partition_counts.
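For example, this simple query against sys.partitions returns the row count for each partition of the WorkOrder table's clustered index (index_id 1):
SELECT p.partition_number, p.rows
  FROM sys.partitions AS p
  WHERE p.object_id = OBJECT_ID('dbo.WorkOrder')
    AND p.index_id = 1
  ORDER BY p.partition_number;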
Querying partition tables
The nice thing about partition tables is that no special code is required to query either across multiple underlying partitions or from only one partition. The Query Optimizer automatically uses the right partitions to retrieve the data.
The $PARTITION operator can return the partition's integer identifier when used with the partition function. The next code snippet counts the number of rows in each partition:
SELECT $PARTITION.pfYearsRT(DueDate) AS Partition,
COUNT(*) AS Count
FROM WorkOrder
GROUP BY $PARTITION.pfYearsRT(DueDate)
ORDER BY Partition;
Result:
Partition   Count
----------- -----------
The next query selects data for one year, so the data should be located in only one partition. Examining the query execution plan (not shown here) reveals that the Query Optimizer used a high-speed clustered index scan on partition ID PtnIds1005:
SELECT WorkOrderID, ProductID, OrderQty, StockedQty, ScrappedQty
  FROM dbo.WorkOrder
  WHERE DueDate BETWEEN '1/1/2002' AND '12/31/2002';
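The $PARTITION function can also be used directly in the WHERE clause to target a single partition by number; a sketch, assuming the 2002 data falls in partition 2 of the RANGE RIGHT pfYearsRT function:
SELECT WorkOrderID, ProductID, OrderQty, StockedQty, ScrappedQty
  FROM dbo.WorkOrder
  WHERE $PARTITION.pfYearsRT(DueDate) = 2;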
Altering partition tables
Partition tables are easily modified, both to keep up with changing data and to enable performance testing of various partition schemes. Even though the commands are simple, modifying the design of a partition table never executes very quickly, as you can imagine.
Merging partitions
Merge and split surgically modify the table partition design. The ALTER PARTITION FUNCTION ... MERGE RANGE command effectively removes one of the boundaries from the partition function and merges two partitions. For example, to remove the boundary between 2003 and 2004 in the pfYearsRT partition function, and combine the data from 2003 and 2004 into a single partition, use the following ALTER command:
ALTER PARTITION FUNCTION pfYearsRT() MERGE RANGE ('1/1/2004');
Sure enough, following the merge operation, the previous count-rows-per-partition query now returns three partitions, and scripting the partition function from Object Explorer creates a script with three boundaries in the partition function code.
If multiple tables share the partition scheme and partition function being modified, then all of those tables are affected by the change.
Splitting partitions
To split an existing single partition, the first step is to designate the next filegroup to be used by the partition scheme. This is done using the ALTER PARTITION SCHEME ... NEXT USED command. If you specified extra filegroups when creating the scheme, you will get a message stating that the next filegroup used will be one of those extra filegroups. Then the partition function can be modified to specify the new boundary, using the ALTER PARTITION FUNCTION ... SPLIT RANGE command to insert a new boundary into the partition function. It's the ALTER PARTITION FUNCTION command that actually performs the work.
This example segments the 2003–2004 work order data into two partitions. The new partition will include only data for July 2004, the last month with data in the AdventureWorks table:
ALTER PARTITION SCHEME psYearsFiles NEXT USED [Primary];
ALTER PARTITION FUNCTION pfYearsRT()
  SPLIT RANGE ('7/1/2004');
Switching tables
Switching tables is the cool capability to move an entire table into a partition within a partitioned table, or to remove a single partition so that it becomes a stand-alone table. This is very useful when importing new data, but note a few restrictions:
■ Every index for the partition table must be a partitioned index.
■ The new table must have the same columns (excluding identity columns), indexes, and constraints (including foreign keys) as the partition table, except that the new table cannot be partitioned.
■ The source partition table cannot be the target of a foreign key.
■ Neither table can be published using replication, or have schema-bound views.
■ The new table must have a check constraint restricting the data range to the new partition, so SQL Server doesn't have to re-verify the data range (and it needs to be validated; there's no point loading and then creating the constraint with NOCHECK).
■ Both the stand-alone table and the partition that will receive the stand-alone table must be on the same filegroup.
■ The receiving partition or table must be empty.
In essence, switching a partition is rearranging the database metadata to reassign the existing table as a partition. No data is actually moved, which makes table switching nearly instantaneous regardless of the table's size.
Prepping the new table
The WorkOrderNEW table will be created to demonstrate switching. It will hold August 2004 data from AdventureWorks:
CREATE TABLE dbo.WorkOrderNEW (
WorkOrderID INT IDENTITY NOT NULL,
ProductID INT NOT NULL,
OrderQty INT NOT NULL,
StockedQty INT NOT NULL,
ScrappedQty INT NOT NULL,
StartDate DATETIME NOT NULL,
EndDate DATETIME NOT NULL,
DueDate DATETIME NOT NULL,
ScrapReasonID INT NULL,
ModifiedDate DATETIME NOT NULL
)
ON Part05;
Indexes identical to those on the partition table are created on the new table:
ALTER TABLE dbo.WorkOrderNEW
  ADD CONSTRAINT WorkOrderNEWPK
  PRIMARY KEY NONCLUSTERED (WorkOrderID, DueDate);
GO
CREATE CLUSTERED INDEX ix_WorkOrderNEW_DueDate
  ON dbo.WorkOrderNEW (DueDate);
The following adds the mandatory constraint:
ALTER TABLE dbo.WorkOrderNEW
  ADD CONSTRAINT WONewPT
  CHECK (DueDate BETWEEN '8/1/2004' AND '8/31/2004');
Now import the new data from AdventureWorks, reusing the January 2004 data shifted forward seven months:
INSERT dbo.WorkOrderNEW (ProductID, OrderQty, StockedQty, ScrappedQty,
    StartDate, EndDate, DueDate, ScrapReasonID, ModifiedDate)
  SELECT ProductID, OrderQty, StockedQty, ScrappedQty,
      DATEADD(mm,7,StartDate), DATEADD(mm,7,EndDate), DATEADD(mm,7,DueDate),
      ScrapReasonID, DATEADD(mm,7,ModifiedDate)
    FROM AdventureWorks.Production.WorkOrder
    WHERE DueDate BETWEEN '1/1/2004' AND '1/31/2004';
The new table now has 3,158 rows.
Prepping the partition table
The original partition table, built earlier in this section, has a non-partitioned, non-clustered primary key. Because one of the rules of switching into a partitioned table is that every index must be partitioned, the first task for this example is to drop and rebuild the WorkOrder table's primary key so it will be partitioned:
ALTER TABLE dbo.WorkOrder
  DROP CONSTRAINT WorkOrderPK;
ALTER TABLE dbo.WorkOrder
  ADD CONSTRAINT WorkOrderPK
  PRIMARY KEY NONCLUSTERED (WorkOrderID, DueDate)
  ON psYearsAll(DueDate);
Next, the partition table needs an empty partition:
ALTER PARTITION SCHEME psYearsFiles NEXT USED [Primary];
ALTER PARTITION FUNCTION pfYearsRT() SPLIT RANGE ('8/1/2004');
Performing the switch
The ALTER TABLE ... SWITCH TO command will move the new table into a specific partition. To determine the empty target partition, select the database Summary page ➪ Disk Usage report:
ALTER TABLE WorkOrderNEW
SWITCH TO WorkOrder PARTITION 5
Switching out
The same technology can be used to switch a partition out of the partition table so that it becomes a stand-alone table. Because no merge is taking place, this is much easier than switching in. The following code takes the first partition out of the WorkOrder partition table and reconfigures the database metadata so it becomes its own table:
ALTER TABLE WorkOrder
  SWITCH PARTITION 1 TO WorkOrderArchive;
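For the switch-out to succeed, WorkOrderArchive must already exist, be empty, sit on the same filegroup as partition 1 (the Primary filegroup under the psYearsAll scheme), and mirror the WorkOrder table's columns and indexes. A possible setup looks like this (the constraint and index names are illustrative):
CREATE TABLE dbo.WorkOrderArchive (
  WorkOrderID INT NOT NULL,
  ProductID INT NOT NULL,
  OrderQty INT NOT NULL,
  StockedQty INT NOT NULL,
  ScrappedQty INT NOT NULL,
  StartDate DATETIME NOT NULL,
  EndDate DATETIME NOT NULL,
  DueDate DATETIME NOT NULL,
  ScrapReasonID INT NULL,
  ModifiedDate DATETIME NOT NULL,
  -- Matches the partitioned primary key on WorkOrder
  CONSTRAINT WorkOrderArchivePK
    PRIMARY KEY NONCLUSTERED (WorkOrderID, DueDate)
) ON [Primary];

-- Matches the clustered index on WorkOrder
CREATE CLUSTERED INDEX ix_WorkOrderArchive_DueDate
  ON dbo.WorkOrderArchive (DueDate)
  ON [Primary];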
Rolling partitions
With a little imagination, the technology to create and merge existing partitions can be used to create rolling partition designs.
Rolling partitions are useful for time-based partition functions, such as partitioning a year of data into months. Each month, the rolling partition expands for a new month. To build a 13-month rolling partition, perform these steps each month:
1. Add a new boundary.
2. Point the boundary to the next used filegroup.
3. Merge the oldest two partitions to keep all the data.
Switching tables into and out of partitions can enhance rolling partition designs by switching in fully populated staging tables and switching out the oldest tables to an archive location.
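A sketch of the monthly maintenance for such a design, assuming a hypothetical pfMonths function and psMonths scheme partitioned by month on a DATETIME column (the names, dates, and filegroup are illustrative):
-- 1. Designate the filegroup the next new partition should use
ALTER PARTITION SCHEME psMonths
  NEXT USED [PartNextMonth];

-- 2. Add a boundary for the upcoming month
ALTER PARTITION FUNCTION pfMonths()
  SPLIT RANGE ('9/1/2004');

-- 3. Merge the two oldest partitions, keeping the data but holding the
--    partition count at 13
ALTER PARTITION FUNCTION pfMonths()
  MERGE RANGE ('8/1/2003');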
Indexing partitioned tables
Large tables mean large indexes, so non-clustered indexes can optionally be partitioned.
Creating partitioned indexes
Partitioned non-clustered indexes must include the column used by the partition function in the index, and must be created using the same ON clause as the partitioned clustered index:
CREATE INDEX WorkOrder_ProductID
ON WorkOrder (ProductID, DueDate)
ON psYearsFiles(DueDate);
Maintaining partitioned indexes
One of the advantages of partitioned indexes is that they can be individually maintained. The following example rebuilds the newly added fifth partition:
ALTER INDEX WorkOrder_ProductID
ON dbo.WorkOrder REBUILD
PARTITION = 5
Removing partitioning
To remove the partitioning of any table, drop the clustered index and add a new clustered index without the partitioning ON clause. When dropping the clustered index, you must add the MOVE TO option to actually consolidate the data onto the specified filegroup, thus removing the partitioning from the table:
DROP INDEX ix_WorkOrder_DueDate
  ON dbo.WorkOrder WITH (MOVE TO [Primary]);
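If the table should keep a clustered index, recreate it without the partition scheme in the ON clause, for example:
CREATE CLUSTERED INDEX ix_WorkOrder_DueDate
  ON dbo.WorkOrder (DueDate)
  ON [Primary];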
Data-Driven Partitioning
The third method doesn't involve any Microsoft partitioning technology. Instead, it's an architectural pattern that I've used in large, heavy-transaction databases. It's rather simple, but very fast.
A data-driven partitioning scheme segments the data onto different servers based on a partition key. Each server has the same database schema, but stores only the data for its assigned partition key values or ranges. For example, server A could hold accounts 1–999, server B could hold accounts 1,000–1,999, and server C could hold all accounts greater than or equal to 2,000.
A partition mapping table stores the server name for each partition key value or range of values. In the previous example, the partition mapping table would hold the from and to account numbers and the server name.
The middle tier reads and caches the partition mapping table, and for every database access it checks the partition mapping table to determine which server holds the needed data.
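A minimal sketch of such a mapping table and lookup (the table, column, and server names are illustrative):
CREATE TABLE dbo.PartitionMap (
  AccountFrom INT NOT NULL,
  AccountTo INT NOT NULL,
  ServerName SYSNAME NOT NULL,
  CONSTRAINT PartitionMapPK PRIMARY KEY (AccountFrom)
);

INSERT dbo.PartitionMap (AccountFrom, AccountTo, ServerName)
  VALUES (1, 999, 'ServerA'),
         (1000, 1999, 'ServerB'),
         (2000, 2147483647, 'ServerC');

-- The middle tier caches this table and routes each request:
DECLARE @AccountID INT = 1500;
SELECT ServerName
  FROM dbo.PartitionMap
  WHERE @AccountID BETWEEN AccountFrom AND AccountTo;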
This method works best when the data is self-contained and the complete query can be solved using only the subset of data on one server. If the servers need to do much cross-server querying to solve the queries, then the benefits are likely lost.
What's nice about data-driven partitioning is that it's very easy to scale out. Adding another server only requires moving some data and updating the partition mapping table.
Not every database will have to scale to higher magnitudes of capacity, but when a project does grow into the terabytes, SQL Server 2008 provides some advanced technologies to tackle the growth. However, even these advanced technologies are no substitute for Smart Database Design.
Key points on partitioning include the following:
■ Partitioned views use a UNION ALL to merge data from several user-created base tables. Each partition table must include the partition key and a constraint.
■ The Query Processor can carefully choose the minimum number of underlying tables when selecting through a partitioned view, but not when updating.
■ Distributed partitioned views add distributed queries to combine data from multiple servers.
■ Partitioned tables are a completely different technology than partitioned views and use a partition function, partition scheme, and clustered index to partition a single table.
■ Data-driven partitioning is an architectural pattern that involves custom coding, but it delivers the best possible scale-out performance and flexibility.
The next chapter wraps up this part covering optimization with a new feature for SQL Server 2008 Enterprise Edition that's getting quite a bit of buzz.