Microsoft SQL Server 2008 Study Guide, Part 47



Subtree queries

The primary work of a hierarchy is returning the hierarchy as a set. The adjacency list method used similar methods for scanning up or down the hierarchy. Not so with materialized path. Searching down a materialized path is a piece of cake, but searching up the tree is a real pain.

Searching down the hierarchy with materialized path

Navigating down the hierarchy and returning a subtree of all nodes under a given node is where the materialized path method really shines.

Check out the simplicity of this query:

SELECT BusinessEntityID, ManagerID, MaterializedPath
  FROM HumanResources.Employee
  WHERE MaterializedPath LIKE '1,263,%'

Result:

BusinessEntityID  ManagerID  MaterializedPath
----------------  ---------  ----------------

That’s all it takes to find a node’s subtree. Because the materialized path for every node in the subtree is just a string that begins with the subtree’s parent’s materialized path, it’s easily searched with a LIKE operator and a % wildcard in the WHERE clause.

It’s important that the LIKE search string includes the comma before the % wildcard; otherwise, searching for 1,263% would find 1,2635, which would be an error, of course.
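The effect of the missing comma can be seen in a small, self-contained sketch. The table variable, its node IDs, and the paths without a trailing comma are hypothetical sample data, not the AdventureWorks rows used in the chapter:

```sql
-- Hypothetical sample data: node 2635 is NOT part of node 263's subtree.
DECLARE @Node TABLE
  (NodeID           INT          PRIMARY KEY,
   MaterializedPath VARCHAR(200) NOT NULL);

INSERT INTO @Node (NodeID, MaterializedPath)
VALUES (263,  '1,263'),
       (270,  '1,263,270'),
       (2635, '1,2635'),
       (2640, '1,2635,2640');

-- Without the comma, the pattern also matches 1,2635 and everything under it.
SELECT NodeID, MaterializedPath
  FROM @Node
  WHERE MaterializedPath LIKE '1,263%';   -- matches 263, 270, 2635, and 2640

-- With the comma, only true descendants of node 263 match.
SELECT NodeID, MaterializedPath
  FROM @Node
  WHERE MaterializedPath LIKE '1,263,%';  -- matches 270 only
```

Note that with paths stored this way the second pattern does not return node 263 itself, only the nodes beneath it.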

Searching up the hierarchy with materialized path

Searching up the hierarchy means searching for all the ancestors, or the chain of command, for a given node. The nice thing about a materialized path is that the full list of ancestors is right there in the materialized path. There’s no need to read any other rows.

Therefore, to get the parent nodes, you need to parse the materialized path to return the IDs of each parent node and then join to this set of IDs to get the parent nodes.

The trick is to extract it quickly. Unfortunately, SQL Server lacks a simple split function. There are two options: build a CLR function that uses the C# split function or build a T-SQL user-defined function to parse the string.


A C# CLR function to split a string is a relatively straightforward task:

using System;
using System.Collections;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public class ListFunctionClass
{
    // Table-valued CLR function: returns one row per comma-delimited item.
    [SqlFunction(FillRowMethodName = "FillRow",
                 TableDefinition = "list nvarchar(max)")]
    public static IEnumerator ListSplitFunction(string list)
    {
        string[] listArray = list.Split(new char[] { ',' });
        Array array = listArray;
        return array.GetEnumerator();
    }

    // Called once per item to populate the single output column.
    public static void FillRow(Object obj, out String sc)
    {
        sc = (String)obj;
    }
}

Adam Machanic, SQL Server MVP and one of the sharpest SQL Server programmers around, went on a quest to write the fastest CLR split function possible. The result is posted on SQLBlog.com at http://tinyurl.com/dycmxb.

But I’m a T-SQL guy, so unless there’s a compelling need to use CLR, I’ll opt for T-SQL. There are a number of T-SQL string-split solutions available. I’ve found that the performance depends on the length of the delimited strings. Erland Sommarskog’s website analyzes several T-SQL split solutions:

http://www.sommarskog.se/arrays-in-sql-2005.html

Of Erland’s solutions, the one I prefer for shorter length strings such as these is in the ParseString user-defined function:

-- up the hierarchy
-- parse the string
CREATE -- alter
FUNCTION dbo.ParseString (@list varchar(200))
RETURNS @tbl TABLE (ID INT) AS
BEGIN
  -- code by Erland Sommarskog
  DECLARE @pos int,
          @valuelen int,
          @nextpos int
  SELECT @pos = 0, @nextpos = 1
  WHILE @nextpos > 0
    BEGIN
      SELECT @nextpos = charindex(',', @list, @pos + 1)
      SELECT @valuelen = CASE WHEN @nextpos > 0
                              THEN @nextpos
                              ELSE len(@list) + 1
                         END - @pos - 1
      INSERT @tbl (ID)
        VALUES (substring(@list, @pos + 1, @valuelen))
      SELECT @pos = @nextpos
    END
  RETURN
END
go

SELECT ID
  FROM HumanResources.Employee
    CROSS APPLY dbo.ParseString(MaterializedPath)
  WHERE BusinessEntityID = 270
go

DECLARE @MatPath VARCHAR(200)
SELECT @MatPath = MaterializedPath
  FROM HumanResources.Employee
  WHERE BusinessEntityID = 270

SELECT E.BusinessEntityID, MaterializedPath
  FROM dbo.ParseString(@MatPath) AS PS
    JOIN HumanResources.Employee E
      ON PS.ID = E.BusinessEntityID
  ORDER BY MaterializedPath

Is the node in the subtree?

Because the materialized-path pattern is so efficient at finding subtrees, the best way to determine whether a node is in a subtree is to reference the LIKE-based subtree query in a WHERE clause, similar to the adjacency list solution:


-- Does 270 work for 263?
SELECT 'True'
  WHERE 270 IN
    (SELECT BusinessEntityID
       FROM HumanResources.Employee
       WHERE MaterializedPath LIKE '1,263,%')

Determining the node level

Determining the current node level using the materialized-path pattern is as simple as counting the commas in the materialized path. The following function uses CHARINDEX to locate the commas and make quick work of the task:

CREATE FUNCTION MaterializedPathLevel
  (@Path VARCHAR(200))
RETURNS TINYINT
AS
BEGIN
  DECLARE
    @Position TINYINT = 1,
    @Lv TINYINT = 0;

  -- Each pass finds the next comma; the loop ends when none remain.
  WHILE @Position > 0
    BEGIN;
      SET @Lv += 1;
      SELECT @Position = CHARINDEX(',', @Path, @Position + 1);
    END;

  RETURN @Lv - 1;
END;

Testing the function:

SELECT dbo.MaterializedPathLevel('1,20,56,345,1010') AS Level

Result:

Level

-6

A function may be easily called within an update query, so pre-calculating and storing the level is a trivial process. The next script adds a Level column, updates it using the new function, and then takes a look at the data:

ALTER TABLE HumanResources.Employee
  ADD Level TINYINT
go

UPDATE HumanResources.Employee
  SET Level = dbo.MaterializedPathLevel(MaterializedPath)

SELECT BusinessEntityID, MaterializedPath, Level
  FROM HumanResources.Employee

Result (abbreviated):

BusinessEntityID MaterializedPath Level

Storing the level can be useful; for example, being able to query the node’s level makes writing single-level queries significantly easier. Using the function in a persisted calculated column with an index works great.
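As a rough sketch of the persisted calculated column idea, the following uses a hypothetical temporary table (#Node) and an inline comma-counting expression built from LEN and REPLACE instead of the MaterializedPathLevel function. The inline expression is an assumption made to keep the example self-contained; a scalar user-defined function would generally need to be created WITH SCHEMABINDING before SQL Server treats it as deterministic enough to persist.

```sql
-- Hypothetical demo table; the chapter's examples use HumanResources.Employee.
CREATE TABLE #Node
  (NodeID           INT          NOT NULL PRIMARY KEY,
   MaterializedPath VARCHAR(200) NOT NULL,
   -- Level = number of commas in the path, so the root is level 0.
   Level AS (LEN(MaterializedPath)
             - LEN(REPLACE(MaterializedPath, ',', ''))) PERSISTED);

-- Composite index to support searches by level and path.
CREATE NONCLUSTERED INDEX IX_Node_Level_Path
  ON #Node (Level, MaterializedPath);

INSERT INTO #Node (NodeID, MaterializedPath)
VALUES (1,   '1'),
       (263, '1,263'),
       (270, '1,263,270');

-- Level is maintained automatically: 0, 1, and 2 for these three rows.
SELECT NodeID, MaterializedPath, Level
  FROM #Node
  ORDER BY Level;

DROP TABLE #Node;
```

Because the column is computed from the path, it cannot drift out of sync with the path the way a separately updated Level column could.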

Single-level queries

The adjacency list pattern is simpler for single-level queries than for returning complete subtrees. The materialized-path pattern is the reverse: it excels at returning subtrees, but it’s more difficult to return just a single level. Neither solution excels at returning a specific level in a hierarchy on its own; it is possible with the adjacency pattern but requires some recursive functionality. For the materialized-path pattern, if the node’s level is also stored in the table, then the level can be easily added to the WHERE clause, and the queries become simple.

This query locates all the nodes one level down from the CEO. The CTE locates the MaterializedPath and the Level for the CEO, and the main query’s join conditions filter the query to the next level down:

-- Query Search 1 level down
WITH CurrentNode(MaterializedPath, Level)
  AS
  (SELECT MaterializedPath, Level
     FROM HumanResources.Employee
     WHERE BusinessEntityID = 1)
SELECT BusinessEntityID, ManagerID, E.MaterializedPath, E.Level
  FROM HumanResources.Employee E
    JOIN CurrentNode C
      ON E.MaterializedPath LIKE C.MaterializedPath + '%'
      AND E.Level = C.Level + 1

Result:

BusinessEntityID  ManagerID  MaterializedPath  Level
----------------  ---------  ----------------  -----

An advantage of this method over the single join method used for finding single-level queries for the adjacency list pattern is that this method can be used to find any specific level, not just the nearest level.

Locating the single-level query up the hierarchy is the same basic outer query, but the CTE/subquery uses the up-the-hierarchy subtree query instead, parsing the materialized path string.
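To illustrate searching for an arbitrary level rather than just the next one, here is a minimal sketch against hypothetical sample data. The @Node table variable, the @StartID and @LevelsDown variables, and the paths stored without a trailing comma are all assumptions for the example; because of that path format, the join adds the comma explicitly before the % wildcard.

```sql
-- Hypothetical sample hierarchy with a stored Level column.
DECLARE @Node TABLE
  (NodeID           INT          PRIMARY KEY,
   MaterializedPath VARCHAR(200) NOT NULL,
   Level            TINYINT      NOT NULL);

INSERT INTO @Node (NodeID, MaterializedPath, Level)
VALUES (1,   '1',             0),
       (2,   '1,2',           1),
       (3,   '1,2,3',         2),
       (263, '1,263',         1),
       (264, '1,263,264',     2),
       (265, '1,263,264,265', 3),
       (270, '1,263,270',     2);

DECLARE @StartID INT = 1, @LevelsDown TINYINT = 2;

-- Same shape as the one-level-down query, but the level offset is a variable.
WITH CurrentNode (MaterializedPath, Level) AS
  (SELECT MaterializedPath, Level
     FROM @Node
     WHERE NodeID = @StartID)
SELECT N.NodeID, N.MaterializedPath, N.Level
  FROM @Node N
    JOIN CurrentNode C
      ON N.MaterializedPath LIKE C.MaterializedPath + ',%'
      AND N.Level = C.Level + @LevelsDown
  ORDER BY N.NodeID;   -- returns nodes 3, 264, and 270
```

Changing @LevelsDown to 1 or 3 returns the first or third level under the starting node without any change to the query itself.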

Reparenting the materialized path

Because the materialized-path pattern stores each node’s entire ancestry in the node’s materialized path value, when the tree is modified by inserting, updating, or deleting a node, the entire affected subtree must have its materialized path recalculated.

Each node’s path contains the path of its parent node, so if the parent node’s path changes, so do the children. This will propagate down and affect all descendants of the node being changed.

The brute force method is to reexecute the user-defined function that calculates the materialized path. A more elegant method, when it applies, is to use the REPLACE T-SQL function.
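A minimal sketch of the REPLACE approach, assuming hypothetical sample data in which node 270 (and its child 271) moves from under node 263 to under node 264. The table variable and the @OldPath and @NewPath variables are illustrative names only:

```sql
-- Hypothetical sample hierarchy; paths are stored without a trailing comma.
DECLARE @Node TABLE
  (NodeID           INT          PRIMARY KEY,
   MaterializedPath VARCHAR(200) NOT NULL);

INSERT INTO @Node (NodeID, MaterializedPath)
VALUES (1,   '1'),
       (263, '1,263'),
       (264, '1,263,264'),
       (270, '1,263,270'),
       (271, '1,263,270,271');

DECLARE @OldPath VARCHAR(200) = '1,263,270',      -- current path of the moved node
        @NewPath VARCHAR(200) = '1,263,264,270';  -- its path under the new parent

-- Rewrite the path of the moved node and of every node beneath it.
UPDATE @Node
  SET MaterializedPath = REPLACE(MaterializedPath, @OldPath, @NewPath)
  WHERE MaterializedPath = @OldPath
     OR MaterializedPath LIKE @OldPath + ',%';

-- Node 270 is now 1,263,264,270 and node 271 is 1,263,264,270,271.
SELECT NodeID, MaterializedPath
  FROM @Node
  ORDER BY NodeID;
```

The WHERE clause limits the update to the affected subtree, so the cost is proportional to the size of the subtree being moved rather than the whole table.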

Indexing the materialized path

Indexing the materialized path requires only a non-clustered index on the materialized path column. Because the level column is used in some searches, depending on the usage, it’s also a candidate for a non-clustered index. If so, then a composite index of the level and materialized path columns would be the best-performing option.

Materialized path pros and cons

There are some points in favor of the materialized-path pattern:

■ The strongest point in its favor is that it contains the actual references to every node in its hierarchy. This gives the pattern considerable durability and consistency. If a node is deleted or updated accidentally, the remaining nodes in its subtree are not orphaned. The tree can be reconstructed. If Jean Trenary is deleted, the materialized path of the IT department employees remains intact.

■ The materialized-path pattern is the only pattern that can retrieve an entire subtree with a single index seek. It’s wicked fast.

■ Reading a materialized path is simple and intuitive. The keys are there to read in plain text.

On the down side, there are a number of issues, including the following:

■ The key sizes can become large; at 10 levels deep with an integer key, the keys can be 40–80 bytes in size. This is large for a key.

■ Constraining the hierarchy is difficult without the use of triggers or complex check constraints. Unlike the adjacency list pattern, you cannot easily enforce that a parent node exists.

■ Simple operations like "get me the parent node" are more complex without the aid of helper functions.

■ Inserting new nodes requires calculating the materialized path, and reparenting the materialized path requires recalculating the materialized paths for every node in the affected subtree. For an OLTP system this can be a very expensive operation and lead to a large amount of contention. Offloading the maintenance of the hierarchy to a background process can alleviate this. An option is to combine adjacency and path solutions; one provides ease of maintenance and one provides performance for querying.

The materialized path is my favorite hierarchy pattern and the one I use in Nordic (my SQL Server object relational façade) to store the class structure.

Using the New HierarchyID

For SQL Server 2008, Microsoft has released a new data type targeted specifically at solving the hierarchy problem. Working through the materialized-path pattern was a good introduction to HierarchyID because HierarchyID is basically a binary version of materialized path.

HierarchyID is implemented as a CLR data type with CLR methods, but you don’t need to enable CLR to use HierarchyID. Technically speaking, the CLR is always running. Disabling the CLR only disables installing and running user-programmed CLR assemblies.

To jump right into the HierarchyID, this first query exposes the raw data. The OrganizationNode column in the HumanResources.Employee table is a HierarchyID column. The second column simply returns the binary data from OrganizationNode. The third column, HierarchyID.ToString(), uses the .ToString() method to convert the HierarchyID data to text. The OrganizationLevel column returns the values stored in a calculated column that’s set to the .GetLevel() method:

-- View raw HierarchyID Data
SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name',
    OrganizationNode, OrganizationNode.ToString() as 'HierarchyID.ToString()',
    OrganizationLevel
  FROM HumanResources.Employee E
    JOIN Person.Person P
      ON E.BusinessEntityID = P.BusinessEntityID

Result (abbreviated):

BusinessEntityID  OrganizationNode  HierarchyID.ToString()  OrganizationLevel
----------------  ----------------  ----------------------  -----------------

In the third column, you can see data that looks similar to the materialized path pattern, but there’s a significant difference. Instead of storing a delimited path of ancestor primary keys, HierarchyID is intended to store the relative node position, as shown in Figure 17-6.

FIGURE 17-6

The AdventureWorks Information Services Department with HierarchyID nodes displayed

Adventure Works 2008 Information Service Department

1 Ken Sánchez  /
    263 Jean Trenary  /5/
        264 Stephanie Conroy  /5/1/
            265 Ashvini Sharma  /5/1/1/
            266 Peter Connelly  /5/1/2/
        267 Karen Berg  /5/2/
        268 Ramesh Meyyappan  /5/3/
        269 Dan Bacon  /5/4/
        270 François Ajenstat  /5/5/
        271 Dan Wilson  /5/6/
        272 Janaina Bueno  /5/7/


Walking through a few examples in this hierarchy, note the following:

■ The CEO is the root node, so his HierarchyID is just /.

■ If all the nodes under Ken were displayed, then Jean would be the fifth node. Her relative node position is the fifth node under Ken, so her HierarchyID is /5/.

■ Stephanie is the first node under Jean, so her HierarchyID is /5/1/.

■ Ashvini is the first node under Stephanie, so his node is /5/1/1/.
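The node positions walked through above can be explored without any table at all. This short sketch uses a hypothetical local variable and the hierarchyid methods Parse(), ToString(), GetLevel(), and GetAncestor(); GetAncestor() is not covered in this part of the chapter and is included here only to show how the parent's position falls out of the child's:

```sql
-- Hypothetical variable holding Ashvini's node position from the figure.
DECLARE @Ashvini hierarchyid = hierarchyid::Parse('/5/1/1/');

SELECT @Ashvini.ToString()                AS NodePath,    -- /5/1/1/
       @Ashvini.GetLevel()                AS NodeLevel,   -- 3 (the root is level 0)
       @Ashvini.GetAncestor(1).ToString() AS ParentPath;  -- /5/1/ (Stephanie)
```

Note that the method names are case-sensitive, so .getlevel() or .tostring() will not work.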

Selecting a single node

Even though HierarchyID stores the data in binary, it’s possible to filter by a HierarchyID data type column in a WHERE clause using the text form of the data:

SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name', E.JobTitle
  FROM HumanResources.Employee E
    JOIN Person.Person P
      ON E.BusinessEntityID = P.BusinessEntityID
  WHERE OrganizationNode = '/5/5/'

Result:

BusinessEntityID  Name               JobTitle
----------------  -----------------  ----------------------
270               François Ajenstat  Database Administrator

Scanning for ancestors

Searching for all ancestor nodes is relatively easy with HierarchyID. There’s a great CLR method, IsDescendantOf(), that tests any node to determine whether it’s a descendant of another node and returns either true or false. The following WHERE clause tests each row to determine whether the @EmployeeNode is a descendant of that row’s OrganizationNode:

WHERE @EmployeeNode.IsDescendantOf(OrganizationNode) = 1

The full query returns the ancestor list for François. The script must first store François’ HierarchyID value in a local variable. Because the variable is a HierarchyID, the IsDescendantOf() method may be applied. The fourth column displays the same test used in the WHERE clause:

DECLARE @EmployeeNode HierarchyID

SELECT @EmployeeNode = OrganizationNode
  FROM HumanResources.Employee
  WHERE OrganizationNode = '/5/5/' -- François Ajenstat the DBA

SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name', E.JobTitle,
    @EmployeeNode.IsDescendantOf(OrganizationNode) as Test
  FROM HumanResources.Employee E
    JOIN Person.Person P
      ON E.BusinessEntityID = P.BusinessEntityID
  WHERE @EmployeeNode.IsDescendantOf(OrganizationNode) = 1

Result:

BusinessEntityID  Name               JobTitle                      Test
----------------  -----------------  ----------------------------  ----
263               Jean Trenary       Information Services Manager  1
270               François Ajenstat  Database Administrator        1

Performing a subtree search

The IsDescendantOf() method is easily flipped around to perform a subtree search locating all descendants. The trick is that either side of the IsDescendantOf() method can use a variable or column. In this case the variable goes in the parameter and the method is applied to the column. The result is the now familiar AdventureWorks Information Service Department:

DECLARE @ManagerNode HierarchyID

SELECT @ManagerNode = OrganizationNode
  FROM HumanResources.Employee
  WHERE OrganizationNode = '/5/' -- Jean Trenary - IT Manager

SELECT E.BusinessEntityID, P.FirstName + ' ' + P.LastName as 'Name',
    OrganizationNode.ToString() as 'HierarchyID.ToString()',
    OrganizationLevel
  FROM HumanResources.Employee E
    JOIN Person.Person P
      ON E.BusinessEntityID = P.BusinessEntityID
  WHERE OrganizationNode.IsDescendantOf(@ManagerNode) = 1

Result:

BusinessEntityID  Name  HierarchyID.ToString()  OrganizationLevel
----------------  ----  ----------------------  -----------------
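Both directions of IsDescendantOf() can be tried side by side in a self-contained sketch. The @Org table variable and its five rows are hypothetical sample data that borrow node positions from Figure 17-6; they stand in for HumanResources.Employee so the script runs without AdventureWorks:

```sql
-- Hypothetical sample of the Information Services hierarchy.
DECLARE @Org TABLE
  (EmployeeID INT         PRIMARY KEY,
   Node       hierarchyid NOT NULL);

INSERT INTO @Org (EmployeeID, Node)
VALUES (1,   '/'),
       (263, '/5/'),
       (264, '/5/1/'),
       (265, '/5/1/1/'),
       (270, '/5/5/');

DECLARE @EmployeeNode hierarchyid = hierarchyid::Parse('/5/5/'),
        @ManagerNode  hierarchyid = hierarchyid::Parse('/5/');

-- Ancestors: the variable is tested against each row's node.
SELECT EmployeeID, Node.ToString() AS NodePath
  FROM @Org
  WHERE @EmployeeNode.IsDescendantOf(Node) = 1;   -- returns 1, 263, and 270

-- Subtree: each row's node is tested against the variable.
SELECT EmployeeID, Node.ToString() AS NodePath
  FROM @Org
  WHERE Node.IsDescendantOf(@ManagerNode) = 1;    -- returns 263, 264, 265, and 270
```

In both queries the starting node appears in its own result, because IsDescendantOf() treats a node as a descendant of itself.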

Single-level searches

Single-level searches were presented first for the adjacency list pattern because they were the simpler searches. For HierarchyID searches, a single-level search is more complex and builds on the previous searches. In fact, a single-level HierarchyID search is really nothing more than an
