Microsoft SQL Server 2008 Tutorial, Part 33



relational division query would list only those students who passed the required courses and no others. A relational division with a remainder, also called an approximate divide, would list all the students who passed the required courses and include students who passed any additional courses. Of course, that example is both practical and academic.
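The student example can be sketched in T-SQL with two table variables. This is only an illustration: the table names, column names, and rows below are hypothetical and are not part of the sample databases.

```sql
-- Hypothetical data: names and rows are illustrative only.
DECLARE @Required TABLE (Course VARCHAR(20) PRIMARY KEY);
DECLARE @Passed   TABLE (Student VARCHAR(20), Course VARCHAR(20),
                         PRIMARY KEY (Student, Course));

INSERT INTO @Required (Course) VALUES ('Algebra'), ('Biology');

INSERT INTO @Passed (Student, Course) VALUES
  ('Ann', 'Algebra'), ('Ann', 'Biology'),    -- exactly the required courses
  ('Bob', 'Algebra'), ('Bob', 'Biology'),
  ('Bob', 'Chemistry'),                      -- required courses plus one extra
  ('Cat', 'Algebra');                        -- missing a required course

-- Division with a remainder: passed every required course, extras allowed.
-- Returns Ann and Bob.
SELECT P.Student
  FROM @Passed AS P
    JOIN @Required AS R
      ON R.Course = P.Course
  GROUP BY P.Student
  HAVING COUNT(*) = (SELECT COUNT(*) FROM @Required);

-- Exact division: passed every required course and nothing else.
-- Returns Ann only.
SELECT P.Student
  FROM @Passed AS P
    LEFT JOIN @Required AS R
      ON R.Course = P.Course
  GROUP BY P.Student
  HAVING COUNT(R.Course) = (SELECT COUNT(*) FROM @Required)
     AND COUNT(*)        = (SELECT COUNT(*) FROM @Required);
```

The primary key on (Student, Course) rules out duplicate rows, which is what lets a plain COUNT(*) stand in for a distinct count here.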

Relational division is more complex than a join. A join simply finds any matches between two data sets. Relational division finds exact matches between two data sets. Joins/subqueries and relational division solve different types of questions. For example, the following questions apply to the sample databases and compare the two methods:

■ Joins/subqueries:

    ■ CHA2: Who has ever gone on a tour?

    ■ CHA2: Who lives in the same region as a base camp?

    ■ CHA2: Who has attended any event in his or her home region?

■ Exact relational division:

    ■ CHA2: Who has gone on every tour in his or her home state but no tours outside it?

    ■ OBXKites: Who has purchased every kite but nothing else?

    ■ Family: Which women (widows or divorcees) have married the same husbands as each other, but no other husbands?

■ Relational division with remainders:

    ■ CHA2: Who has gone on every tour in his or her home state, and possibly other tours as well?

    ■ OBXKites: Who has purchased every kite and possibly other items as well?

    ■ Family: Which women have married the same husbands and may have married other men as well?

Relational division with a remainder

Relational division with a remainder essentially extracts the quotient while allowing some leeway for rows that meet the criteria but contain additional data as well. In real-life situations this type of division is typically more useful than an exact relational division.

The previous OBX Kites sales question ("Who has purchased every kite and possibly other items as well?") is a good one to use to demonstrate relational division. Because it takes five tables to go from contact to product category, and because the question refers to the join between OrderDetail and Product, this question involves enough complexity that it simulates a real-world relational-database problem.

The toy category serves as a good example category because it contains only two toys and no one has purchased a toy in the sample data, so the query will answer the question "Who has purchased at least one of every toy sold by OBX Kites?" (Yes, my kids volunteered to help test this query.)

First, the following data will mock up a scenario in the OBXKites database. The only toys are ProductCode 1049 and 1050. The OBXKites database uses unique identifiers for primary keys and therefore uses stored procedures for all inserts. The first Order and OrderDetail inserts will list the stored procedure parameters so the following stored procedure calls are easier to understand:

USE OBXKites;

DECLARE @OrderNumber INT;


The first person, ContactCode 110, orders exactly all toys:

EXEC pOrder_AddNew
  @ContactCode = '110',
  @EmployeeCode = '120',
  @LocationCode = 'CH',
  @OrderDate = '2002/6/1',
  @OrderNumber = @OrderNumber output;

EXEC pOrder_AddItem
  @OrderNumber = @OrderNumber,
  @Code = '1049',
  @NonStockProduct = NULL,
  @Quantity = 12,
  @UnitPrice = NULL,
  @ShipRequestDate = '2002/6/1',
  @ShipComment = NULL;

EXEC pOrder_AddItem
  @OrderNumber, '1050', NULL, 3, NULL, NULL, NULL;

The second person, ContactCode 111, orders exactly all toys, and toy 1050 twice:

EXEC pOrder_AddNew
  '111', '119', 'JR', '2002/6/1', @OrderNumber output;

EXEC pOrder_AddItem

EXEC pOrder_AddNew

EXEC pOrder_AddItem

The third person, ContactCode 112, orders all toys plus some other products:

EXEC pOrder_AddNew

EXEC pOrder_AddItem

The fourth person, ContactCode 113, orders one toy:

EXEC pOrder_AddNew


EXEC pOrder_AddItem

In other words, only customers 110 and 111 order all the toys and nothing else. Customer 112 purchases all the toys, as well as some kites. Customer 113 is an error check because she bought only one toy.

At least a couple of methods exist for coding a relational-division query. The original method, proposed by Chris Date, involves using nested correlated subqueries to locate rows in and out of the sets. A more direct method has been popularized by Joe Celko: It involves comparing the row count of the dividend and divisor data sets.

Basically, Celko’s solution is to rephrase the question as "For whom is the number of toys ordered equal to the number of toys available?"

The query is asking two questions. The outer query will group the orders with toys for each contact, and the subquery will count the number of products in the toy product category. The outer query’s HAVING clause will then compare the distinct count of contact products ordered that are toys against the count of products that are toys:

-- Is number of toys ordered...
SELECT Contact.ContactCode
  FROM dbo.Contact
    JOIN dbo.[Order]
      ON Contact.ContactID = [Order].ContactID
    JOIN dbo.OrderDetail
      ON [Order].OrderID = OrderDetail.OrderID
    JOIN dbo.Product
      ON OrderDetail.ProductID = Product.ProductID
    JOIN dbo.ProductCategory
      ON Product.ProductCategoryID = ProductCategory.ProductCategoryID
  WHERE ProductCategory.ProductCategoryName = 'Toy'
  GROUP BY Contact.ContactCode
  HAVING COUNT(DISTINCT Product.ProductCode) =
    -- ...equal to number of toys available?
    (SELECT Count(ProductCode)
       FROM dbo.Product
         JOIN dbo.ProductCategory
           ON Product.ProductCategoryID
              = ProductCategory.ProductCategoryID
       WHERE ProductCategory.ProductCategoryName = 'Toy');

Result:

ContactCode
-----------
110
111
112
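For comparison, the nested correlated subquery approach attributed to Chris Date above can be sketched against a simplified stand-in for the mocked-up orders. The two table variables and the non-toy product code 9999 are assumptions made for illustration; the real OBXKites schema spreads this data across five tables.

```sql
-- Simplified, hypothetical stand-in for the OBXKites tables.
DECLARE @Toy      TABLE (ProductCode CHAR(4) PRIMARY KEY);
DECLARE @Purchase TABLE (ContactCode CHAR(3) NOT NULL,
                         ProductCode CHAR(4) NOT NULL);

INSERT INTO @Toy (ProductCode) VALUES ('1049'), ('1050');

INSERT INTO @Purchase (ContactCode, ProductCode) VALUES
  ('110', '1049'), ('110', '1050'),
  ('111', '1049'), ('111', '1050'), ('111', '1050'),  -- toy 1050 twice
  ('112', '1049'), ('112', '1050'), ('112', '9999'),  -- 9999: assumed non-toy
  ('113', '1049');                                    -- only one toy

-- "Which contacts have no toy that they have not purchased?"
SELECT DISTINCT P.ContactCode
  FROM @Purchase AS P
  WHERE NOT EXISTS
    (SELECT *
       FROM @Toy AS T
       WHERE NOT EXISTS
         (SELECT *
            FROM @Purchase AS P2
            WHERE P2.ContactCode = P.ContactCode
              AND P2.ProductCode = T.ProductCode))
  ORDER BY P.ContactCode;
```

With this data the double negation returns contacts 110, 111, and 112, matching the count-comparison query: it is a division with a remainder, so the extra purchase by 112 does not disqualify that contact.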


Some techniques in the previous query (namely GROUP BY, HAVING, and COUNT()) are explained in the next chapter, "Aggregating Data."

Exact relational division

Exact relational division finds exact matches without any remainder. It takes the basic question of relational division with remainder and tightens the method so that the dividend will have no extra rows that cause a remainder.

In practical terms it means that the example question now asks, "Who has ordered every toy and nothing else?"

If you address this query with a modified form of Joe Celko’s method, the pseudocode becomes "For whom is the number of toys ordered equal to the number of toys available, and also equal to the total number of products ordered?" If a customer has ordered additional products other than toys, then the third part of the question eliminates that customer from the result set.

The SQL code contains two primary changes to the previous query. One, the outer query must find both the number of toys ordered and the number of all products ordered. It does this by finding the toys purchased in a derived table and joining the two data sets. Two, the HAVING clause must be modified to compare the number of toys available with both the number of toys purchased and the number of all products purchased, as follows:

-- Exact Relational Division
-- Is number of all products ordered...
SELECT Contact.ContactCode
  FROM dbo.Contact
    JOIN dbo.[Order]
      ON Contact.ContactID = [Order].ContactID
    JOIN dbo.OrderDetail
      ON [Order].OrderID = OrderDetail.OrderID
    JOIN dbo.Product
      ON OrderDetail.ProductID = Product.ProductID
    JOIN dbo.ProductCategory P1
      ON Product.ProductCategoryID = P1.ProductCategoryID
    JOIN
      -- ...and number of toys ordered...
      (SELECT Contact.ContactCode, Product.ProductCode
         FROM dbo.Contact
           JOIN dbo.[Order]
             ON Contact.ContactID = [Order].ContactID
           JOIN dbo.OrderDetail
             ON [Order].OrderID = OrderDetail.OrderID
           JOIN dbo.Product
             ON OrderDetail.ProductID = Product.ProductID
           JOIN dbo.ProductCategory
             ON Product.ProductCategoryID
                = ProductCategory.ProductCategoryID
         WHERE ProductCategory.ProductCategoryName = 'Toy'
      ) ToysOrdered
      ON Contact.ContactCode = ToysOrdered.ContactCode
  GROUP BY Contact.ContactCode
  HAVING COUNT(DISTINCT Product.ProductCode) =
    -- ...equal to number of toys available?
    (SELECT Count(ProductCode)
       FROM dbo.Product
         JOIN dbo.ProductCategory
           ON Product.ProductCategoryID
              = ProductCategory.ProductCategoryID
       WHERE ProductCategory.ProductCategoryName = 'Toy')
    -- AND equal to the total number of any product ordered?
    AND COUNT(DISTINCT ToysOrdered.ProductCode) =
    (SELECT Count(ProductCode)
       FROM dbo.Product
         JOIN dbo.ProductCategory
           ON Product.ProductCategoryID
              = ProductCategory.ProductCategoryID
       WHERE ProductCategory.ProductCategoryName = 'Toy');

The result is a list of contacts for whom the number of toys purchased (2) and the number of total products purchased (2) both equal the number of toys available (2):

ContactCode
-----------
110
111
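The same exact-division test can be written more compactly when the data sits in fewer tables. The sketch below is an alternative formulation, not the method used in the query above: it replaces the derived table with a conditional COUNT(DISTINCT CASE ...). The table variables, the Category column, and product code 9999 are assumptions made for illustration.

```sql
-- Simplified, hypothetical stand-in for the OBXKites tables.
DECLARE @Product  TABLE (ProductCode CHAR(4) PRIMARY KEY,
                         Category    VARCHAR(10) NOT NULL);
DECLARE @Purchase TABLE (ContactCode CHAR(3) NOT NULL,
                         ProductCode CHAR(4) NOT NULL);

INSERT INTO @Product (ProductCode, Category) VALUES
  ('1049', 'Toy'), ('1050', 'Toy'), ('9999', 'Kite');  -- 9999 is assumed

INSERT INTO @Purchase (ContactCode, ProductCode) VALUES
  ('110', '1049'), ('110', '1050'),
  ('111', '1049'), ('111', '1050'), ('111', '1050'),
  ('112', '1049'), ('112', '1050'), ('112', '9999'),
  ('113', '1049');

SELECT Pu.ContactCode
  FROM @Purchase AS Pu
    JOIN @Product AS Pr
      ON Pr.ProductCode = Pu.ProductCode
  GROUP BY Pu.ContactCode
  -- distinct toys ordered = toys available
  HAVING COUNT(DISTINCT CASE WHEN Pr.Category = 'Toy'
                             THEN Pr.ProductCode END)
           = (SELECT COUNT(*) FROM @Product WHERE Category = 'Toy')
  -- distinct products of any kind ordered = toys available
     AND COUNT(DISTINCT Pr.ProductCode)
           = (SELECT COUNT(*) FROM @Product WHERE Category = 'Toy');
```

Contact 112 is eliminated by the second condition (three distinct products against two toys), and contact 113 by the first, leaving 110 and 111.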

Composable SQL

Composable SQL, also called select from output or DML table source (in SQL Server BOL), is the ability to pass data from an insert, update, or delete’s output clause to an outer query. This is a very powerful new way to build subqueries, and it can significantly reduce the amount of code and improve the performance of code that needs to write to one table, and then, based on that write, write to another table.

To track the evolution of composable SQL (illustrated in Figure 11-3), SQL Server has always had DML triggers, which include the inserted and deleted virtual tables. Essentially, these are a view to the DML modification that fired the triggers. The deleted table holds the before image of the data, and the inserted table holds the after image.

Since SQL Server 2005, any DML statement that modifies data (INSERT, UPDATE, DELETE, and, as of SQL Server 2008, MERGE) can have an optional OUTPUT clause that can SELECT from the virtual inserted and deleted tables. The OUTPUT clause can pass the data to the client or insert it directly into a table.
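Both destinations of the OUTPUT clause can be seen in a small sketch. The table variables and their rows are hypothetical stand-ins, chosen so that the batch runs without any sample database.

```sql
-- Hypothetical stand-in tables; names and rows are illustrative only.
DECLARE @Department TABLE (Name      VARCHAR(50) PRIMARY KEY,
                           GroupName VARCHAR(50) NOT NULL);
DECLARE @Audit      TABLE (oldvalue  VARCHAR(50),
                           newvalue  VARCHAR(50));

INSERT INTO @Department (Name, GroupName) VALUES
  ('Sales', 'Sales and Marketing'),
  ('Engineering', 'Research and Development');

-- 1. OUTPUT returns the before and after images to the client.
UPDATE @Department
  SET GroupName = 'Output Test 1'
  OUTPUT Deleted.GroupName AS oldvalue,
         Inserted.GroupName AS newvalue
  WHERE Name = 'Sales';

-- 2. OUTPUT ... INTO writes the same images directly into a table.
UPDATE @Department
  SET GroupName = 'Output Test 2'
  OUTPUT Deleted.GroupName, Inserted.GroupName
    INTO @Audit (oldvalue, newvalue)
  WHERE Name = 'Sales';

-- One audit row: 'Output Test 1' -> 'Output Test 2'
SELECT oldvalue, newvalue FROM @Audit;
```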


The inserted and deleted virtual tables are covered in Chapter 26, "Creating DML Triggers," and the output clause is detailed in Chapter 15, "Modifying Data."

In SQL Server 2008, composable SQL can place the DML statement and its OUTPUT clause in a subquery and then select from that subquery. The primary benefit of composable SQL, as opposed to just using the OUTPUT clause to insert into a table, is that OUTPUT clause data may be further filtered and manipulated by the outer query.

FIGURE 11-3

Composable SQL is an evolution of the inserted and deleted tables.

[Figure labels: SQL 2000, SQL 2005, SQL 2008; DML (Insert, Update, Delete, Merge); Inserted, Deleted; Output; Select From Output; Insert, Select From; client, table variable, temp tables, tables, subquery.]

The following script first creates a table and then has a composable SQL query. The subquery has an UPDATE command with an OUTPUT clause. The OUTPUT clause passes the oldvalue and newvalue columns to the outer query. The outer query filters out TestData and then inserts it into the CompSQL table:

CREATE TABLE CompSQL (oldvalue varchar(50), newvalue varchar(50));

INSERT INTO CompSQL (oldvalue, newvalue)
  SELECT oldvalue, newvalue
    FROM
      (UPDATE HumanResources.Department
         SET GroupName = 'Composable SQL Test'
         OUTPUT Deleted.GroupName as 'oldvalue',
                Inserted.GroupName as 'newvalue'
         WHERE Name = 'Sales') Q;

SELECT oldvalue, newvalue
  FROM CompSQL
  WHERE newvalue <> 'TestData';

Result:

oldvalue             newvalue
-------------------- --------------------
Sales and Marketing  Composable SQL Test
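The filtering benefit described earlier is easiest to see when the WHERE clause sits in the outer query itself. The following is a minimal sketch under stated assumptions: the temporary tables and their rows are hypothetical, and it assumes temporary tables are acceptable both as the table being modified and as the insert target (neither has triggers or foreign keys).

```sql
-- Hypothetical stand-in tables; names and rows are illustrative only.
CREATE TABLE #Department (Name      VARCHAR(50) PRIMARY KEY,
                          GroupName VARCHAR(50) NOT NULL);
CREATE TABLE #Audit      (Name      VARCHAR(50),
                          oldvalue  VARCHAR(50),
                          newvalue  VARCHAR(50));

INSERT INTO #Department (Name, GroupName) VALUES
  ('Sales', 'Sales and Marketing'),
  ('Engineering', 'Research and Development');

-- The UPDATE touches two rows, but only the row whose value really
-- changed passes the outer WHERE filter and is written to #Audit.
INSERT INTO #Audit (Name, oldvalue, newvalue)
  SELECT Q.Name, Q.oldvalue, Q.newvalue
    FROM (UPDATE #Department
            SET GroupName = 'Sales and Marketing'
            OUTPUT Inserted.Name,
                   Deleted.GroupName  AS oldvalue,
                   Inserted.GroupName AS newvalue
            WHERE Name IN ('Sales', 'Engineering')) AS Q
    WHERE Q.oldvalue <> Q.newvalue;

-- Expected to hold only the Engineering row.
SELECT Name, oldvalue, newvalue FROM #Audit;

DROP TABLE #Audit;
DROP TABLE #Department;
```

A plain OUTPUT ... INTO would have logged both updated rows; composing the DML as a table source lets the outer query discard the no-op change before it is written.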

Note several restrictions on composable SQL:

■ The update DML in the subquery must modify a local table and cannot be a partitioned view.

■ The composable SQL query cannot include nested composable SQL, aggregate functions, subqueries, ranking functions, full-text features, user-defined functions that perform data access, or the textptr function.

■ The target table must be a local base table with no triggers, no foreign keys, no merge replication, and no updatable subscriptions for transactional replication.

Summary

While the basic nuts and bolts of subqueries may appear simple, they open a world of possibilities, as they enable you to build complex nested queries that pull and twist data into the exact shape that is needed to solve a difficult problem. As you continue to play with subqueries, I think you’ll agree that herein lies the power of SQL, and if you’re still developing primarily with the GUI tools, this might provide the catalyst to move you to developing SQL using the query text editor.

A few key points from this chapter:

■ Simple subqueries are executed once and the results are inserted into the outer query.

■ Subqueries can be used in nearly every portion of the query, not just as derived tables.

■ Correlated subqueries refer to the outer query, so they can’t be executed by themselves. Conceptually, the outer query is executed and the results are passed to the correlated subquery, which is executed once for every row in the outer query.

■ You don’t need to memorize how to code relational division; just remember that if you need to join not on any row but every row, then relational division is the set-based solution to do the job.

■ Composable SQL is useful if you need to write to multiple tables from a single transaction, but there are plenty of limitations.
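The difference between a simple and a correlated subquery, as summarized in the key points above, can be illustrated with a short sketch. The table variable and its rows are hypothetical.

```sql
-- Hypothetical data; names and rows are illustrative only.
DECLARE @Sale TABLE (Region VARCHAR(10) NOT NULL, Amount INT NOT NULL);

INSERT INTO @Sale (Region, Amount) VALUES
  ('North', 10), ('North', 30), ('South', 5), ('South', 7);

-- Simple subquery: the inner SELECT could run on its own, and its
-- single result (the overall average, 13) feeds the outer query.
-- Returns (North, 30).
SELECT Region, Amount
  FROM @Sale
  WHERE Amount > (SELECT AVG(Amount) FROM @Sale);

-- Correlated subquery: the inner SELECT refers to S.Region from the
-- outer query, so it cannot run by itself; conceptually it is
-- evaluated once per outer row. Returns (North, 30) and (South, 7).
SELECT S.Region, S.Amount
  FROM @Sale AS S
  WHERE S.Amount > (SELECT AVG(S2.Amount)
                      FROM @Sale AS S2
                      WHERE S2.Region = S.Region);
```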

The previous chapters established the foundation for working with SQL, covering the SELECT statement, expressions, joins, and unions, while this chapter expanded the SELECT with powerful subqueries and CTEs. If you’re reading through this book sequentially, congratulations: you are now over the hump of learning SQL. If you can master relational algebra and subqueries, the rest is a piece of cake.

The next chapter continues to describe the repertoire of data-retrieval techniques with aggregation queries, where using subqueries pays off.


Aggregating Data

IN THIS CHAPTER

Calculating sums and averages

Statistical analysis

Grouping data within a query

Solving aggravating aggregation problems

Generating cumulative totals

Building crosstab queries with the case, pivot, and dynamic methods

The Information Architecture Principle in Chapter 2 implies that information, not just data, is an asset. Turning raw lists of keys and data into useful information often requires summarizing data and grouping it in meaningful ways. While summarization and analysis can certainly be performed with other tools, such as Reporting Services, Analysis Services, or an external tool such as SAS, SQL is a set-based language, and a fair amount of summarizing and grouping can be performed very well within the SQL SELECT statement.

SQL excels at calculating sums, max values, and averages for the entire data set or for segments of data. In addition, SQL queries can create cross-tabulations, commonly known as pivot tables.

Simple Aggregations

The premise of an aggregate query is that instead of returning all the selected rows, SQL Server returns a single row of computed values that summarizes the original data set, as illustrated in Figure 12-1. More complex aggregate queries can slice the selected rows into subsets and then summarize every subset. The types of aggregate calculations range from totaling the data to performing basic statistical operations.

It’s important to note that in the logical order of the SQL query, the aggregate functions (indicated by the Summing function in the diagram) occur following the FROM clause and the WHERE filters. This means that the data can be assembled and filtered prior to being summarized without needing to use a subquery, although sometimes a subquery is still needed to build more complex aggregate queries (as detailed later in the "Aggravating Queries" section in this chapter).
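A small sketch shows the logical order at work: the WHERE filter removes rows before the aggregates see them, so no subquery is required. The table variable is a hypothetical stand-in that loosely mirrors the RawData columns; its rows are invented for illustration.

```sql
-- Hypothetical rows; only the column names echo the RawData table.
DECLARE @RawData TABLE (Region   VARCHAR(10) NOT NULL,
                        Category CHAR(1)     NOT NULL,
                        Amount   INT         NULL);

INSERT INTO @RawData (Region, Category, Amount) VALUES
  ('South', 'X', 10), ('South', 'Y', 20),
  ('North', 'X', 5),  ('North', 'Y', NULL);

-- FROM and WHERE run first, leaving the two South rows;
-- the aggregates then summarize only those rows.
SELECT COUNT(*)    AS RowsSummarized,   -- 2
       SUM(Amount) AS Total             -- 30
  FROM @RawData
  WHERE Region = 'South';
```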


What’s New with Query Aggregations?

Microsoft continues to evolve T-SQL’s ability to aggregate data. SQL Server 2005 included the capability to roll your own aggregate functions using the .NET CLR. SQL Server 2008 expands this feature by removing the 8,000-byte limit on intermediate results for CLR user-defined aggregate functions.

The most significant enhancement to query aggregation in SQL Server 2008 is the ability to use grouping sets to further define the CUBE and ROLLUP functions with the GROUP BY clause.

WITH ROLLUP and WITH CUBE have been deprecated, as they are non-ISO-compliant syntax for special cases of the ISO-compliant syntax. They are replaced with the new, more powerful syntax for ROLLUP and CUBE.
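As a brief sketch of the newer syntax mentioned above, the following batch uses a hypothetical table variable (its rows are invented for illustration) to show the ISO-style ROLLUP and an explicit GROUPING SETS list.

```sql
-- Hypothetical rows; names are illustrative only.
DECLARE @RawData TABLE (Region   VARCHAR(10) NOT NULL,
                        Category CHAR(1)     NOT NULL,
                        Amount   INT         NULL);

INSERT INTO @RawData (Region, Category, Amount) VALUES
  ('South', 'X', 10), ('South', 'Y', 20),
  ('North', 'X', 5),  ('North', 'Y', 15);

-- ISO-compliant ROLLUP: detail rows, a subtotal per Region, and a
-- grand total. The deprecated form of the same request is
-- GROUP BY Region, Category WITH ROLLUP.
SELECT Region, Category, SUM(Amount) AS Total
  FROM @RawData
  GROUP BY ROLLUP (Region, Category);

-- GROUPING SETS: name exactly which groupings are wanted; here,
-- totals by Region, totals by Category, and a grand total ().
SELECT Region, Category, SUM(Amount) AS Total
  FROM @RawData
  GROUP BY GROUPING SETS ((Region), (Category), ());
```

In both results, a NULL in Region or Category marks a row that summarizes across that column.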

FIGURE 12-1

The aggregate function produces a single row result from a data set.

[Figure labels: Data Source(s); From; Where; Col(s), Expr(s); Summing; Single Row.]

Basic aggregations

SQL includes a set of aggregate functions, listed in Table 12-1, which can be used as expressions in the SELECT statement to return summary data.

ON the WEBSITE

The code examples for this chapter use a small table called RawData. The code to create and populate this data set is at the beginning of the chapter’s script. You can download the script from www.SQLServerBible.com.

CREATE TABLE RawData (
  RawDataID INT NOT NULL IDENTITY PRIMARY KEY,
  Region VARCHAR(10) NOT NULL,
  Category CHAR(1) NOT NULL,
  Amount INT NULL,
  SalesDate Date NOT NULL
  );


TABLE 12-1

Basic Aggregate Functions

Aggregate Function | Data Type Supported | Description
sum() | Numeric | Totals all the non-null values in the column.
avg() | Numeric | Averages all the non-null values in the column. The result has the same data type as the input, so the input is often converted to a higher precision, such as avg(cast(col as float)).
min() | Numeric, string, datetime | Returns the smallest number or the first datetime or the first string according to the current collation from the column.
max() | Numeric, string, datetime | Returns the largest number or the last datetime or the last string according to the current collation from the column.
count[_big](*) | Any data type (row-based) | Performs a simple count of all the rows in the result set up to 2,147,483,647. The count_big() variation uses the bigint data type and can handle up to 2^63-1 rows.
count[_big]([distinct] column) | Any data type (row-based) | Performs a simple count of all the rows with non-null values in the column in the result set up to 2,147,483,647. The distinct option eliminates duplicate values. Will not count blobs.
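The null handling and data type behavior described in Table 12-1 can be checked with a four-row sketch. The table variable and its values are hypothetical.

```sql
-- Hypothetical values: 1, 2, 2, and one NULL.
DECLARE @T TABLE (Amount INT NULL);

INSERT INTO @T (Amount) VALUES (1), (2), (2), (NULL);

SELECT COUNT(*)                   AS AllRows,         -- 4: counts every row
       COUNT(Amount)              AS NonNullRows,     -- 3: skips the NULL
       COUNT(DISTINCT Amount)     AS DistinctValues,  -- 2: values 1 and 2
       SUM(Amount)                AS Total,           -- 5
       AVG(Amount)                AS IntAvg,          -- 1: int in, int out (5/3)
       AVG(CAST(Amount AS FLOAT)) AS FloatAvg,        -- about 1.67
       MIN(Amount)                AS Smallest,        -- 1
       MAX(Amount)                AS Largest          -- 2
  FROM @T;
```

Note that AVG divides by the count of non-null values (3), not the row count (4), and that SQL Server reports a warning that null values were eliminated by the aggregates.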

This simple aggregate query counts the number of rows in the table and totals the Amount column. In lieu of returning the actual rows from the RawData table, the query returns the summary row with the row count and total. Therefore, even though there are 24 rows in the RawData table, the result is a single row:

SELECT COUNT(*) AS Count,

SUM(Amount) AS [Sum]

FROM RawData;

Result:
