Microsoft SQL Server 2008 Study Guide, Part 29


INSERT dbo.One(OnePK, Thing1) VALUES (2, 'New Thing');
INSERT dbo.One(OnePK, Thing1) VALUES (3, 'Red Thing');
INSERT dbo.One(OnePK, Thing1) VALUES (4, 'Blue Thing');

INSERT dbo.Two(TwoPK, OnePK, Thing2) VALUES (1, 0, 'Plane');
INSERT dbo.Two(TwoPK, OnePK, Thing2) VALUES (2, 2, 'Train');
INSERT dbo.Two(TwoPK, OnePK, Thing2) VALUES (3, 3, 'Car');
INSERT dbo.Two(TwoPK, OnePK, Thing2) VALUES (4, NULL, 'Cycle');

FIGURE 10-9
The Red Thing Blue Thing example has data to view every type of join.
[Figure: Table One holds Old Thing, New Thing, Red Thing, and Blue Thing; Table Two holds Plane, Train, Car, and Cycle.]

An inner join between table One and table Two will return only the two matching rows:

SELECT Thing1, Thing2
  FROM dbo.One
    INNER JOIN dbo.Two
      ON One.OnePK = Two.OnePK;

Result:

Thing1       Thing2
------------ ------------
New Thing    Train
Red Thing    Car

A left outer join will extend the inner join and include the rows from table One without a match:

SELECT Thing1, Thing2
  FROM dbo.One
    LEFT OUTER JOIN dbo.Two
      ON One.OnePK = Two.OnePK;

All the rows are now returned from table One, but two rows are still missing from table Two:

Thing1       Thing2
------------ ------------
Old Thing    NULL
New Thing    Train
Red Thing    Car
Blue Thing   NULL

A full outer join will retrieve every row from both tables, regardless of a match between the tables:

SELECT Thing1, Thing2
  FROM dbo.One
    FULL OUTER JOIN dbo.Two
      ON One.OnePK = Two.OnePK;

The plane and cycle from table Two are now listed along with every row from table One (row order is not guaranteed without an ORDER BY):

Thing1       Thing2
------------ ------------
Old Thing    NULL
New Thing    Train
Red Thing    Car
Blue Thing   NULL
NULL         Plane
NULL         Cycle

As this example shows, full outer joins are an excellent tool for finding all the data, even bad data. Set difference queries, explored later in this chapter, build on outer joins to zero in on bad data.
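As a minimal sketch of using a full outer join to surface rows that have no partner on either side, the following batch rebuilds the sample data in table variables and keeps only the unmatched rows. The names @One and @Two, and the 'Old Thing' row with OnePK 1, are assumptions made so the batch runs on its own; they stand in for dbo.One and dbo.Two.

```sql
-- Assumed stand-ins for dbo.One and dbo.Two so the batch is self-contained.
DECLARE @One TABLE (OnePK INT PRIMARY KEY, Thing1 VARCHAR(15));
DECLARE @Two TABLE (TwoPK INT PRIMARY KEY, OnePK INT NULL, Thing2 VARCHAR(15));

INSERT @One (OnePK, Thing1)
  VALUES (1, 'Old Thing'), (2, 'New Thing'), (3, 'Red Thing'), (4, 'Blue Thing');
INSERT @Two (TwoPK, OnePK, Thing2)
  VALUES (1, 0, 'Plane'), (2, 2, 'Train'), (3, 3, 'Car'), (4, NULL, 'Cycle');

-- The full outer join keeps every row from both sides;
-- the WHERE clause then discards the rows that found a match.
SELECT O.Thing1, T.Thing2
  FROM @One AS O
    FULL OUTER JOIN @Two AS T
      ON O.OnePK = T.OnePK
  WHERE O.OnePK IS NULL   -- row exists only in @Two
     OR T.TwoPK IS NULL;  -- row exists only in @One
```

With this sample data, the query should list Old Thing and Blue Thing (no row in @Two) together with Plane and Cycle (no valid row in @One). Testing each side's primary key for NULL is a safe way to detect a missing match because a primary key can never be NULL in a real row.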

Placing the conditions within outer joins

When working with inner joins, a condition has the same effect whether it's in the JOIN clause or the WHERE clause, but that's not the case with outer joins:

■ When the condition is in the JOIN clause, SQL Server includes all rows from the outer table and then uses the condition to include rows from the second table.

■ When the restriction is placed in the WHERE clause, the join is performed and then the WHERE clause is applied to the joined rows.

The following two queries demonstrate the effect of the placement of the condition.

In the first query, the left outer join includes all rows from table One and then joins those rows from table Two where OnePK is equal in both tables and Thing1's value is New Thing. The result is all the rows from table One, and rows from table Two that meet both join restrictions:

SELECT Thing1, Thing2
  FROM dbo.One
    LEFT OUTER JOIN dbo.Two
      ON One.OnePK = Two.OnePK
     AND One.Thing1 = 'New Thing';

Result:

Thing1       Thing2
------------ ------------
Old Thing    NULL
New Thing    Train
Red Thing    NULL
Blue Thing   NULL

The second query first performs the left outer join, producing four rows as in the previous query but matched without the AND condition. The WHERE clause then restricts that result to those rows where Thing1 is equal to New Thing. The net effect is the same as when an inner join was used (but it might take more execution time):

SELECT Thing1, Thing2
  FROM dbo.One
    LEFT OUTER JOIN dbo.Two
      ON One.OnePK = Two.OnePK
  WHERE One.Thing1 = 'New Thing';

Result:

Thing1       Thing2
------------ ------------
New Thing    Train

Multiple outer joins

Coding a query with multiple outer joins can be tricky. Typically, the order of data sources in the FROM clause doesn't matter, but here it does. The key is to code them in a sequential chain. Think through it this way:

1. Grab all the customers regardless of whether they've placed any orders.
2. Then grab all the orders regardless of whether they've shipped.
3. Then grab all the ship details.

When chaining multiple outer joins, stick to left outer joins, as mixing left and right outer joins becomes very confusing very fast. Be sure to unit test the query with a small sample set of data to ensure that the outer join chain is correct.
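The customer, order, and ship-detail chain described above can be sketched as follows. The three table variables and all of their columns are hypothetical (they are not part of any sample database); they exist only to give the chain a small data set to test against.

```sql
-- Hypothetical tables; none of these names come from a sample database.
DECLARE @Customer TABLE (CustomerID INT PRIMARY KEY, CustomerName VARCHAR(20));
DECLARE @Orders TABLE (OrderID INT PRIMARY KEY, CustomerID INT);
DECLARE @ShipDetail TABLE (ShipID INT PRIMARY KEY, OrderID INT, ShipDate DATE);

INSERT @Customer VALUES (1, 'Has shipped order'), (2, 'Has open order'), (3, 'No orders');
INSERT @Orders VALUES (10, 1), (20, 2);
INSERT @ShipDetail VALUES (100, 10, '20080115');

-- Each LEFT OUTER JOIN extends the chain one step to the right,
-- so a customer survives even when the later links are missing.
SELECT C.CustomerName, O.OrderID, S.ShipDate
  FROM @Customer AS C
    LEFT OUTER JOIN @Orders AS O
      ON C.CustomerID = O.CustomerID
    LEFT OUTER JOIN @ShipDetail AS S
      ON O.OrderID = S.OrderID
  ORDER BY C.CustomerID;
```

All three customers should come back: the first with an order and a ship date, the second with an order and a NULL ship date, and the third with NULLs in both columns. Changing the second join to an inner join would drop the last two customers, which is the kind of mistake a small test set makes visible.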

Self-Joins

A self-join is a join that refers back to the same table. This type of unary relationship is often used to extract data from a reflexive (also called a recursive) relationship, such as organizational charts (employee to boss). Think of a self-join as a table being joined with a temporary copy of itself.
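To illustrate the employee-to-boss case, here is a minimal sketch that joins a table to a second reference of itself. The @Employee table variable and its columns are hypothetical and are used only for this illustration.

```sql
-- Hypothetical organizational chart held in a single table.
DECLARE @Employee TABLE (
  EmployeeID   INT PRIMARY KEY,
  EmployeeName VARCHAR(20),
  BossID       INT NULL);  -- points back at EmployeeID in the same table

INSERT @Employee VALUES (1, 'Director', NULL), (2, 'Manager', 1), (3, 'Analyst', 2);

-- E is the employee row; B is the same table read again as the boss.
-- The left outer join keeps the top of the chart, who has no boss.
SELECT E.EmployeeName AS Employee, B.EmployeeName AS Boss
  FROM @Employee AS E
    LEFT OUTER JOIN @Employee AS B
      ON E.BossID = B.EmployeeID
  ORDER BY E.EmployeeID;
```

The two aliases are what make the self-join possible: without them, SQL Server could not tell which reference to the table a column name belongs to.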

The Family sample database uses two self-joins between a child and his or her parents, as shown in the database diagram in Figure 10-10. The mothers and fathers are also people, of course, and are listed in the same table. They link back to their parents, and so on. The sample database is populated with five fictitious generations that can be used for sample queries.

FIGURE 10-10
The database diagram of the Family database includes two unary relationships (children to parents) on the left and a many-to-many unary relationship (husband to wife) on the right.

The key to constructing a self-join is to include a second reference to the table using a table alias. Once the table is available twice to the SELECT statement, the self-join functions much like any other join.

In the following example, which switches over to the Family sample database, the dbo.Person table is referenced a second time using the table alias Mother in order to locate the children of Audry Halloway:

USE Family;

SELECT Child.PersonID, Child.FirstName,
    Child.MotherID, Mother.PersonID
  FROM dbo.Person AS Child
    INNER JOIN dbo.Person AS Mother
      ON Child.MotherID = Mother.PersonID
  WHERE Mother.LastName = 'Halloway'
    AND Mother.FirstName = 'Audry';

The query uses the Person table twice. The first reference (aliased as Child) is joined with the second reference (aliased as Mother), which is restricted by the WHERE clause to only Audry Halloway. Only the rows with a MotherID that points back to Audry will be included in the inner join. Audry's PersonID is 6 and her children are as follows:

PersonID FirstName MotherID PersonID

While the previous query adequately demonstrates a self-join, it would be more useful if the mother weren't hard-coded in the WHERE clause, and if more information were provided about each birth, as follows:

SELECT CONVERT(NVARCHAR(15), C.DateOfBirth, 1) AS Date,
    C.FirstName AS Name, C.Gender AS G,
    ISNULL(F.FirstName + ' ' + F.LastName, ' * unknown *') AS Father,
    M.FirstName + ' ' + M.LastName AS Mother
  FROM dbo.Person AS C
    LEFT OUTER JOIN dbo.Person AS F
      ON C.FatherID = F.PersonID
    INNER JOIN dbo.Person AS M
      ON C.MotherID = M.PersonID
  ORDER BY C.DateOfBirth;

This query makes three references to the Person table: the child, the father, and the mother, with mnemonic one-letter aliases. The result is a better listing:

Date      Name     G  Father            Mother
--------  -------  -  ----------------  ------------------
5/19/22   James    M  James Halloway    Kelly Halloway
8/05/28   Audry    F  Bryan Miller      Karen Miller
8/19/51   Melanie  F  James Halloway    Audry Halloway
8/30/53   James    M  James Halloway    Audry Halloway
2/12/58   Dara     F  James Halloway    Audry Halloway
3/13/61   Corwin   M  James Halloway    Audry Halloway
3/13/65   Cameron  M  Richard Campbell  Elizabeth Campbell
...

For more ideas about working with hierarchies and self-joins, refer to Chapter 17, "Traversing Hierarchies."

Cross (Unrestricted) Joins

The cross join, also called an unrestricted join, is a pure relational algebra multiplication of the two source tables. Without a join condition restricting the result set, the result set includes every possible combination of rows from the data sources. Each row in data set one is matched with every row in data set two. For example, if the first data source has five rows and the second data source has four rows, a cross join between them would result in 20 rows. This type of result set is referred to as a Cartesian product.

Using the One/Two sample tables, a cross join is constructed in Management Studio by omitting the join condition between the two tables, as shown in Figure 10-11.

FIGURE 10-11
A graphical representation of a cross join is simply two tables without a join condition.

In code, this type of join is specified by the keywords CROSS JOIN and the lack of an ON condition:

SELECT Thing1, Thing2
  FROM dbo.One
    CROSS JOIN dbo.Two;

The result of a join without restriction is that every row in table One matches with every row from table Two:

Thing1       Thing2
------------ ------------
Old Thing    Plane
New Thing    Plane
Red Thing    Plane
Blue Thing   Plane
Old Thing    Train
New Thing    Train
Red Thing    Train
Blue Thing   Train
Old Thing    Car
New Thing    Car
Red Thing    Car
Blue Thing   Car
Old Thing    Cycle
New Thing    Cycle
Red Thing    Cycle
Blue Thing   Cycle

Sometimes cross joins are the result of someone forgetting to draw the join in a graphical-query tool; however, they are useful for populating databases with sample data, or for creating empty "pigeonhole" rows for population during a procedure.
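As a small sketch of the sample-data use case, a cross join can expand two short lists into every combination. The @Color and @Size table variables and their columns are hypothetical and exist only for this illustration.

```sql
-- Hypothetical seed lists for generating sample rows.
DECLARE @Color TABLE (ColorName VARCHAR(10));
DECLARE @Size  TABLE (SizeName VARCHAR(10));

INSERT @Color VALUES ('Red'), ('Blue'), ('Green');
INSERT @Size  VALUES ('Small'), ('Large');

-- No ON clause: every color is paired with every size.
SELECT C.ColorName + ' ' + S.SizeName AS SampleProduct
  FROM @Color AS C
    CROSS JOIN @Size AS S;
```

Three colors multiplied by two sizes should yield six sample rows. Adding a third seed list with another CROSS JOIN multiplies the row count again, which is why cross joins need care on tables of any real size.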

Understanding how a cross join multiplies data is also useful when studying relational division, the inverse of relational multiplication. Relational division requires subqueries, so it's explained in the next chapter.

Exotic Joins

Nearly all joins are based on a condition of equality between the primary key of a primary table and the foreign key of a secondary table, which is why the inner join is sometimes called an equi-join. Although it's commonplace to base a join on a single equal condition, it is not a requirement. The condition between the two columns is not necessarily equal, nor is the join limited to one condition.

The ON condition of the join is in reality nothing more than a WHERE condition restricting the product of the two joined data sets. WHERE-clause conditions may be very flexible and powerful, and the same is true of join conditions. This understanding of the ON condition enables the use of three powerful techniques: Θ (theta) joins, multiple-condition joins, and non-key joins.

Multiple-condition joins

If a join is nothing more than a condition between two data sets, then it makes sense that multiple conditions are possible at the join. In fact, multiple-condition joins and Θ joins go hand-in-hand. Without the ability to use multiple-condition joins, Θ joins would be of little value.

If the database schema uses natural primary keys, then there are probably tables with composite primary keys, which means queries must use multiple-condition joins.
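A minimal sketch of a join on a composite key follows. The @OrderLine and @LineShipment table variables, along with their columns and values, are hypothetical; they are not taken from a sample database.

```sql
-- Hypothetical tables keyed by the pair (OrderNumber, LineNumber).
DECLARE @OrderLine TABLE (
  OrderNumber INT,
  LineNumber  INT,
  ProductCode VARCHAR(10),
  PRIMARY KEY (OrderNumber, LineNumber));
DECLARE @LineShipment TABLE (
  OrderNumber INT,
  LineNumber  INT,
  ShippedQty  INT);

INSERT @OrderLine VALUES (1001, 1, 'KITE'), (1001, 2, 'STRING'), (1002, 1, 'KITE');
INSERT @LineShipment VALUES (1001, 1, 5), (1002, 1, 2);

-- Both halves of the composite key must appear in the ON clause.
SELECT OL.OrderNumber, OL.LineNumber, OL.ProductCode, LS.ShippedQty
  FROM @OrderLine AS OL
    INNER JOIN @LineShipment AS LS
      ON OL.OrderNumber = LS.OrderNumber
     AND OL.LineNumber  = LS.LineNumber;
```

Joining on OrderNumber alone would pair the shipment for line 1 of order 1001 with line 2 as well, so leaving out part of a composite key silently produces extra rows rather than an error.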

Join conditions can refer to any table in the FROM clause, enabling interesting three-way joins:

FROM A
  INNER JOIN B
    ON A.col = B.col
  INNER JOIN C
    ON B.col = C.col
   AND A.col = C.col;

The first query in the previous section, "Placing the Conditions within Outer Joins," was a multiple-condition join.

Θ (theta) joins

A theta join (depicted throughout as Θ) is a join based on a non-equal ON condition. In relational theory, conditional operators (=, >, <, >=, <=, <>) are called Θ operators. While the equals condition is technically a Θ operator, it is commonly used, so only joins with conditions other than equal are referred to as Θ joins.

The Θ condition may be set within Management Studio's Query Designer using the join Properties dialog, as previously shown in Figure 10-7.
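As a sketch of a Θ join written in code, the following batch assigns each product to a price band by comparing a price against a range instead of testing for equality. The @Product and @PriceBand table variables and their contents are hypothetical.

```sql
-- Hypothetical tables; the join uses >= and < rather than =.
DECLARE @Product   TABLE (ProductName VARCHAR(20), Price MONEY);
DECLARE @PriceBand TABLE (BandName VARCHAR(10), LowPrice MONEY, HighPrice MONEY);

INSERT @Product   VALUES ('Kite', 12.00), ('Reel', 45.00), ('Glider', 150.00);
INSERT @PriceBand VALUES ('Budget', 0, 25), ('Standard', 25, 100), ('Premium', 100, 1000);

-- A product matches the one band whose range contains its price.
SELECT P.ProductName, P.Price, B.BandName
  FROM @Product AS P
    INNER JOIN @PriceBand AS B
      ON P.Price >= B.LowPrice
     AND P.Price <  B.HighPrice;
```

Because the bands are half-open ranges (low bound included, high bound excluded), each price falls into exactly one band. This is also a multiple-condition join, since two non-equal conditions work together in the ON clause.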

Non-key joins

Joins are not limited to primary and foreign keys. The join can match a row in one data source with a row in another data source using any column, as long as the columns share compatible data types and the data match.

For example, an inventory allocation system would use a non-key join to find products that are expected to arrive from the supplier before the customer's required ship date. A non-key join between the PurchaseOrder and OrderDetail tables with a Θ condition between PO.DateExpected and OD.DateRequired will filter the join to those products that can be allocated to the customer's orders.

The following code demonstrates the non-key join (this is not in a sample database):

SELECT OD.OrderID, OD.ProductID, PO.POID
  FROM OrderDetail AS OD
    INNER JOIN PurchaseOrder AS PO
      ON OD.ProductID = PO.ProductID
     AND OD.DateRequired > PO.DateExpected;

When working with inner joins, non-key join conditions can be placed in the WHERE clause or in the JOIN. Because the conditions compare similar values between two joined tables, I often place these conditions in the JOIN portion of the FROM clause, rather than the WHERE clause. The critical difference depends on whether you view the conditions as a part of creating the record set upon which the rest of the SQL SELECT statement is acting, or as a filtering task that follows the FROM clause. Either way, the query-optimization plan is identical, so use the method that is most readable and seems most logical to you. Note that when constructing outer joins, the placement of the condition in the JOIN or in the WHERE clause yields different results, as explained earlier in the section "Placing the Conditions within Outer Joins."

Asking the question, "Who are twins?" of the Family sample database uses all three exotic join techniques in the join between person and twin. The join contains three conditions. The Person.PersonID <> Twin.PersonID condition is a Θ join that prevents a person from being considered his or her own twin. The join condition on MotherID, while a foreign key, is nonstandard because it is being joined with another foreign key. The DateOfBirth condition is definitely a non-key join condition:

SELECT Person.FirstName + ' ' + Person.LastName AS Person,
    Twin.FirstName + ' ' + Twin.LastName AS Twin,
    Person.DateOfBirth
  FROM dbo.Person
    INNER JOIN dbo.Person AS Twin
      ON Person.PersonID <> Twin.PersonID
     AND Person.MotherID = Twin.MotherID
     AND Person.DateOfBirth = Twin.DateOfBirth;

The following is the same query, this time with the exotic join condition moved to the WHERE clause. Not surprisingly, SQL Server's Query Optimizer produces the exact same query execution plan for each query:

SELECT Person.FirstName + ' ' + Person.LastName AS Person,
    Twin.FirstName + ' ' + Twin.LastName AS Twin,
    Person.DateOfBirth
  FROM dbo.Person
    INNER JOIN dbo.Person AS Twin
      ON Person.MotherID = Twin.MotherID
     AND Person.DateOfBirth = Twin.DateOfBirth
  WHERE Person.PersonID <> Twin.PersonID;

Result:

Person          Twin            DateOfBirth
--------------- --------------- -----------------------
Abbie Halloway  Allie Halloway  1979-010-14 00:00:00.000
Allie Halloway  Abbie Halloway  1979-010-14 00:00:00.000

The difficult query scenarios at the end of the next chapter also demonstrate exotic joins, which are often used with subqueries.

Set Difference Queries

A query type that's useful for analyzing the correlation between two data sets is a set difference query, sometimes called a left (or right) anti-semi join, which finds the difference between the two data sets based on the conditions of the join. In relational algebra terms, it removes the divisor from the dividend, leaving the difference. This type of query is the inverse of an inner join. Informally, it's called a find unmatched rows query.

Set difference queries are great for locating out-of-place data or data that doesn't match, such as rows that are in data set one but not in data set two (see Figure 10-12).

FIGURE 10-12
The set difference query finds data that is outside the intersection of the two data sets.
[Figure: Table One (Old Thing, New Thing, Red Thing, Blue Thing) and Table Two (Plane, Train, Car, Cycle); the region of each table outside the intersection is labeled Set Difference.]

Left set difference query

A left set difference query finds all the rows on the left side of the join without a match on the right side of the join.

Using the One and Two sample tables, the following query locates all rows in table One without a match in table Two, removing set two (the divisor) from set one (the dividend). The result will be the rows from set one that do not have a match in set two.

The outer join already includes the rows outside the intersection, so to construct a set difference query use an OUTER JOIN with an IS NULL restriction on the second data set's primary key. This will return all the rows from table One that do not have a match in table Two:

USE tempdb;

SELECT Thing1, Thing2
  FROM dbo.One
    LEFT OUTER JOIN dbo.Two
      ON One.OnePK = Two.OnePK
  WHERE Two.TwoPK IS NULL;

Table One's difference is as follows:
