To see if the theory will fly in a real-world scenario from the OBXKites sample database, the following
code is a set difference query that locates all contacts who have not yet placed an order. The Contact
table is the divisor and the set difference query removes the contacts with orders (the dividend). The left
outer join produces a data set with all contacts and matching orders. The WHERE condition restricts the
result set to only those rows without a match in the [Order] table:
USE OBXKites;
SELECT Contact.LastName, Contact.FirstName FROM dbo.Contact
LEFT OUTER JOIN dbo.[Order]
ON Contact.ContactID = [Order].ContactID
WHERE [Order].OrderID IS NULL;
The result is the difference between the Contact table and the [Order] table, that is, all contacts
who have not placed an order.
The set difference query could be written using a subquery (covered in the next chapter). The WHERE
NOT IN condition, shown in the following example, removes the subquery rows (the divisor) from the
outer query (the dividend). However, be aware that while this works logically, it doesn’t perform well
with a large data set:
SELECT LastName, FirstName FROM dbo.Contact
WHERE ContactID NOT IN (SELECT ContactID FROM dbo.[Order])
ORDER BY LastName, FirstName;
Either form of the query (LEFT OUTER JOIN or NOT IN subquery) works well, with very similar query
execution plans, as shown in Figure 10-13.
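The two forms can be compared side by side on a small data set. The following is a minimal sketch, not the OBXKites schema: the table variables @Contact and @Order and their rows are hypothetical stand-ins. It also assumes the subquery column contains no nulls, because a single NULL in the NOT IN list causes the NOT IN form to return no rows.

```sql
-- Hypothetical stand-ins for dbo.Contact and dbo.[Order]; the rows are illustrative only.
DECLARE @Contact TABLE (ContactID INT PRIMARY KEY, LastName VARCHAR(50));
DECLARE @Order TABLE (OrderID INT PRIMARY KEY, ContactID INT NOT NULL);

INSERT @Contact (ContactID, LastName)
VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');

INSERT @Order (OrderID, ContactID)
VALUES (10, 1), (11, 1), (12, 3);

-- Left outer join form: keep only the contacts with no matching order.
SELECT c.LastName
FROM @Contact AS c
  LEFT OUTER JOIN @Order AS o
    ON c.ContactID = o.ContactID
WHERE o.OrderID IS NULL;

-- NOT IN form: remove every contact that appears in the order list.
SELECT c.LastName
FROM @Contact AS c
WHERE c.ContactID NOT IN (SELECT o.ContactID FROM @Order AS o);
```

With these rows, both queries return the single contact Baker, the only contact without an order.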
Full set difference queries
I often use a modified version of this technique to clean up bad data during conversions. A full set
difference query is the logical opposite of an inner join. It identifies all rows outside the intersection from
either data set by combining a full outer join with a WHERE restriction that accepts only nulls in either
primary key:
SELECT Thing1, Thing2 FROM One
FULL OUTER JOIN Two
ON One.OnePK = Two.OnePK
WHERE Two.TwoPK IS NULL
OR One.OnePK IS NULL;
FIGURE 10-13
The subquery form of the set difference query is optimized to nearly the same query execution plan
as the left outer join solution
The result is every row without a match in the One and Two sample tables:
Blue Thing NULL
Old Thing NULL
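To make the behavior easy to reproduce, here is a self-contained sketch of the same pattern. The table variables @One and @Two mimic the column names used above, but their rows are assumptions chosen for illustration and are not the book's sample data.

```sql
-- Illustrative rows only; not the contents of the sample tables One and Two.
DECLARE @One TABLE (OnePK INT PRIMARY KEY, Thing1 VARCHAR(15));
DECLARE @Two TABLE (TwoPK INT PRIMARY KEY, OnePK INT NULL, Thing2 VARCHAR(15));

INSERT @One (OnePK, Thing1)
VALUES (1, 'Old Thing'), (2, 'New Thing'), (3, 'Red Thing');

INSERT @Two (TwoPK, OnePK, Thing2)
VALUES (1, 2, 'Plane'), (2, 3, 'Train'), (3, NULL, 'Cycle');

-- Full outer join keeps unmatched rows from both sides;
-- the WHERE clause then discards every matched pair.
SELECT o.Thing1, t.Thing2
FROM @One AS o
  FULL OUTER JOIN @Two AS t
    ON o.OnePK = t.OnePK
WHERE t.TwoPK IS NULL
   OR o.OnePK IS NULL;
```

Here the query returns two rows: ('Old Thing', NULL), which has no match in @Two, and (NULL, 'Cycle'), which has no match in @One.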
Using Unions
The union operation is different from a join. In relational algebra terms, a union is addition, whereas a
join is multiplication. Instead of extending a row horizontally as a join would, the union stacks multiple
result sets into a single long table, as illustrated in Figure 10-14.
FIGURE 10-14
A union vertically appends the result of one select statement to the result of another select
statement.
[Figure: Table One (Old Thing, Red Thing, New Thing, Blue Thing) stacked above Table Two (Plane, Cycle, Train, Car)]
Unions come in three basic flavors: union, intersect union, and difference (or except) union.
Union [All]
The most common type of union by far is the UNION ALL query, which simply adds the individual
SELECT’s results.
In the following UNION query, the things from table One and the things from table Two are appended
together into a single list. The first SELECT sets up the overall result, so it supplies the result set
column headers. Each individual SELECT generates a result set for the UNION operation, so each
SELECT’s individual WHERE clause filters data for that SELECT. The final SELECT’s ORDER BY then
serves as the ORDER BY for the entire unioned result set. Note that the ORDER BY must refer to the
columns by either the first SELECT’s column names, or by the ordinal position of the column:
SELECT OnePK, Thing1, 'from One' AS Source
FROM dbo.One
UNION ALL
SELECT TwoPK, Thing2, 'from Two'
FROM dbo.Two
ORDER BY Thing1;
The resulting record set uses the column names from the first SELECT statement.
When constructing unions, there are a few rules to understand:
■ Every SELECT must have the same number of columns, and each column must share the same
data-type family with the columns in the other queries.
■ The column names, or aliases, are determined by the first SELECT.
■ The ORDER BY clause sorts the results of all the SELECTs and must go on the last SELECT,
but it uses the column names from the first SELECT.
■ Expressions may be added to the SELECT statements to identify the source of the row so long
as the column is added to every SELECT.
■ The union may be used as part of a SELECT INTO (a form of the insert verb covered in
Chapter 15, "Modifying Data"), but the INTO keyword must go in the first SELECT statement.
■ The basic SELECT command defaults to all rows unless DISTINCT is specified; the union is
the opposite. By default, the union performs a DISTINCT; if you wish to change this behavior
you must specify the keyword ALL. (I recommend that you think of the union as UNION ALL,
in the same way that you might think of top as TOP WITH TIES.)
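The last rule is the one that most often surprises people, so a short sketch may help. The table variables and rows below are assumptions made up for this illustration; only the UNION and UNION ALL behavior comes from the rules above.

```sql
-- Illustrative data: 'Red Thing' appears in both tables.
DECLARE @One TABLE (OnePK INT PRIMARY KEY, Thing1 VARCHAR(15));
DECLARE @Two TABLE (TwoPK INT PRIMARY KEY, Thing2 VARCHAR(15));

INSERT @One (OnePK, Thing1) VALUES (1, 'Red Thing'), (2, 'Blue Thing');
INSERT @Two (TwoPK, Thing2) VALUES (1, 'Red Thing'), (2, 'Plane');

-- UNION ALL keeps every row from both SELECTs:
-- four rows, with 'Red Thing' listed twice.
SELECT Thing1 FROM @One
UNION ALL
SELECT Thing2 FROM @Two
ORDER BY Thing1;   -- the ORDER BY uses the first SELECT's column name

-- UNION without ALL performs a DISTINCT:
-- three rows, with 'Red Thing' listed once.
SELECT Thing1 FROM @One
UNION
SELECT Thing2 FROM @Two
ORDER BY Thing1;
```

Both result sets carry the column name Thing1, because the first SELECT determines the column names.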
Unions aren’t limited to two tables. The largest I’ve personally worked with had about 90 tables (I won’t
try that again anytime soon). As long as the total number of tables referenced by a query is 256 or
fewer, SQL Server handles the load.
Intersection union
An intersection union finds the rows common to both data sets. An inner join finds common rows
horizontally, whereas an intersection union finds common rows vertically. To set up the intersection query,
these first two statements add rows to table Two so there will be an intersection:
INSERT dbo.Two(TwoPK, OnePK, Thing2)
VALUES(5,0, 'Red Thing');
INSERT dbo.Two(TwoPK, OnePK, Thing2)
VALUES(?,?, 'Blue Thing');
SELECT Thing1
FROM dbo.One
INTERSECT
SELECT Thing2
FROM dbo.Two
ORDER BY Thing1;
Result:
Thing1
---------------
Blue Thing
Red Thing
An intersection union query is similar to an inner join. The inner join merges the rows horizontally,
whereas the intersect union stacks the rows vertically. The intersect must match every column in order
to be included in the result. A twist, however, is that the intersect will see null values as equal and
accept the rows with nulls.
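The null-handling twist is easiest to see next to an inner join on the same data. This is a minimal sketch using two hypothetical table variables, @A and @B, that each hold one value and one null.

```sql
-- Hypothetical data: each table holds 'Red Thing' and a NULL.
DECLARE @A TABLE (Val VARCHAR(15) NULL);
DECLARE @B TABLE (Val VARCHAR(15) NULL);

INSERT @A (Val) VALUES ('Red Thing'), (NULL);
INSERT @B (Val) VALUES ('Red Thing'), (NULL);

-- INTERSECT treats the two NULLs as equal:
-- two rows are returned, 'Red Thing' and NULL.
SELECT Val FROM @A
INTERSECT
SELECT Val FROM @B;

-- The inner join compares with =, and NULL = NULL is not true:
-- only 'Red Thing' is returned.
SELECT a.Val
FROM @A AS a
  INNER JOIN @B AS b
    ON a.Val = b.Val;
```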
Intersection union queries are very useful for proving that two queries give the same results. When all
three of the following queries return the same row count, the two queries must be functionally equivalent
(assuming neither query returns duplicate rows, because INTERSECT returns only distinct rows):
Query A gives 1234 rows.
Query B gives 1234 rows.
Query A intersect Query B gives 1234 rows.
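The three-count check might be scripted along these lines. This is only a sketch: the table variable @Sale, its rows, and the two query phrasings are invented for the example, and the selected column is a primary key so that neither query can return duplicates.

```sql
-- Illustrative table; ID is unique, so neither query returns duplicate rows.
DECLARE @Sale TABLE (ID INT PRIMARY KEY, Amount INT NOT NULL);

INSERT @Sale (ID, Amount) VALUES (1, 50), (2, 150), (3, 250);

-- Query A
SELECT COUNT(*) AS QueryA
FROM (SELECT ID FROM @Sale WHERE Amount > 100) AS a;

-- Query B: a different phrasing that should be equivalent
SELECT COUNT(*) AS QueryB
FROM (SELECT ID FROM @Sale WHERE NOT (Amount <= 100)) AS b;

-- Query A intersect Query B
SELECT COUNT(*) AS AIntersectB
FROM (SELECT ID FROM @Sale WHERE Amount > 100
      INTERSECT
      SELECT ID FROM @Sale WHERE NOT (Amount <= 100)) AS ab;
```

All three counts are 2 for this data, which is the pattern that indicates the two phrasings return the same rows.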
Difference union/except
The difference union is the union equivalent of the set difference query: it finds rows in one data source
that are not in the other data source.
Whereas a set difference query is interested only in the join conditions (typically the primary and
foreign keys) and joins the rows horizontally, a difference union EXCEPT query looks at the entire row (or,
more specifically, all the columns that participate in the union’s SELECT statements).
SQL Server uses the ANSI Standard keyword EXCEPT to execute a difference union:
SELECT Thing1 FROM dbo.One
EXCEPT
SELECT Thing2 FROM dbo.Two ORDER BY Thing1;
Result:
Thing1
---------------
New Thing
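Because EXCEPT compares every column in the SELECT lists, adding a column can change which rows survive. The following sketch uses hypothetical table variables and rows to show the effect.

```sql
-- Illustrative data: 'Red Thing' exists in both tables, but under different keys.
DECLARE @One TABLE (PK INT PRIMARY KEY, Thing VARCHAR(15));
DECLARE @Two TABLE (PK INT PRIMARY KEY, Thing VARCHAR(15));

INSERT @One (PK, Thing) VALUES (1, 'Red Thing'), (2, 'New Thing');
INSERT @Two (PK, Thing) VALUES (5, 'Red Thing'), (6, 'Blue Thing');

-- One column compared: 'Red Thing' is found in @Two and removed,
-- so only 'New Thing' is returned.
SELECT Thing FROM @One
EXCEPT
SELECT Thing FROM @Two;

-- Two columns compared: (1, 'Red Thing') does not match (5, 'Red Thing'),
-- so both rows from @One are returned.
SELECT PK, Thing FROM @One
EXCEPT
SELECT PK, Thing FROM @Two;
```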
Merging data is the heart of SQL, and it shows in the depth of relational algebra as well as the power
and flexibility of SQL. From natural joins to exotic joins, SQL is excellent at selecting sets of data from
multiple data tables.
The challenge for the SQL Server database developer is to master the theory of relational algebra and the
many T-SQL techniques to effectively manipulate the data. The reward is the fun.
Manipulating data with SELECT is the core technology of SQL Server. While joins are the most natural
method of working with relational data, subqueries open numerous possibilities for creative and
powerful ways to retrieve data from multiple data sources. The next chapter details the many ways you can
use subqueries within a query, and introduces common table expressions (CTEs), a feature new to SQL
Server 2005.
Including Data with Subqueries and CTEs
IN THIS CHAPTER
Understanding subquery types
Building simple and correlated subqueries
Fitting subqueries in the query puzzle
Using common table expressions (CTEs)
Solving problems with relational division
Passing data with composable SQL
SQL’s real power is its capability to mix and match multiple methods of
selecting data. It’s this skill in fluidly assembling a complex query in code
to accomplish what can’t be easily done with GUI tools that differentiates
SQL gurus from the wannabes. So, without hesitation I invite you to study
embedded simple and correlated subqueries, derived tables, and common table
expressions, and then apply these query components to solve complex relational
problems such as relational division.
Methods and Locations
A subquery is an embedded SQL statement within an outer query. The subquery
provides an answer to the outer query in the form of a scalar value, a list of
values, or a data set, and may be substituted for an expression, list, or table,
respectively, within the outer query. The matrix of subquery types and SELECT
statement usage is shown in Table 11-1. Traditionally, a subquery may only contain a
SELECT query and not a data-modification query, which explains why subqueries
are sometimes referred to as subselects.
Five basic forms are possible when building a subquery, depending on the data
needs and your favored syntax:
■ Simple subquery: The simple subquery can be a stand-alone query and
can run by itself. It is executed once, with the result passed to the outer
query. Simple subqueries are constructed as normal SELECT queries
and placed within parentheses.
■ Common table expression (CTE): CTEs are a syntactical variation
of the simple subquery, similar to a view, which defines the subquery
at the beginning of the query using the WITH command. The CTE can
then be accessed multiple times within the main query as if it were a
view or derived table.
■ Correlated subquery: This is similar to a simple subquery except that it
references at least one column in the outer query, so it cannot run separately
by itself. Conceptually, the outer query runs first and the correlated subquery
runs once for every row in the outer query. Physically, the Query Optimizer is
free to generate an efficient query execution plan.
■ Row constructor: A VALUES clause in the FROM clause that supplies
hard-coded values as a subquery.
■ Composable SQL: The ability to pass data from an INSERT, UPDATE, or
DELETE statement’s output clause to an outer query.
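The first four forms can be sketched against one small table. The table variable @Product, its columns, and its rows are hypothetical and exist only for this illustration; composable SQL is left out because it requires a data-modification statement.

```sql
-- Hypothetical product list used by every query in this batch.
DECLARE @Product TABLE (ProductID INT PRIMARY KEY,
                        Category VARCHAR(10),
                        Price MONEY);

INSERT @Product (ProductID, Category, Price)
VALUES (1, 'Kite', 20), (2, 'Kite', 40), (3, 'Reel', 10);

-- Simple subquery: the inner SELECT can run by itself and returns one scalar value.
SELECT ProductID, Price
FROM @Product
WHERE Price > (SELECT AVG(Price) FROM @Product);

-- Common table expression: the subquery is defined up front with WITH
-- and then referenced like a view or derived table.
WITH CategoryAvg AS
  (SELECT Category, AVG(Price) AS AvgPrice
   FROM @Product
   GROUP BY Category)
SELECT p.ProductID, p.Price, ca.AvgPrice
FROM @Product AS p
  INNER JOIN CategoryAvg AS ca
    ON p.Category = ca.Category;

-- Correlated subquery: the inner SELECT references p.Category from the
-- outer query, so it cannot run on its own.
SELECT p.ProductID, p.Price
FROM @Product AS p
WHERE p.Price > (SELECT AVG(p2.Price)
                 FROM @Product AS p2
                 WHERE p2.Category = p.Category);

-- Row constructor: a VALUES clause in the FROM clause supplies hard-coded rows.
SELECT v.Category, v.Label
FROM (VALUES ('Kite', 'Flying'), ('Reel', 'Line')) AS v (Category, Label);
```

With this data, both the simple subquery and the correlated subquery return only product 2, although for different reasons: the first compares against the overall average price, the second against the average within each product's own category.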
TABLE 11-1
Subquery and CTE Usage

For each outer query element, the entries describe the behavior when the subquery returns a scalar value (Expression), a list of values (List), or a multi-column data source (Data Set).

Outer query element: Any expression (e.g., SELECT list, HAVING clause, GROUP BY, JOIN ON, etc.)
  Expression: The subquery result is used as an expression supplying the value for the column. If the result is empty, NULL is used instead.

Outer query element: Derived table, FROM (data source) AS ALIAS or WITH CTE. This is the only location where a subquery can use a table alias.
  Expression: The subquery’s data set is accepted as a (one row, one column) derived table source within the outer query. If the result is empty, an empty derived table source is used.
  List: The subquery’s data set is accepted as a (one row) derived table source within the outer query.
  Data Set: The subquery’s data set is accepted as a derived table source within the outer query.

Outer query element: WHERE x {=,<>,!=,>,>=,!>,<,<=,!<} (subquery)
  Expression: The WHERE clause is true if the test value compares true with the subquery’s scalar value. If the subquery returns no result, the WHERE clause is not true.

Outer query element: WHERE x {=,<>,!=,>,>=,!>,<,<=,!<} ALL (subquery)
  Expression: The WHERE condition is true if the test value meets the condition for the scalar value returned by the subquery. If the subquery returns no result, the WHERE condition is not true.
  List: The WHERE condition is true if the test value meets the condition for every value returned by the subquery.
  Data Set: X

Outer query element: WHERE x {=,<>,!=,>,>=,!>,<,<=,!<} SOME|ANY (subquery)
  Expression: The WHERE condition is true if the test value meets the condition for the scalar value returned by the subquery. If the subquery returns no result, the WHERE condition is not true.
  List: The WHERE condition is true if the test value meets the condition for any value returned by the subquery.
  Data Set: X

Outer query element: WHERE x IN | = ANY (subquery)
  Expression: The WHERE condition is true if the test value is equal to the scalar value returned by the subquery. If the subquery returns no result, the WHERE condition is not true.
  List: The WHERE condition is true if the test value is found within the list of values returned by the subquery.
  Data Set: X

Outer query element: WHERE EXISTS (subquery)
  Expression: The WHERE condition is true if the subquery returns a value.
  List: The WHERE condition is true if the subquery returns at least one value.
  Data Set: The WHERE condition is true if the subquery returns at least one row.
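The list-valued WHERE forms in the table can be tried out with a short batch. This is a sketch under assumed data: the table variables @Price and @Test and their values are invented for the illustration.

```sql
-- Hypothetical list of values returned by the subquery.
DECLARE @Price TABLE (Amount INT NOT NULL);
INSERT @Price (Amount) VALUES (10), (20), (30);

-- Hypothetical test values for the outer query.
DECLARE @Test TABLE (x INT NOT NULL);
INSERT @Test (x) VALUES (5), (20), (40);

-- ALL: true only when x exceeds every value in the list, so only 40 qualifies.
SELECT x FROM @Test
WHERE x > ALL (SELECT Amount FROM @Price);

-- SOME|ANY: true when x exceeds at least one value, so 20 and 40 qualify.
SELECT x FROM @Test
WHERE x > ANY (SELECT Amount FROM @Price);

-- IN (the same as = ANY): true when x is found in the list, so only 20 qualifies.
SELECT x FROM @Test
WHERE x IN (SELECT Amount FROM @Price);

-- EXISTS: true when the subquery returns at least one row. This subquery is
-- not correlated, so all three test rows are returned.
SELECT x FROM @Test
WHERE EXISTS (SELECT * FROM @Price WHERE Amount > 25);
```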