The first section of the merge query identifies the target and source tables and how they relate.
Following the table definition, there’s an optional clause for each match combination, as shown in this
simplified syntax:
MERGE TargetTable USING SourceTable
  ON join conditions
  [WHEN MATCHED
    THEN DML]
  [WHEN NOT MATCHED BY TARGET
    THEN DML]
  [WHEN NOT MATCHED BY SOURCE
    THEN DML]
Applying the MERGE command to the airline check-in scenario, there’s an appropriate action for each
match combination:
■ If the row is in both FlightPassengers (the target) and CheckIn (the source), then the target is updated with the CheckIn table’s seat column.
■ If the row is present in CheckIn (the source) but there’s no match in FlightPassengers (the target), then the row from CheckIn is inserted into FlightPassengers. Note that the data from the source table is gathered by the INSERT command using INSERT...VALUES.
■ If the row is present in FlightPassengers (the target), but there’s no match in CheckIn (the source), then the row is deleted from FlightPassengers. Note that the DELETE command deletes from the target and does not require a WHERE clause because the rows are filtered by the MERGE command.
Here’s the complete working MERGE command for the scenario:
MERGE FlightPassengers F
  USING CheckIn C
  ON C.LastName = F.LastName
    AND C.FirstName = F.FirstName
    AND C.FlightCode = F.FlightCode
    AND C.FlightDate = F.FlightDate
  WHEN MATCHED
    THEN UPDATE
      SET F.Seat = C.Seat
  WHEN NOT MATCHED BY TARGET
    THEN INSERT (FirstName, LastName, FlightCode, FlightDate, Seat)
      VALUES (FirstName, LastName, FlightCode, FlightDate, Seat)
  WHEN NOT MATCHED BY SOURCE
    THEN DELETE;
The next query looks at the results of the MERGE command, returning the finalized passenger list for
SQL Server Airlines flight 2008:
SELECT FlightID, FirstName, LastName, FlightCode, FlightDate, Seat
  FROM FlightPassengers;
Result:
FlightID FirstName LastName FlightCode FlightDate Seat
-------- --------- -------- ---------- ---------- ----
MERGE has a few specific rules:
■ It must be terminated by a semicolon.
■ The rows must match one-to-one. One-to-many matches are not permitted.
■ The join conditions must be deterministic, meaning they are repeatable.
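The one-to-one rule can be seen with a small experiment. The following is only a sketch, using hypothetical table variables (@Target and @Source are not part of the sample database): two source rows match the same target row, so SQL Server rejects the MERGE rather than updating the row twice, and the CATCH block reports the error (number 8672).

```sql
-- Hypothetical table variables; not part of the sample database.
DECLARE @Target TABLE (LastName VARCHAR(50) NOT NULL, Seat CHAR(3) NULL);
DECLARE @Source TABLE (LastName VARCHAR(50) NOT NULL, Seat CHAR(3) NULL);

INSERT @Target (LastName, Seat) VALUES ('Adams', '1A');

-- Two source rows match the single target row (a one-to-many match).
INSERT @Source (LastName, Seat) VALUES ('Adams', '2B'), ('Adams', '3C');

BEGIN TRY
    MERGE @Target AS T
    USING @Source AS S
        ON S.LastName = T.LastName
    WHEN MATCHED THEN
        UPDATE SET T.Seat = S.Seat;
END TRY
BEGIN CATCH
    -- Reports the failure instead of aborting the batch.
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```

Removing the duplicate from the source, or joining on enough columns to make each match unique, lets the same statement succeed.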
Returning Modified Data
SQL Server can optionally return the modified data as a data set for further use. This can be useful for
performing more work on the modified data, or for returning the data to the front-end application to eliminate
an extra round-trip to the server.
The OUTPUT clause can access the inserted and deleted virtual tables, as well as any data source
referenced in the FROM clause, to select the data to be returned. Normally used only by triggers, the inserted
and deleted virtual tables contain the before and after views of the data affected by the statement. The deleted virtual table
stores the old data, and the inserted virtual table stores the newly inserted or updated data.
For more examples of the inserted and deleted tables, turn to Chapter 26, "Creating DML
Triggers."
Returning data from an insert
The INSERT command makes the inserted virtual table available. The following example, taken from
earlier in this chapter, has been edited to include the OUTPUT clause. The inserted virtual table has a
picture of the new data being inserted and returns the data:
USE CHA2;
INSERT dbo.Guidelist (LastName, FirstName, Qualifications)
  OUTPUT Inserted.*
  VALUES ('Nielsen', 'Paul', 'trainer');
Result:
GuideID LastName FirstName Qualifications DateOfBirth DateHire
Best Practice
An excellent application of the OUTPUT clause within an INSERT is returning the values of newly
created surrogate keys. The scope_identity() function returns the last single identity inserted, but
it can’t return a set of new identity values. There is no function to return the GUID value just created by a
newsequentialid() default. However, the OUTPUT clause returns sets of new surrogate keys regardless of
their data type. You can almost think of the INSERT...OUTPUT as a scope_GUID() function or a set-based
scope_identity().
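As a minimal sketch of this practice, the batch below inserts three rows in one statement and captures every new identity value. The table variables @Guide and @NewKeys and the sample names are hypothetical stand-ins, used only so the batch runs on its own.

```sql
-- Hypothetical table variable standing in for a table with an identity key.
DECLARE @Guide TABLE (
    GuideID   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    LastName  VARCHAR(50) NOT NULL,
    FirstName VARCHAR(50) NOT NULL
);

-- Receives the identity values generated by the insert.
DECLARE @NewKeys TABLE (
    GuideID  INT NOT NULL,
    LastName VARCHAR(50) NOT NULL
);

-- One INSERT, three rows (row constructors require SQL Server 2008 or later).
INSERT @Guide (LastName, FirstName)
OUTPUT Inserted.GuideID, Inserted.LastName
    INTO @NewKeys (GuideID, LastName)
VALUES ('Smith', 'Ken'), ('Jones', 'Ann'), ('Davis', 'Lee');

-- SCOPE_IDENTITY() reports only the last identity value generated.
SELECT SCOPE_IDENTITY() AS LastIdentity;

-- The OUTPUT clause captured one key per inserted row.
SELECT GuideID, LastName FROM @NewKeys;
```

The second result set holds one row per inserted guide, which is what a calling application usually needs when it must match new keys back to the rows it submitted.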
Returning data from an update
The OUTPUT clause also works with updates and can return the before and after picture of the data. In
this example, the deleted virtual table is being used to grab the original value, while the inserted virtual
table stores the new updated value. Only the Qualifications column is returned:
USE CHA2;
UPDATE dbo.Guide
  SET Qualifications = 'Scuba'
  OUTPUT Deleted.Qualifications AS OldQuals,
    Inserted.Qualifications AS NewQuals
  WHERE GuideID = 3;
Result:
-
Returning data from a delete
When deleting data, only the deleted table has any useful data to return:
DELETE dbo.Guide
OUTPUT Deleted.GuideID, Deleted.LastName, Deleted.FirstName
WHERE GuideID = 3;
Result:
GuideID LastName FirstName
Returning data from a merge
The MERGE command can return data using the OUTPUT clause as well. A twist is that the MERGE
command adds a column, $action, to identify whether the row was inserted, updated, or deleted from
the target table. The next query adds the OUTPUT clause to the previous MERGE command:
MERGE FlightPassengers F
  USING CheckIn C
  ON C.LastName = F.LastName
    AND C.FirstName = F.FirstName
    AND C.FlightCode = F.FlightCode
    AND C.FlightDate = F.FlightDate
  WHEN MATCHED
    THEN UPDATE
      SET F.Seat = C.Seat
  WHEN NOT MATCHED BY TARGET
    THEN INSERT (FirstName, LastName, FlightCode, FlightDate, Seat)
      VALUES (FirstName, LastName, FlightCode, FlightDate, Seat)
  WHEN NOT MATCHED BY SOURCE
    THEN DELETE
  OUTPUT
    deleted.FlightID, deleted.LastName, deleted.Seat,
    $action,
    inserted.FlightID, inserted.LastName, inserted.Seat;
Result:
FlightID LastName Seat $action FlightID LastName Seat
- - - - -
Returning data into a table
For T-SQL developers, the OUTPUT clause can return the data for use within a batch or stored
procedure. The data is received into a user table, temp table, or table variable, which must already have been
created. Although the syntax may seem similar to the SELECT...INTO syntax, it actually functions very
differently: SELECT...INTO creates its target table, whereas OUTPUT...INTO requires an existing one.
In the following example, the OUTPUT clause passes the results to a @DeletedGuides table variable:
DECLARE @DeletedGuides TABLE (
GuideID INT NOT NULL PRIMARY KEY,
LastName VARCHAR(50) NOT NULL,
FirstName VARCHAR(50) NOT NULL
);
DELETE dbo.Guide
OUTPUT Deleted.GuideID, Deleted.LastName, Deleted.FirstName
INTO @DeletedGuides
WHERE GuideID = 2;
Interim result:
(1 row(s) affected)
Continuing the batch:
SELECT GuideID, LastName, FirstName FROM @DeletedGuides;
Result:
(1 row(s) affected)
GuideID LastName FirstName
An advanced use of the OUTPUT clause, called composable DML, passes the output data to
an outer query, which can then be used in an INSERT command. For more details, refer to Chapter 11, "Including Data with Subqueries and CTEs."
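To make composable DML concrete, here is a minimal sketch. The temp tables (#FlightPassengers, #CheckIn, #RemovedPassengers) are hypothetical, simplified versions of the airline tables, and the sample rows are invented for illustration. The MERGE is nested as a derived table, and the outer INSERT keeps only the rows that the MERGE deleted.

```sql
-- Hypothetical temp tables modeled loosely on the airline scenario.
CREATE TABLE #FlightPassengers (LastName VARCHAR(50) NOT NULL, Seat CHAR(3) NULL);
CREATE TABLE #CheckIn          (LastName VARCHAR(50) NOT NULL, Seat CHAR(3) NULL);
CREATE TABLE #RemovedPassengers(LastName VARCHAR(50) NOT NULL, Seat CHAR(3) NULL);

INSERT #FlightPassengers (LastName, Seat) VALUES ('Adams', '1A'), ('Baker', '2B');
INSERT #CheckIn (LastName, Seat)          VALUES ('Adams', '1C'), ('Clark', '3A');

-- The nested MERGE feeds its OUTPUT rows to the outer INSERT...SELECT.
INSERT #RemovedPassengers (LastName, Seat)
SELECT LastName, Seat
FROM (
    MERGE #FlightPassengers AS F
    USING #CheckIn AS C
        ON C.LastName = F.LastName
    WHEN MATCHED THEN
        UPDATE SET F.Seat = C.Seat
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (LastName, Seat) VALUES (C.LastName, C.Seat)
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE
    OUTPUT $action, Deleted.LastName, Deleted.Seat
) AS M (MergeAction, LastName, Seat)
WHERE MergeAction = 'DELETE';

-- Baker had no check-in row, so Baker is the passenger recorded here.
SELECT LastName, Seat FROM #RemovedPassengers;

DROP TABLE #FlightPassengers, #CheckIn, #RemovedPassengers;
```

Filtering on the $action value in the outer query is the design choice that makes this useful: the MERGE still performs all three actions, while the audit table receives only the subset of interest.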
Summary
Data retrieval and data modification are primary tasks of a database application. This chapter examined
the workhorse INSERT, UPDATE, DELETE, and MERGE DML commands and described how you can
use them to manipulate data.
Key points in this chapter include the following:
■ There are multiple formats for the INSERT command depending on the data’s source:
INSERT...VALUES, INSERT...SELECT, INSERT...EXEC, and INSERT...DEFAULT.
■ INSERT...VALUES now has row constructors to insert multiple rows with a single INSERT.
■ SELECT...INTO creates a new table and then inserts the results into the new table.
■ UPDATE always updates only a single table, but it can use an optional FROM clause to reference other data sources.
■ Using DELETE without a WHERE clause is dangerous.
■ The new MERGE command pulls data from a source table and inserts, updates, or deletes in the target table depending on the match conditions.
■ INSERT, UPDATE, DELETE, and MERGE can all include an optional OUTPUT clause that can select data from the query or the virtual inserted and deleted tables. The result of the OUTPUT clause can be passed to the client, inserted into a table, or passed to an outer query.
This chapter explained data modifications assuming all goes well, but in fact several conditions and
situations can conspire to block the INSERT, UPDATE, DELETE, or MERGE. The next chapter looks at the
dark side of data modification and what can go wrong.
Modification Obstacles
IN THIS CHAPTER
Avoiding and solving complex data-modification problems
Primary keys, foreign keys, inserts, updates, and deletes
Deleting duplicate rows
Nulls and defaults
Trigger issues
Updating with views
Some newsgroup postings ask how to perform a task or write a
query, but another set of postings asks about troubleshooting the code
when there is some problem. Typically, SQL Server is working the way
it is supposed to function, but someone is having trouble getting past what’s
perceived to be an obstacle.
This chapter surveys several types of potential obstacles and explains how to
avoid them. In nearly every case, once the obstacle is understood, it turns out to be a safety
feature: SQL Server is protecting the data by blocking the insert, update, or
delete.
As Table 16-1 illustrates, INSERT and UPDATE operations face more obstacles
than DELETE operations because they are creating new data in the table that must
pass multiple validation rules. Because the DELETE operation only removes data,
it faces fewer possible obstacles.
Data Type/Length
Column data type/length may affect INSERT and UPDATE operations. One of the
first checks the new data must pass is that of data type and data length. Often,
a data-type error is caused by missing or extra quotes. SQL Server is particular
about implicit, or automatic, data-type conversion. Conversions that function
automatically in other programming languages often fail in SQL Server, as shown
in the following example:
USE OBXKites;
DECLARE @ProblemDate DATETIME = '20090301';
INSERT dbo.Price (ProductID, Price, EffectiveDate)
  VALUES ('6D37553D-89B1-4663-91BC-0486842EAD44',
    @ProblemDate, '20020625');
TABLE 16-1
Potential Data Modification Obstacles
Primary Key Constraint and Unique Constraint X X
Result:
Msg 257, Level 16, State 3, Line 3
Implicit conversion from data type datetime to money is not allowed.
Use the CONVERT function to run this query.
The problem with the preceding code is that a DATETIME variable is being inserted into a money data
type column. For most data type conversions, SQL Server handles the conversion implicitly; however,
conversion between some data types requires using the cast() or convert() function.
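The difference between implicit and explicit conversion can be sketched with two variables. This is only an illustration; the variable @Price is hypothetical and no table is touched. Note that the explicit conversion merely satisfies the type check: a date converted to money is a day count, which is almost never a sensible price, so the real fix for the INSERT above is to supply a money value.

```sql
DECLARE @ProblemDate DATETIME = '20090301';
DECLARE @Price MONEY;   -- hypothetical variable for illustration

-- A character string converts to money implicitly.
SET @Price = '19.95';
SELECT @Price AS ImplicitFromString;

-- Uncommenting the next line reproduces Msg 257
-- (implicit conversion from datetime to money is not allowed):
-- SET @Price = @ProblemDate;

-- The explicit conversion is permitted. The result is based on the
-- number of days since 1900-01-01, so it is rarely meaningful as a price.
SET @Price = CONVERT(MONEY, @ProblemDate);
SELECT @Price AS ExplicitFromDatetime;
```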
For more details about data types and tables, refer to Chapter 20, ‘‘Creating the Physical Database Schema.’’ Data-type conversion and conversion scalar functions are discussed in Chapter 9, ‘‘Data Types, Expressions, and Scalar Functions.’’
Primary Key Constraint and Unique Constraint
Both primary key constraints and unique constraints may affect INSERT and UPDATE operations. While
this section explicitly deals with primary keys, the same is true for unique indexes.
Primary keys, by definition, must be unique. Attempting to insert a primary key that’s already in use
will cause an error. Technically speaking, updating a primary key to a value already in use also causes
an error, but surrogate primary keys (identity columns and GUIDs) should never need to be updated,
and a good natural key should rarely need updating. Candidate keys should also be stable enough that
they rarely need updating.
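A duplicate-key failure is easy to reproduce in isolation. The sketch below uses a hypothetical table variable (@Guide here is not the sample database's Guide table) and wraps the second insert in TRY...CATCH so that the error is reported as a result set rather than ending the batch. A primary key violation is reported as error 2627.

```sql
-- Hypothetical table variable with a primary key.
DECLARE @Guide TABLE (
    GuideID  INT NOT NULL PRIMARY KEY,
    LastName VARCHAR(50) NOT NULL
);

INSERT @Guide (GuideID, LastName) VALUES (1, 'Smith');

BEGIN TRY
    -- GuideID 1 is already in use, so this insert is blocked.
    INSERT @Guide (GuideID, LastName) VALUES (1, 'Jones');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

-- Only the original row remains.
SELECT GuideID, LastName FROM @Guide;
```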
Updating a primary key may also break referential integrity, causing the update to fail. In this case,
however, it’s not a primary-key constraint that’s the obstacle, but the foreign-key constraint that references
the primary key.
For more information about the design of primary keys, foreign keys, and many of the
other constraints mentioned in this chapter, refer to Chapter 3, ‘‘Relational Database
Design.’’ For details on creating constraints, turn to Chapter 20, ‘‘Creating the Physical Database
Schema.’’
One particular issue related to inserting is the creation of surrogate key values for the new rows. SQL
Server provides two excellent means of generating surrogate primary keys: identity columns and GUIDs.
Each method has its pros and cons, and its rules for safe handling.
Every table should have a primary key. If the primary key is the same data used by humans
to identify the item in reality, then it’s a natural key, e.g., SSN, vehicle VIN, aircraft tail
number, or part serial number.
The alternative to the natural key is the surrogate key, surrogate meaning artificial or a stand-in
replacement. For databases, a surrogate key means an artificial, computer-generated value is used to uniquely
identify the row. SQL Server supports identity columns and globally unique identifiers (GUIDs) as
surrogate keys.
Identity columns
SQL Server automatically generates incrementing integers for identity columns at the time of the insert,
and a SQL INSERT statement normally can’t interfere with that process by supplying a value for the
identity column.
The fact that identity columns refuse to accept inserted integers can be a serious issue if you’re inserting
existing data with existing primary key values that must be maintained because they are referenced by
secondary tables. The solution is to use the SET IDENTITY_INSERT option. When set to ON, it
temporarily suspends automatic value generation and permits the insertion of explicit values into the identity column.
This means that the insert has to explicitly provide the primary-key value. The IDENTITY_INSERT
option may only be set ON for one table at a time within a session. The following SQL batch uses the
IDENTITY_INSERT option when supplying the primary key:
USE CHA2;
-- attempt to insert into an identity column
INSERT dbo.Guide (GuideID, FirstName, LastName)
  VALUES (10, 'Bill', 'Fletcher');
Result:
Server: Msg 544, Level 16, State 1, Line 1
Cannot insert explicit value for identity column in table
'Guide' when IDENTITY_INSERT is set to OFF.
The sample database for this book can be downloaded from the book’s website:
www.sqlserverbible.com.
The next step in the batch sets the IDENTITY_INSERT option and attempts some more inserts:
SET IDENTITY_INSERT Guide ON;
INSERT Guide (GuideID, FirstName, LastName)
  VALUES (100, 'Bill', 'Mays');
INSERT dbo.Guide (GuideID, FirstName, LastName)
  VALUES (101, 'Sue', 'Atlas');
To see what value the identity column is now assigning, the following code re-enables the identity
column, inserts another row, and then selects the new data:
SET IDENTITY_INSERT Guide OFF;
INSERT Guide (FirstName, LastName)
  VALUES ('Arnold', 'Bistier');
SELECT GuideID, FirstName, LastName FROM dbo.Guide;
Result:
GuideID FirstName LastName
As this code demonstrates, manually inserting a GuideID of 101 sets the identity column’s next value
to 102.
Another potential problem when working with identity columns is determining the value of the identity
that was just created. Because the new identity value is created by SQL Server at the time of the insert,
the code causing the insert is unaware of the identity value. The insert works fine; the perceived
problem occurs when the code inserts a row and then tries to display the row on a user-interface grid within
an application, because the code is unaware of the new data’s database-assigned primary key.
SQL Server provides four methods for determining the identity value:
■ @@IDENTITY: This venerable global variable returns the last identity value generated on the current connection, for any table and in any scope. If another insert takes place (for example, inside a trigger) between the time of your insert and the time when you check @@IDENTITY, @@IDENTITY will return not your insert, but the last insert. For this reason, don’t use @@IDENTITY; it’s only there for backward compatibility.
■ SCOPE_IDENTITY(): This system function, introduced in SQL Server 2000, returns the last
generated identity value within the scope of the calling batch or procedure. I recommend using
this method, as it is the safest way to determine the identity value you last generated.
■ IDENT_CURRENT(TABLE): This function, also introduced in SQL Server 2000, returns
the last identity value per table. While this option seems similar to SCOPE_IDENTITY(),
IDENT_CURRENT() returns the identity value for the given table regardless of inserts to any
other tables that may have occurred. This prevents another insert, buried deep within a trigger,
from affecting the identity value returned by the function. Note, however, that it is not limited to
your session or scope, so it can also return a value generated by another connection.
■ OUTPUT clause: The INSERT, UPDATE, DELETE, and MERGE commands can include an
OUTPUT clause that can select from the inserted and deleted virtual tables. Using this data, any
data modification query can return the inserted identity values.
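The trigger scenario mentioned in the list can be sketched as follows. Everything here is hypothetical: the tables dbo.DemoGuide and dbo.DemoAudit and the trigger dbo.trgDemoGuideInsert are created only for this demonstration and dropped at the end. The GO lines are batch separators understood by tools such as Management Studio and sqlcmd; they are needed because CREATE TRIGGER must be the first statement in its batch.

```sql
-- Hypothetical demo tables; names are illustrative only.
CREATE TABLE dbo.DemoGuide (
    GuideID  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    LastName VARCHAR(50) NOT NULL
);
CREATE TABLE dbo.DemoAudit (
    AuditID INT IDENTITY(1000,1) NOT NULL PRIMARY KEY,
    Note    VARCHAR(100) NOT NULL
);
GO
-- The trigger's insert generates a second identity value in another scope.
CREATE TRIGGER dbo.trgDemoGuideInsert ON dbo.DemoGuide AFTER INSERT
AS
    INSERT dbo.DemoAudit (Note)
    SELECT 'Added ' + LastName FROM Inserted;
GO
INSERT dbo.DemoGuide (LastName) VALUES ('Smith');

SELECT @@IDENTITY                     AS AtAtIdentity,  -- from DemoAudit (trigger scope)
       SCOPE_IDENTITY()               AS ScopeIdentity, -- from DemoGuide (this scope)
       IDENT_CURRENT('dbo.DemoGuide') AS IdentCurrent;  -- last DemoGuide value, any session
GO
-- Clean up; dropping the table also drops its trigger.
DROP TABLE dbo.DemoGuide;
DROP TABLE dbo.DemoAudit;
```

Because the audit table's identity is seeded at 1000, the value returned by @@IDENTITY is visibly the trigger's key rather than the new guide's key, which is the mix-up the list warns about.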
Globally unique identifiers (GUIDs)
Globally unique identifiers (GUIDs) are sometimes, and with great debate, used as primary keys. A
GUID can be the best choice when you have to generate unique values at different locations (for example, in
replicated scenarios), but hardly ever otherwise.
With regard to the insertion of new rows, the major difference between identity columns and GUIDs
is that GUIDs are generated by the SQL code or by a column default, rather than automatically
generated by the engine at the time of the insert. This means that the developer has more control over GUID
creation.
There are five ways to generate GUID primary key values when inserting new rows:
■ The NEWID() function can create the GUID in T-SQL code prior to the INSERT.
■ The NEWID() function can create the GUID in client code prior to the INSERT.
■ The NEWID() function can create the GUID in an expression in the INSERT command.
■ The NEWID() function can create the GUID in a column default.
■ The NEWSEQUENTIALID() function can create the GUID in a column default. This is the
only method that avoids the page split performance issues with GUIDs. If you must use a
GUID, then I strongly recommend using NEWSEQUENTIALID().
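As a minimal sketch of the last option, the batch below defines a NEWSEQUENTIALID() column default and uses the OUTPUT clause to see the generated keys. The temp table #DemoCategory and its rows are hypothetical and stand apart from the OBXKites tables.

```sql
-- Hypothetical temp table; NEWSEQUENTIALID() is valid only as a column default.
CREATE TABLE #DemoCategory (
    CategoryID   UNIQUEIDENTIFIER NOT NULL
                 DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
    CategoryName NVARCHAR(50) NOT NULL
);

-- The key column is omitted, so the default supplies each GUID.
-- OUTPUT returns the generated values, since no function can fetch them afterward.
INSERT #DemoCategory (CategoryName)
OUTPUT Inserted.CategoryID, Inserted.CategoryName
VALUES (N'Kites'), (N'Accessories');

DROP TABLE #DemoCategory;
```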
The following sample code demonstrates various methods of generating GUID primary keys during the
addition of new rows to the ProductCategory table in the OBXKites database. The first query
simply tests the NEWID() function:
USE OBXKites;
Select NewID();
Result:
5CBB2800-5207-4323-A316-E963AACB6081
The next three queries insert a GUID, each using a different method of generating the GUID:
-- GUID from Default (the column's default is NewID())
INSERT dbo.ProductCategory
(ProductCategoryID, ProductCategoryName)