Microsoft SQL Server 2008 Tutorial, Part 41


DatePosition DATE NOT NULL )

INSERT dbo.Dept (DeptName, RaiseFactor)
  VALUES ('Engineering', 1.2),
         ('Sales', 0.8),
         ('IT', 2.5),
         ('Manufacturing', 1.0);

INSERT dbo.Employee (DeptID, LastName, FirstName,
    Salary, PerformanceRating, DateHire, DatePosition)
  VALUES (1, 'Smith', 'Sam', 54000, 2.0, '19970101', '19970101'),
         (1, 'Nelson', 'Slim', 78000, 1.5, '19970101', '19970101'),
         (2, 'Ball', 'Sally', 45000, 3.5, '19990202', '19990202'),
         (2, 'Kelly', 'Jeff', 85000, 2.4, '20020625', '20020625'),
         (3, 'Guelzow', 'Jo', 120000, 4.0, '19991205', '19991205'),
         (3, 'Anderson', 'Missy', 95000, 1.8, '19980201', '19980201'),
         (4, 'Reagan', 'Sam', 75000, 2.9, '20051215', '20051215'),
         (4, 'Adams', 'Hank', 34000, 3.2, '20080501', '20080501');

When developing complex queries, I work from the inside out. The first step performs the date math; it selects the data required for the raise calculation, assuming June 25, 2009, is the effective date of the raise, and ensures the performance rating won’t count if it’s below 2:

SELECT EmployeeID, Salary,
    CAST(CAST(DATEDIFF(d, DateHire, '20090625')
      AS DECIMAL(7, 2)) / 365.25 AS INT) AS YrsCo,
    CAST(CAST(DATEDIFF(d, DatePosition, '20090625')
      AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) AS MoPos,
    CASE WHEN Employee.PerformanceRating >= 2
         THEN Employee.PerformanceRating
         ELSE 0
    END AS Perf,
    Dept.RaiseFactor
  FROM dbo.Employee
    JOIN dbo.Dept
      ON Employee.DeptID = Dept.DeptID;

Result:

EmployeeID  Salary    YrsCo  MoPos  Perf  RaiseFactor
----------  --------  -----  -----  ----  -----------
4           85000.00  7      84     2.40  0.80

The next step in developing this query is to add the raise calculation. The simplest way to see the calculation is to pull the values already generated from a subquery:

SELECT EmployeeID, Salary,
    (2 + ((YearsCompany * .1) + (MonthPosition * .02)
      + (Performance * .5)) * RaiseFactor) / 100 AS EmpRaise
  FROM (SELECT EmployeeID, FirstName, LastName, Salary,
          CAST(CAST(DATEDIFF(d, DateHire, '20090625')
            AS DECIMAL(7, 2)) / 365.25 AS INT) AS YearsCompany,
          CAST(CAST(DATEDIFF(d, DatePosition, '20090625')
            AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) AS MonthPosition,
          CASE WHEN Employee.PerformanceRating >= 2
               THEN Employee.PerformanceRating
               ELSE 0
          END AS Performance,
          Dept.RaiseFactor
        FROM dbo.Employee
          JOIN dbo.Dept
            ON Employee.DeptID = Dept.DeptID) AS SubQuery;

Result:

EmployeeID  Salary     EmpRaise
----------  ---------  -----------
5           120000.00  0.149500000
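The row shown above can be hand-checked without touching the tables. The following is a minimal sketch that feeds the values for EmployeeID 5 from the sample INSERT into the same expressions; the variable names and their data types are assumptions made for illustration, because the column definitions are not shown in this part.

```sql
-- Hand-check of the raise formula for EmployeeID 5 (sample data above):
-- hired and in position since 1999-12-05, rating 4.0, IT raise factor 2.5.
-- Variable names and types are illustrative assumptions.
DECLARE @DateHire          DATE          = '19991205',
        @DatePosition      DATE          = '19991205',
        @PerformanceRating DECIMAL(3, 1) = 4.0,
        @RaiseFactor       DECIMAL(3, 1) = 2.5;

SELECT YearsCompany, MonthPosition,
    (2 + ((YearsCompany * .1) + (MonthPosition * .02)
      + (@PerformanceRating * .5)) * @RaiseFactor) / 100 AS EmpRaise
  FROM (SELECT
          CAST(CAST(DATEDIFF(d, @DateHire, '20090625')
            AS DECIMAL(7, 2)) / 365.25 AS INT) AS YearsCompany,
          CAST(CAST(DATEDIFF(d, @DatePosition, '20090625')
            AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) AS MonthPosition
       ) AS Inputs;

-- 3,490 days elapse between the two dates, so YearsCompany = 9 and
-- MonthPosition = 114. By hand:
-- (2 + (0.9 + 2.28 + 2.0) * 2.5) / 100 = 0.1495
```

The hand arithmetic in the closing comment agrees with the EmpRaise value in the result row.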

The last query was relatively easy to read, but there’s no logical reason for the subquery. The query could be rewritten combining the date calculations and the case expression into the raise formula:

SELECT EmployeeID, Salary,
    (2
      -- years with company
      + ((CAST(CAST(DATEDIFF(d, DateHire, '20090625')
            AS DECIMAL(7, 2)) / 365.25 AS INT) * .1)
      -- months in position
      + (CAST(CAST(DATEDIFF(d, DatePosition, '20090625')
            AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) * .02)
      -- Performance Rating minimum
      + (CASE WHEN Employee.PerformanceRating >= 2
              THEN Employee.PerformanceRating
              ELSE 0
         END * .5))
      -- Raise Factor
      * RaiseFactor) / 100 AS EmpRaise
  FROM dbo.Employee
    JOIN dbo.Dept
      ON Employee.DeptID = Dept.DeptID;

It’s easy to verify that this query gets the same result, but which is the better query? From a performance perspective, both queries generate the exact same query execution plan. When considering maintenance and readability, I’d probably go with the second query, carefully formatted and commented.

The final step is to convert the query into an UPDATE command. The hard part is already done; it just needs the UPDATE verb at the front of the query:

UPDATE dbo.Employee
  SET Salary = Salary *
    (1 + ((2
      -- years with company
      + ((CAST(CAST(DATEDIFF(d, DateHire, '20090625')
            AS DECIMAL(7, 2)) / 365.25 AS INT) * .1)
      -- months in position
      + (CAST(CAST(DATEDIFF(d, DatePosition, '20090625')
            AS DECIMAL(7, 2)) / 365.25 * 12 AS INT) * .02)
      -- Performance Rating minimum
      + (CASE WHEN Employee.PerformanceRating >= 2
              THEN Employee.PerformanceRating
              ELSE 0
         END * .5))
      -- Raise Factor
      * RaiseFactor) / 100))
  FROM dbo.Employee
    JOIN dbo.Dept
      ON Employee.DeptID = Dept.DeptID;

A quick check of the data confirms that the update was successful:

SELECT FirstName, LastName, Salary
  FROM dbo.Employee;

Result:

FirstName  LastName  Salary
---------  --------  ---------
Slim       Nelson    83472.48
Missy      Anderson  105972.50

The final step of the exercise is to clean up the sample tables:

DROP TABLE dbo.Employee, dbo.Dept;

This sample code pulls together techniques from many of the previous chapters: creating and dropping tables, CASE expressions, joins, and date scalar functions, not to mention the inserts and updates from this chapter. The example is long because it demonstrates more than just the UPDATE statement. It also shows the typical process of developing a complex UPDATE, which includes the following:

1. Checking the available data: The first SELECT joins Employee and Dept, and lists all the columns required for the formula.

2. Testing the formula: The second SELECT is based on the initial SELECT and assembles the formula from the required columns. From this data, a couple of rows can be hand-tested against the specs, and the formula verified.

3. Performing the update: Once the formula is constructed and verified, the formula is edited into an UPDATE statement and executed.
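One way to make the third step less risky is to run the UPDATE inside an explicit transaction and inspect the before and after values before committing. The following is a sketch only: it reuses the dbo.Employee and dbo.Dept tables from the example, but the SET expression is a deliberately simplified stand-in for the full raise formula, and plain OUTPUT (without INTO) assumes the table has no enabled triggers.

```sql
BEGIN TRANSACTION;

UPDATE dbo.Employee
  -- Simplified stand-in formula, not the full raise calculation
  SET Salary = Salary * (1 + Dept.RaiseFactor / 100)
  -- OUTPUT returns the old and new value of every updated row
  OUTPUT deleted.EmployeeID,
         deleted.Salary  AS OldSalary,
         inserted.Salary AS NewSalary
  FROM dbo.Employee
    JOIN dbo.Dept
      ON Employee.DeptID = Dept.DeptID;

-- After hand-checking a few OUTPUT rows, discard or keep the change:
ROLLBACK TRANSACTION;   -- replace with COMMIT TRANSACTION once verified
```

Running the statement with ROLLBACK first gives a dry run of the update against real data.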

The SQL UPDATE command is powerful. I have replaced terribly complex record sets and nested loops that were painfully slow and error-prone with UPDATE statements and creative joins that worked well, and I have seen execution times reduced from hours to a few seconds. I cannot overemphasize the importance of approaching the selection and updating of data in terms of data sets, rather than data rows.

Deleting Data

The DELETE command is dangerously simple. In its basic form, it deletes all the rows from a table. Because the DELETE command is a row-based operation, it doesn’t require specifying any column names. The first FROM is optional, as are the second FROM and the WHERE conditions. However, although the WHERE clause is optional, it is the primary subject of concern when you’re using the DELETE command. Here’s an abbreviated syntax for the DELETE command:

DELETE [FROM] schema.Table

[FROM data sources]

[WHERE condition(s)];

Notice that everything is optional except the actual DELETE command and the table name. The following command would delete all data from the Product table, no questions asked and no second chances:

DELETE FROM OBXKites.dbo.Product;

SQL Server has no inherent “undo” command. Once a transaction is committed, that’s it. That’s why the WHERE clause is so important when you’re deleting.

By far, the most common use of the DELETE command is to delete a single row. The primary key is usually the means of selecting the row:

USE OBXKites;

DELETE FROM dbo.Product
  WHERE ProductID = 'DB8D8D60-76F4-46C3-90E6-A8648F63C0F0';
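Because a committed delete cannot be undone, a cautious pattern is to wrap the DELETE in an explicit transaction and commit only when the expected number of rows was affected. This is a minimal sketch of that idea using the same single-row delete; the variable name @RowsDeleted is introduced here for illustration.

```sql
USE OBXKites;

DECLARE @RowsDeleted INT;

BEGIN TRANSACTION;

DELETE FROM dbo.Product
  WHERE ProductID = 'DB8D8D60-76F4-46C3-90E6-A8648F63C0F0';

-- Capture the count immediately; the next statement resets @@ROWCOUNT
SET @RowsDeleted = @@ROWCOUNT;

IF @RowsDeleted = 1
  COMMIT TRANSACTION;     -- exactly the one intended row was removed
ELSE
  ROLLBACK TRANSACTION;   -- zero or several rows: undo and investigate
```

The same guard catches a forgotten or mistyped WHERE clause, because the row count would not be 1.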

Referencing multiple data sources while deleting

There are two techniques for referencing multiple data sources while deleting rows: the double FROM clause and subqueries.

The UPDATE command uses the FROM clause to join the updated table with other tables for more flexible row selection. The DELETE command can use the exact same technique. When using this method, the first optional FROM can make it look confusing. To improve readability and consistency, I recommend that you omit the first FROM in your code.

For example, the following DELETE statement omits the first FROM clause and uses the second FROM clause to join Product with ProductCategory so that the WHERE clause can filter the DELETE based on the ProductCategoryName. This query removes all videos from the Product table:

DELETE dbo.Product
  FROM dbo.Product
    JOIN dbo.ProductCategory
      ON Product.ProductCategoryID
        = ProductCategory.ProductCategoryID
  WHERE ProductCategory.ProductCategoryName = 'Video';

The second method looks more complicated at first glance, but it’s ANSI standard and the preferred method. A correlated subquery actually selects the rows to be deleted, and the DELETE command just picks up those rows for the delete operation. It’s a very clean query:

DELETE FROM dbo.Product
  WHERE EXISTS
    (SELECT *
       FROM dbo.ProductCategory AS pc
       WHERE pc.ProductCategoryID = Product.ProductCategoryID
         AND pc.ProductCategoryName = 'Video');

In terms of performance, both methods generate the exact same query execution plan.

As with the UPDATE command’s FROM clause, the DELETE command’s second FROM clause is not an ANSI SQL standard. If portability is important to your project, then use a subquery to reference additional tables.

Cascading deletes

Referential integrity (RI) refers to the idea that no secondary row foreign key should point to a primary row primary key unless that primary row does in fact exist. This means that an attempt to delete a primary row will fail if a foreign-key value somewhere points to that primary row.

For more information about referential integrity and when to use it, turn to Chapter 3, “Relational Database Design,” and Chapter 20, “Creating the Physical Database Schema.”

When implemented correctly, referential integrity will block any delete operation that would result in a foreign key value without a corresponding primary key value. The way around this is to first delete the secondary rows that point to the primary row, and then delete the primary row. This technique is called a cascading delete. In a complex database schema, the cascade might bounce down several levels before working its way back up to the original row being deleted.

There are two ways to implement a cascading delete: manually with triggers or automatically with

declared referential integrity (DRI) via foreign keys.

Implementing cascading deletes manually is a lot of work. Triggers are significantly slower than foreign keys (which are checked as part of the query execution plan), and trigger-based cascading deletes usually also handle the foreign key checks. While this was commonplace a decade ago, today trigger-based cascading deletes are very rare and might only be needed with a very complex nonstandard foreign key design that includes business rules in the foreign key. If you’re doing that, then you’re either very new at this or very, very good.

Fortunately, SQL Server offers cascading deletes as a function of the foreign key. Cascading deletes may be enabled via Management Studio, in the Foreign Key Relationship dialog, or in SQL code.

The sample script that creates the Cape Hatteras Adventures version 2 database (CHA2_Create.sql) provides a good example of setting the cascade-delete option for referential integrity. In this case, if either the event or the guide is deleted, then the rows in the event-guide many-to-many table are also deleted. The ON DELETE CASCADE foreign-key option is what actually specifies the cascade action:

CREATE TABLE dbo.Event_mm_Guide (
  EventGuideID
    INT IDENTITY NOT NULL PRIMARY KEY,
  EventID
    INT NOT NULL
    FOREIGN KEY REFERENCES dbo.Event ON DELETE CASCADE,
  GuideID
    INT NOT NULL
    FOREIGN KEY REFERENCES dbo.Guide ON DELETE CASCADE,
  LastName
    VARCHAR(50) NOT NULL
  )
  ON [PRIMARY];
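To see the cascade in action without the Cape Hatteras Adventures schema, the following self-contained sketch uses two hypothetical tables, dbo.DemoParent and dbo.DemoChild, which are not part of any sample database. Deleting one parent row removes its child rows automatically.

```sql
USE tempdb;

-- Hypothetical demo tables, created only for this illustration
CREATE TABLE dbo.DemoParent (
  ParentID INT NOT NULL PRIMARY KEY
  );

CREATE TABLE dbo.DemoChild (
  ChildID  INT NOT NULL PRIMARY KEY,
  ParentID INT NOT NULL
    FOREIGN KEY REFERENCES dbo.DemoParent (ParentID) ON DELETE CASCADE
  );

INSERT dbo.DemoParent (ParentID) VALUES (1), (2);
INSERT dbo.DemoChild (ChildID, ParentID) VALUES (10, 1), (11, 1), (20, 2);

-- Deleting parent 1 also deletes child rows 10 and 11
DELETE dbo.DemoParent WHERE ParentID = 1;

-- Only child row 20 (ParentID 2) remains
SELECT ChildID, ParentID FROM dbo.DemoChild;

-- Clean up: the child is listed first so its foreign key goes first
DROP TABLE dbo.DemoChild, dbo.DemoParent;
```

Without ON DELETE CASCADE, the same DELETE would fail with a foreign key violation while the child rows exist.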

As a caution, cascading deletes, or even referential integrity, are not suitable for every relationship. It depends on the permanence of the secondary row. If deleting the primary row makes the secondary row moot or meaningless, then cascading the delete makes good sense; but if the secondary row is still a valid row after the primary row is deleted, then referential integrity and cascading deletes would cause the database to break its representation of reality.

As an example of determining the usefulness of cascading deletes from the Cape Hatteras Adventures database, consider that if a tour is deleted, then all scheduled events for that tour become meaningless, as do the many-to-many schedule tables between event and customer, and between event and guide. Conversely, a tour must have a base camp, so referential integrity is required on the Tour.BaseCampID foreign key. However, if a base camp is deleted, then the tours originating from that base camp might still be valid (if they can be rescheduled to another base camp), so cascading a base-camp delete down to the tour is not a reasonable action. If RI is on and cascading deletes are off, then a base camp with tours cannot be deleted until all tours for that base camp are either manually deleted or reassigned to other base camps.
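The reassign-then-delete sequence can be sketched as a single transaction. The table names dbo.Tour and dbo.BaseCamp, the integer key type, and the ID values below are assumptions for illustration; only the Tour.BaseCampID foreign key is named in the text.

```sql
-- Assumed names: dbo.Tour and dbo.BaseCamp, each with an integer BaseCampID.
-- The ID values are hypothetical.
DECLARE @OldBaseCampID INT = 1,
        @NewBaseCampID INT = 2;

BEGIN TRANSACTION;

-- Reassign the dependent tours first so that no foreign key is left dangling
UPDATE dbo.Tour
  SET BaseCampID = @NewBaseCampID
  WHERE BaseCampID = @OldBaseCampID;

-- With no tours pointing at it, the base camp can now be deleted
DELETE dbo.BaseCamp
  WHERE BaseCampID = @OldBaseCampID;

COMMIT TRANSACTION;
```

Production code would likely add TRY...CATCH handling so that a failure in either statement rolls back both.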

Alternatives to physically deleting data

Some database developers choose to completely avoid deleting data. Instead, they build systems to remove the data from the user’s view while retaining the data for safekeeping (like dBase ][ did). This can be done in several different ways:

■ A logical-delete bit flag, or nullable MomentDeleted column, in the row can indicate that the row is deleted. This makes deleting or restoring a single row a straightforward matter of setting or clearing a bit. However, because a relational database involves multiple related tables, there’s more work to it than that. All queries must check the logical-delete flag and filter out logically deleted rows. This means that a bit column (with extremely poor selectivity) is probably an important index for every query. While SQL Server 2008’s new filtered indexes are a perfect fit, it’s still a performance killer.

  To make matters worse, because the rows still physically exist in SQL Server, and SQL Server’s declarative referential integrity does not know about the logical-delete flag, custom referential integrity and cascading of logical delete flags are also required. Restoring, or undeleting, cascaded logical deletes can become a nightmare.

  The cascading logical deletes method is complex to code and difficult to maintain. This is a case of complexity breeding complexity, and I no longer recommend this method.

■ Another alternative to physically deleting rows is to archive the deleted rows in an archive or audit table. This method is best implemented by an INSTEAD OF trigger that copies the data to the alternative location and then physically deletes the rows from the production database.

  This method offers several advantages. Data is physically removed from the database, so there’s no need to artificially modify SELECT queries or index on a bit column. Physically removing the data enables SQL Server referential integrity to remain in effect. In addition, the database is not burdened with unnecessary data. Retrieving archived data remains relatively straightforward and can be easily accomplished with a view that selects data from the archive location.
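The archive approach can be sketched with an INSTEAD OF DELETE trigger. The tables dbo.Widget and dbo.WidgetArchive and the trigger name are hypothetical, invented for this illustration; the MomentDeleted column borrows the name mentioned in the first bullet. This is a minimal outline, not the audit system described in the book.

```sql
-- Hypothetical production and archive tables
CREATE TABLE dbo.Widget (
  WidgetID   INT NOT NULL PRIMARY KEY,
  WidgetName VARCHAR(50) NOT NULL
  );

CREATE TABLE dbo.WidgetArchive (
  WidgetID      INT NOT NULL,
  WidgetName    VARCHAR(50) NOT NULL,
  MomentDeleted DATETIME2 NOT NULL DEFAULT SYSDATETIME()
  );
GO

CREATE TRIGGER dbo.Widget_ArchiveDelete ON dbo.Widget
INSTEAD OF DELETE
AS
BEGIN
  SET NOCOUNT ON;

  -- Copy every row targeted by the DELETE into the archive
  INSERT dbo.WidgetArchive (WidgetID, WidgetName)
    SELECT WidgetID, WidgetName
      FROM deleted;

  -- Then physically remove those rows from the production table
  DELETE dbo.Widget
    FROM dbo.Widget
      JOIN deleted
        ON Widget.WidgetID = deleted.WidgetID;
END;
```

The trigger works on the whole deleted set, so a multi-row DELETE is archived in one pass. GO is the batch separator used by Management Studio and sqlcmd; it is needed because CREATE TRIGGER must be the first statement in its batch.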

Chapter 53, “Data Audit Triggers,” details how to automatically generate the audit system discussed here that stores, views, and recovers deleted rows.

Merging Data

An upsert operation is a logical combination of an insert and an update. If the data isn’t already in the table, the upsert inserts the data; if the data is already in the table, then the upsert updates with the differences. Ignoring for a moment the new MERGE command in SQL Server 2008, there are a few ways to code an upsert operation with T-SQL:

■ The most common method is to attempt to locate the data with an IF EXISTS; if the row was found, UPDATE; otherwise, INSERT.

■ If the most common use case is that the row exists and the UPDATE was needed, then the best method is to do the update, and if @@RowCount = 0, then the row was new and the insert should be performed.

■ If the overwhelming use case is that the row would be new to the database, then TRY to INSERT the new row; if a unique index blocked the INSERT and fired an error, then CATCH the error and UPDATE instead.
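The three patterns above can be sketched side by side. The temporary table #Setting and the variables are hypothetical, used only for illustration, and the sketch ignores concurrency: under load, each pattern would need appropriate locking or isolation to stay correct.

```sql
-- Hypothetical table and values, used only to illustrate the three patterns
CREATE TABLE #Setting (
  SettingName  VARCHAR(50) NOT NULL PRIMARY KEY,
  SettingValue VARCHAR(50) NOT NULL
  );

DECLARE @Name  VARCHAR(50) = 'Theme',
        @Value VARCHAR(50) = 'Dark';

-- 1. IF EXISTS: look first, then UPDATE or INSERT
IF EXISTS (SELECT * FROM #Setting WHERE SettingName = @Name)
  UPDATE #Setting SET SettingValue = @Value WHERE SettingName = @Name;
ELSE
  INSERT #Setting (SettingName, SettingValue) VALUES (@Name, @Value);

-- 2. UPDATE first; INSERT only if no row was touched
UPDATE #Setting SET SettingValue = @Value WHERE SettingName = @Name;
IF @@ROWCOUNT = 0
  INSERT #Setting (SettingName, SettingValue) VALUES (@Name, @Value);

-- 3. INSERT first; UPDATE if the unique key rejects the row
BEGIN TRY
  INSERT #Setting (SettingName, SettingValue) VALUES (@Name, @Value);
END TRY
BEGIN CATCH
  -- 2601 = duplicate key in a unique index, 2627 = constraint violation
  IF ERROR_NUMBER() IN (2601, 2627)
    UPDATE #Setting SET SettingValue = @Value WHERE SettingName = @Name;
  ELSE
  BEGIN
    -- Re-raise anything unexpected
    DECLARE @Msg NVARCHAR(2048) = ERROR_MESSAGE();
    RAISERROR('%s', 16, 1, @Msg);
  END
END CATCH;

DROP TABLE #Setting;
```

Run in sequence, pattern 1 inserts the row, pattern 2 updates it, and pattern 3 hits the primary key and falls through to the UPDATE.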

All three methods are potentially obsolete with the new MERGE command. The MERGE command is very well done by Microsoft: it solves a complex problem well with a clean syntax and good performance.

First, it’s called “merge” because it does more than an upsert. Upsert only inserts or updates; merge can be directed to insert, update, and delete all in one command.

In a nutshell, MERGE sets up a join between the source table and the target table, and can then perform operations based on matches between the two tables.

To walk through a merge scenario, the following example sets up an airline flight check-in scenario. The main work table is FlightPassengers, which holds data about reservations. It’s updated as travelers check in, and by the time the flight takes off, it has the actual final passenger list and seat assignments.

In the sample scenario, four passengers are scheduled to fly SQL Server Airlines flight 2008 (Denver to Seattle) on March 1, 2009. Poor Jerry: he has a middle seat in the last row of the plane, the row that doesn’t recline:

USE tempdb;

-- Merge Target Table
CREATE TABLE FlightPassengers (
  FlightID INT NOT NULL
    IDENTITY PRIMARY KEY,
  LastName VARCHAR(50) NOT NULL,
  FirstName VARCHAR(50) NOT NULL,
  FlightCode CHAR(6) NOT NULL,
  FlightDate DATE NOT NULL,
  Seat CHAR(3) NOT NULL
  );

INSERT FlightPassengers
    (LastName, FirstName, FlightCode, FlightDate, Seat)
  VALUES ('Nielsen', 'Paul', 'SS2008', '20090301', '9F'),
         ('Jenkins', 'Sue', 'SS2008', '20090301', '7A'),
         ('Smith', 'Sam', 'SS2008', '20090301', '19A'),
         ('Nixon', 'Jerry', 'SS2008', '20090301', '29B');

The day of the flight, the check-in counter records all the passengers as they arrive, and their seat assignments, in the CheckIn table. One passenger doesn’t show, a new passenger buys a ticket, and Jerry decides today is a good day to burn an upgrade coupon:

-- Merge Source table
CREATE TABLE CheckIn (
  LastName VARCHAR(50),
  FirstName VARCHAR(50),
  FlightCode CHAR(6),
  FlightDate DATE,
  Seat CHAR(3)
  );

INSERT CheckIn (LastName, FirstName, FlightCode, FlightDate, Seat)
  VALUES ('Nielsen', 'Paul', 'SS2008', '20090301', '9F'),
         ('Jenkins', 'Sue', 'SS2008', '20090301', '7A'),
         ('Nixon', 'Jerry', 'SS2008', '20090301', '2A'),
         ('Anderson', 'Missy', 'SS2008', '20090301', '4B');

Before the MERGE command is executed, the next three queries look for differences in the data.

The first set-difference query returns any no-show passengers. A LEFT OUTER JOIN between the FlightPassengers and CheckIn tables finds every passenger with a reservation joined with their CheckIn row if the row is available. If no CheckIn row is found, then the LEFT OUTER JOIN fills in the CheckIn columns with nulls. Filtering for the null returns only those passengers who made a reservation but didn’t make the flight:

-- NoShows
SELECT F.FirstName + ' ' + F.LastName AS Passenger, F.Seat
  FROM FlightPassengers AS F
    LEFT OUTER JOIN CheckIn AS C
      ON C.LastName = F.LastName
        AND C.FirstName = F.FirstName
        AND C.FlightCode = F.FlightCode
        AND C.FlightDate = F.FlightDate
  WHERE C.LastName IS NULL;

Result:

Passenger  Seat
---------  ----
Sam Smith  19A
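As a cross-check on the outer-join approach, the same no-show set can be found with the EXCEPT operator, which returns the rows of the first query that do not appear in the second. This is an alternative sketch, not the method used in the walkthrough. Seat is left out of the column lists on purpose, because a changed seat would otherwise make a checked-in passenger look like a no-show.

```sql
-- Reservations that have no matching check-in row
SELECT FirstName, LastName, FlightCode, FlightDate
  FROM FlightPassengers
EXCEPT
SELECT FirstName, LastName, FlightCode, FlightDate
  FROM CheckIn;
```

With the sample data, the only reservation without a check-in row is Sam Smith on flight SS2008.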

The walk-up check-in query uses a LEFT OUTER JOIN and an IS NULL in the WHERE clause to locate any passengers who are in the CheckIn table but not in the FlightPassengers table:

-- Walk Up CheckIn
SELECT C.FirstName + ' ' + C.LastName AS Passenger, C.Seat
  FROM CheckIn AS C
    LEFT OUTER JOIN FlightPassengers AS F
      ON C.LastName = F.LastName
        AND C.FirstName = F.FirstName
        AND C.FlightCode = F.FlightCode
        AND C.FlightDate = F.FlightDate
  WHERE F.LastName IS NULL;

Result:

Passenger       Seat
--------------  ----
Missy Anderson  4B

The last difference query lists any seat changes, including Jerry’s upgrade to first class. This query uses an inner join because it’s searching for passengers who both had previous seat assignments and now are boarding with a seat assignment. The query compares the Seat columns from the FlightPassengers and CheckIn tables using a not-equal comparison, which finds any passengers with a different seat than previously assigned. Go Jerry!

-- Seat Changes
SELECT C.FirstName + ' ' + C.LastName AS Passenger,
    F.Seat AS 'previous seat', C.Seat AS 'final seat'
  FROM CheckIn AS C
    INNER JOIN FlightPassengers AS F
      ON C.LastName = F.LastName
        AND C.FirstName = F.FirstName
        AND C.FlightCode = F.FlightCode
        AND C.FlightDate = F.FlightDate
        AND C.Seat <> F.Seat
  WHERE F.Seat IS NOT NULL;

Result:

Passenger    previous seat  final seat
-----------  -------------  ----------
Jerry Nixon  29B            2A

For another explanation of set difference queries, flip over to Chapter 10, “Merging Data with Joins and Unions.”

With the scenario’s data in place and verified with set-difference queries, it’s time to merge the check-in data into the FlightPassengers table.
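This part ends before the MERGE statement itself appears, so the following is only a sketch of what such a merge could look like for the two tables above, not the author’s own statement. It joins on the same four columns as the set-difference queries and maps each difference to an action.

```sql
MERGE FlightPassengers AS F          -- target
  USING CheckIn AS C                 -- source
    ON C.LastName = F.LastName
      AND C.FirstName = F.FirstName
      AND C.FlightCode = F.FlightCode
      AND C.FlightDate = F.FlightDate
  -- Seat changes: the passenger is in both tables with a different seat
  WHEN MATCHED AND C.Seat <> F.Seat
    THEN UPDATE SET Seat = C.Seat
  -- Walk-ups: checked in but not in the reservation table
  WHEN NOT MATCHED BY TARGET
    THEN INSERT (LastName, FirstName, FlightCode, FlightDate, Seat)
         VALUES (C.LastName, C.FirstName, C.FlightCode, C.FlightDate, C.Seat)
  -- No-shows: reserved but never checked in.
  -- Caution: this removes EVERY target row without a source match,
  -- which is only safe here because the table holds a single flight.
  WHEN NOT MATCHED BY SOURCE
    THEN DELETE;
```

With the sample data, this would update Jerry Nixon’s seat to 2A, insert Missy Anderson, and delete Sam Smith. Note that MERGE must be terminated with a semicolon.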
