Enabling all tables
Enabling every table in a large database for Change Tracking can be cumbersome, because it means scripting the
ALTER command for every table. Fortunately, sp_MSforeachtable, an undocumented Microsoft
stored procedure, is the salve that binds the wound.
sp_MSforeachtable executes like a cursor, executing a command, enclosed in single quotes, for
every table in the current database. The ? placeholder is replaced with the schema.tablename
for every table. If an error occurs, then it’s reported in the message pane, but sp_MSforeachtable
trudges along with the next table.
This script enables Change Tracking for every table in the current database:
EXEC sp_MSforeachtable
  'ALTER TABLE ? ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);';
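Because sp_MSforeachtable attempts the ALTER on every table, it can be helpful to preview the commands before running anything. The following is a minimal sketch, not the original script: it only generates one ALTER statement per user table that has a primary key (Change Tracking requires one) and is not already tracked. It assumes Change Tracking has already been enabled at the database level, and the EnableCommand column alias is an illustrative name.

```sql
-- Sketch: generate (but do not execute) the ENABLE CHANGE_TRACKING commands.
SELECT 'ALTER TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name)
     + ' ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);'
         AS EnableCommand
  FROM sys.tables AS t
    JOIN sys.schemas AS s
      ON t.schema_id = s.schema_id
    LEFT OUTER JOIN sys.change_tracking_tables AS ct
      ON ct.object_id = t.object_id
  -- Change Tracking requires a primary key on the table
  WHERE OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') = 1
    -- skip tables that are already being tracked
    AND ct.object_id IS NULL
  ORDER BY s.name, t.name;
```

The generated statements can be copied into a query window, reviewed, and executed selectively.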
Internal tables
Change Tracking stores its data in internal tables. There’s no reason to directly query these tables to
use Change Tracking. However, it is useful to look at the space used by these tables when considering
the cost of using Change Tracking and to estimate disk usage.
Query sys.internal_tables to find the internal tables. Of course, your Change Tracking table(s)
will have a different name:
SELECT s.name + '.' + o.name as [table],
    i.name as [ChangeTracking],
    ct.is_track_columns_updated_on,
    ct.min_valid_version,
    ct.begin_version, ct.cleanup_version
  FROM sys.internal_tables i
    JOIN sys.objects o
      ON i.parent_id = o.object_id
    JOIN sys.schemas s
      ON o.schema_id = s.schema_id
    JOIN sys.change_tracking_tables ct
      ON o.object_id = ct.object_id
  WHERE i.name LIKE 'change_tracking%'
  ORDER BY [table]
Result (abbreviated):
table                      ChangeTracking
-------------------------  -----------------------------
HumanResources.Department  sys.change_tracking_757577737
Armed with the name, it’s easy to find the disk space used. Because Change Tracking was just enabled
in this database, the internal table is still empty:
EXEC sp_spaceused 'sys.change_tracking_757577737'
Result:
name                       rows  reserved  data  index_size  unused
-------------------------  ----  --------  ----  ----------  ------
change_tracking_757577737  0     0 KB      0 KB  0 KB        0 KB
This query combines the Change Tracking configuration with the internal name:
SELECT s.name + '.' + o.name as [table],
i.name as [ChangeTracking],
ct.is_track_columns_updated_on,
ct.min_valid_version,
ct.begin_version, ct.cleanup_version
FROM sys.internal_tables i
JOIN sys.objects o
ON i.parent_id = o.object_id
JOIN sys.schemas s
ON o.schema_id = s.schema_id
JOIN sys.change_tracking_tables ct
ON o.object_id = ct.object_id
WHERE i.name LIKE 'change_tracking%'
ORDER BY [table]
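Running sp_spaceused once per internal table gets tedious when many tables are tracked. As a rough sketch, the space for all of the Change Tracking internal tables can be summarized in one pass by joining to the sys.dm_db_partition_stats dynamic management view. This assumes the caller has VIEW DATABASE STATE permission and that the internal tables are identified by internal_type_desc = 'CHANGE_TRACKING'; the rows and reserved_KB aliases are illustrative names.

```sql
-- Sketch: row count and reserved space for every Change Tracking internal table.
SELECT s.name + '.' + o.name AS [table],
       i.name AS [ChangeTracking],
       -- count rows only once per table (heap or clustered index)
       SUM(CASE WHEN ps.index_id IN (0, 1)
                THEN ps.row_count ELSE 0 END) AS [rows],
       -- pages are 8 KB each
       SUM(ps.reserved_page_count) * 8 AS [reserved_KB]
  FROM sys.internal_tables i
    JOIN sys.objects o
      ON i.parent_id = o.object_id
    JOIN sys.schemas s
      ON o.schema_id = s.schema_id
    JOIN sys.dm_db_partition_stats ps
      ON ps.object_id = i.object_id
  WHERE i.internal_type_desc = 'CHANGE_TRACKING'
  GROUP BY s.name, o.name, i.name
  ORDER BY [reserved_KB] DESC;
```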
Querying Change Tracking
Once Change Tracking is enabled for a table, SQL Server begins to store information about which rows
have changed. This data may be queried to select only the changed data from the source table, which is perfect
for synchronization.
Version numbers
Key to understanding Change Tracking is that Change Tracking numbers every transaction that modifies a
tracked table with a database-wide version number, which becomes important when working with the changed data. This
version number may be viewed using a function:
SELECT Change_tracking_current_version();
Result:
0
The current version number is the number of the latest Change Tracking version stored by Change
Tracking, so if the current version is 5, then there is a version 5 in the database, and the next
transaction will be version 6.
The following code makes inserts and updates to the HumanResources.Department table while
watching the Change Tracking version number:
INSERT HumanResources.Department (Name, GroupName)
  VALUES ('CT New Row', 'SQL Rocks'),
         ('Test Two', 'SQL Rocks');
SELECT Change_tracking_current_version();
Result:
1
The inserts added two new rows, with primary key values of DepartmentID 17 and 18.
And now an update:
UPDATE HumanResources.Department
  SET Name = 'Changed Name'
  WHERE Name = 'CT New Row';
The update affected row DepartmentID = 17.
Testing the Change Tracking version shows that it has been incremented to 2:
SELECT Change_tracking_current_version();
Result:
2
The version number is critical to querying ChangeTable (explained in the next section), and it must
be within the range of the oldest possible version number for a given table and the current database
version number. The old data is probably being cleaned up automatically, so the oldest possible version
number will likely vary for each table.
The following query can report the valid version number range for any table. In this case, it returns the
current valid queryable range for HumanResources.Department:
SELECT
Change_tracking_min_valid_version
(Object_id(N'HumanResources.Department')) as 'oldest',
Change_tracking_current_version() as ‘current’;
Result:
Changes by the row
Here’s where Change Tracking shows results. The primary keys of the rows that have been modified
since (or after) a given version number can be found by querying the ChangeTable table-valued
function, passing to it the Change Tracking table and a beginning version number. For example,
passing table XYZ and version number 10 to ChangeTable will return the changes for version 11
and following that were made to table XYZ. Think of the version number as the number of the last
synchronization, so this synchronization needs all the changes after the last synchronization.
In this case, the Change Tracking table is HumanResources.Department and the beginning version
is 0:
SELECT *
FROM ChangeTable
(Changes HumanResources.Department, 0) as CT;
Result:
SYS_CHANGE_VERSION  SYS_CHANGE_CREATION_VERSION  SYS_CHANGE_OPERATION  SYS_CHANGE_COLUMNS  SYS_CHANGE_CONTEXT  DepartmentID
------------------  ---------------------------  --------------------  ------------------  ------------------  ------------
Since version number 0, two rows have been inserted. The update to row 17 is still reported as an
insert because, for the purposes of synchronization, row 17 must be inserted.
If version number 1 is passed to ChangeTable, then the result should show only change version 2:
SELECT *
FROM ChangeTable
(Changes HumanResources.Department, 1) as CT;
Result (formatted to include the SYS_CHANGE_COLUMNS data):
SYS_CHANGE_VERSION  SYS_CHANGE_CREATION_VERSION  SYS_CHANGE_OPERATION  SYS_CHANGE_COLUMNS  SYS_CHANGE_CONTEXT  DepartmentID
------------------  ---------------------------  --------------------  ------------------  ------------------  ------------
This time row 17 shows up as an update, because when version 2 occurred, row 17 already existed,
and version 2 updated the row. A synchronization based on changes made since version 1 would need
to update row 17.
Note that as a table-valued function, ChangeTable must have an alias.
Synchronizing requires joining with the source table. The following query reports the changed rows
from HumanResources.Department since version 1. The left outer join is necessary to pick up any
deleted rows which, by definition, no longer exist in the source table and would therefore be missed by
an inner join:
SELECT CT.SYS_CHANGE_VERSION as Version,
    CT.DepartmentID, CT.SYS_CHANGE_OPERATION as Op,
    d.Name, d.GroupName
  FROM ChangeTable (Changes HumanResources.Department, 1) as CT
    LEFT OUTER JOIN HumanResources.Department d
      ON d.DepartmentID = CT.DepartmentID
  ORDER BY CT.SYS_CHANGE_VERSION;
Result:
Version  DepartmentID  Op  Name  GroupName
-------  ------------  --  ----  ---------
As expected, the result shows row 17 being updated, so there’s no data other than the primary key
returned by the ChangeTable data source. The join pulls in the data from
HumanResources.Department.
Coding a synchronization
Knowing which rows have been changed means that it should be easy to merge those changes into a
synchronization table. The trick is synchronizing a set of data while changes are still being made at the
source, without locking the source.
Assuming the previous synchronization was at version 20, and the current version is 60, then 20
is passed to ChangeTable. But what becomes the new current version? The current version just
before the ChangeTable is queried and the data is merged? What if more changes occur during the
synchronization?
The new SQL Server 2008 MERGE command would seem to be the perfect solution. It does support
the output clause. If the version is stored in the synchronization target table, then the output clause’s
inserted table can return the insert and update operation new versions, and the max() versions can be
determined. But deletion operations return only the deleted virtual table, which would return the version
number of the last change made to the deleted row, and not the version number of the deletion event.
The solution is to capture all the ChangeTable data to a temp table, determine the max version
number for that synchronization set, store that version number, and then perform the synchronization merge.
As much as I hate temp tables, it’s the only clean solution.
The following script sets up a synchronization from HumanResources.Department to
HRDeptSynch. Synchronization typically occurs from one device to another, or one database to
another. Here, AdventureWorks2008 is the source database, and tempdb will serve as the target
database. Assume the tempdb.dbo.HRDeptSynch table was last synchronized before any changes
were made to AdventureWorks2008.HumanResources.Department in this chapter. By including
the database name in the code, there’s no need to issue a USE command to switch databases:
-- create synch master version table
CREATE TABLE Tempdb.dbo.SynchMaster (
  TableName SYSNAME,
  LastSynchVersion INT,
  SynchDateTime DATETIME
  )
-- initialize for HRDeptSynch
INSERT Tempdb.dbo.SynchMaster (TableName, LastSynchVersion)
  VALUES ('HRDeptSynch', 0)
-- create target table
CREATE TABLE Tempdb.dbo.HRDeptSynch (
  DepartmentID SmallINT,
  Name NVARCHAR(50),
  GroupName NVARCHAR(50),
  Version INT
  )
-- Populate Synch table with baseline original data
INSERT Tempdb.dbo.HRDeptSynch (DepartmentID, Name, GroupName)
  SELECT DepartmentID, Name, GroupName
    FROM HumanResources.Department;
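Another good idea in this process is to check
  Change_tracking_min_valid_version
    (Object_id(N'HumanResources.Department')) as 'oldest'
to verify that the synchronization won’t miss cleaned-up data. If the last synchronized version is
lower than this value, then some changes have already been cleaned up.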
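That check might be sketched as follows. This is an illustrative guard rather than part of the original script: it assumes it runs in the AdventureWorks2008 database, that the Tempdb.dbo.SynchMaster table from the setup script exists, and the variable names and messages are made up for the example.

```sql
-- Sketch: refuse an incremental synch when tracked changes may have been cleaned up.
DECLARE @LastSynchVersion INT,
        @OldestValid BIGINT;

SELECT @LastSynchVersion = LastSynchVersion
  FROM Tempdb.dbo.SynchMaster
  WHERE TableName = 'HRDeptSynch';

-- returns NULL when the table is not enabled for Change Tracking
SET @OldestValid = CHANGE_TRACKING_MIN_VALID_VERSION(
      OBJECT_ID(N'HumanResources.Department'));

IF @OldestValid IS NULL
  BEGIN
    RAISERROR('Change Tracking is not enabled for the source table.', 16, 1);
  END
ELSE IF @LastSynchVersion < @OldestValid
  BEGIN
    -- changes between the two versions are gone; a full reload is needed
    RAISERROR('Last synch version is older than the oldest valid version.', 16, 1);
  END
ELSE
  BEGIN
    PRINT 'Incremental synchronization is safe.';
  END;
```

When the second error is raised, the usual recovery is to reload the target from the source and reset the stored version.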
The following stored procedure uses Change Tracking, a synch master table, a temp table, and the
new SQL Server MERGE command to synchronize any changes in the source table
(HumanResources.Department) into the target table (Tempdb.dbo.HRDeptSynch):
USE AdventureWorks2008;
GO
CREATE PROC pHRDeptSynch
AS
SET NoCount ON;
DECLARE
  @LastSynchMaster INT,
  @ThisSynchMaster INT;
CREATE TABLE #HRDeptSynch (
  Version INT,
  Op CHAR(1),
  DepartmentID SmallINT,
  Name NVARCHAR(50),
  GroupName NVARCHAR(50)
  );
-- version of the previous synchronization
SELECT @LastSynchMaster = LastSynchVersion
  FROM Tempdb.dbo.SynchMaster
  WHERE TableName = 'HRDeptSynch';
-- capture every change made since the previous synchronization
INSERT #HRDeptSynch (Version, Op, DepartmentID, Name, GroupName)
  SELECT CT.SYS_CHANGE_VERSION as Version,
      CT.SYS_CHANGE_OPERATION as Op,
      CT.DepartmentID, d.Name, d.GroupName
    FROM ChangeTable
        (Changes HumanResources.Department, @LastSynchMaster) as CT
      LEFT OUTER JOIN HumanResources.Department d
        ON d.DepartmentID = CT.DepartmentID
    ORDER BY CT.SYS_CHANGE_OPERATION;
-- highest version in this set; keep the previous version if nothing changed
SELECT @ThisSynchMaster = COALESCE(Max(Version), @LastSynchMaster)
  FROM #HRDeptSynch;
-- apply the captured changes to the target
MERGE INTO Tempdb.dbo.HRDeptSynch as Target
  USING
    (SELECT Version, Op, DepartmentID, Name, GroupName
       FROM #HRDeptSynch)
    AS Source (Version, Op, DepartmentID, Name, GroupName)
  ON Target.DepartmentID = Source.DepartmentID
  WHEN NOT MATCHED AND Source.Op = 'I'
    THEN INSERT (DepartmentID, Name, GroupName)
      VALUES (Source.DepartmentID, Source.Name, Source.GroupName)
  WHEN MATCHED AND Source.Op = 'U'
    THEN UPDATE
      SET Name = Source.Name, GroupName = Source.GroupName
  WHEN MATCHED AND Source.Op = 'D'
    THEN DELETE;
-- store the version for the next synchronization
UPDATE Tempdb.dbo.SynchMaster
  SET LastSynchVersion = @ThisSynchMaster,
      SynchDateTime = GETDATE()
  WHERE TableName = 'HRDeptSynch';
Go
To put the stored procedure through its paces, the following script makes several modifications to the
source table and calls pHRDeptSynch:
INSERT HumanResources.Department (Name, GroupName)
  VALUES ('Row Three', 'Data Rocks!'),
         ('Row Four', 'SQL Rocks!');
UPDATE HumanResources.Department
  SET GroupName = 'SQL Server 2008 Bible'
  WHERE Name = 'Test Two';
EXEC pHRDeptSynch;
DELETE FROM HumanResources.Department
  WHERE Name = 'Row Four';
EXEC pHRDeptSynch;
EXEC pHRDeptSynch;
DELETE FROM HumanResources.Department
  WHERE Name = 'Test Two';
EXEC pHRDeptSynch;
To test the results, the next two queries search for out-of-synch conditions. The first query uses a
set-difference query with a FULL OUTER JOIN and two IS NULLs to find any mismatched rows on
either side of the join:
-- check for out-of-synch rows:
SELECT *
FROM HumanResources.Department Source
FULL OUTER JOIN tempdb.dbo.HRDeptSynch Target
ON Source.DepartmentID = Target.DepartmentID
WHERE Source.DepartmentID IS NULL
OR Target.DepartmentID IS NULL
There is no result set.
The second verification query simply joins the tables and compares the data columns in the WHERE
clause to return any rows with mismatched data:
-- Check for out-of-synch data
SELECT *
FROM HumanResources.Department Source
LEFT OUTER JOIN tempdb.dbo.HRDeptSynch Target
ON Source.DepartmentID = Target.DepartmentID
WHERE Source.Name != Target.Name
OR Source.GroupName != Target.GroupName
There is no result set.
Good. The Change Tracking and the synchronization stored procedure worked, and the stored
version number is absolutely the correct version number for the next synchronization.
To check the versions, the next two queries look at Change Tracking’s current version and the version
stored in SynchMaster:
SELECT Change_tracking_current_version();
Result:
6
SELECT * FROM tempdb.dbo.SynchMaster;
Result:
TableName    LastSynchVersion  SynchDateTime
-----------  ----------------  -----------------------
HRDeptSynch  6                 2009-01-16 18:00:42.643
Although lengthy, this exercise showed how to leverage Change Tracking and the new MERGE command
to build a complete synchronization system.
Change Tracking Options
It’s completely reasonable to use only the ChangeTable function to design a Change Tracking system,
but three advanced options are worth exploring.
Column tracking
If Change Tracking was enabled for the table with the track_columns_updated option on (it’s off
by default), then SQL Server stores which columns are updated in a bitmap that costs four bytes per
changed column (to store the column’s column_id). The CHANGE_TRACKING_IS_COLUMN_IN_MASK
function returns 1 (true) if the column was updated. It requires two parameters: the column’s
column_id and the bit-mapped column. The bit-mapped column that actually stored the data is the
SYS_CHANGE_COLUMNS column in the ChangeTable row. The following query demonstrates the
function, and the easiest way to pass in the column_id:
SELECT CT.SYS_CHANGE_VERSION,
    CT.DepartmentID, CT.SYS_CHANGE_OPERATION,
    d.Name, d.GroupName, d.ModifiedDate,
    CHANGE_TRACKING_IS_COLUMN_IN_MASK(
      ColumnProperty(
        Object_ID('HumanResources.Department'),
        'Name', 'ColumnID'),
      SYS_CHANGE_COLUMNS) as IsChanged_Name,
    CHANGE_TRACKING_IS_COLUMN_IN_MASK(
      ColumnProperty(
        Object_ID('HumanResources.Department'),
        'GroupName', 'ColumnID'),
      SYS_CHANGE_COLUMNS) as IsChanged_GroupName
  FROM ChangeTable (Changes HumanResources.Department, 1) as CT
    LEFT OUTER JOIN HumanResources.Department d
      ON d.DepartmentID = CT.DepartmentID;
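The same function can also be used as a filter. As a small sketch, assuming the table was enabled with track_columns_updated on, the following variation returns only the updated rows whose Name column was changed since version 1. The version number 1 simply follows the earlier examples.

```sql
-- Sketch: only the rows whose Name column was updated since version 1.
SELECT CT.DepartmentID, CT.SYS_CHANGE_VERSION, d.Name
  FROM ChangeTable (Changes HumanResources.Department, 1) as CT
    JOIN HumanResources.Department d
      ON d.DepartmentID = CT.DepartmentID
  -- SYS_CHANGE_COLUMNS is only populated for updates
  WHERE CT.SYS_CHANGE_OPERATION = 'U'
    AND CHANGE_TRACKING_IS_COLUMN_IN_MASK(
          ColumnProperty(
            Object_ID('HumanResources.Department'),
            'Name', 'ColumnID'),
          CT.SYS_CHANGE_COLUMNS) = 1;
```

An inner join is sufficient here because updated rows still exist in the source table.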
Determining latest version per row
The Change Tracking version is a database-wide version number, but it is possible to determine the
latest version for every row in a table, regardless of the last synchronization, using the ChangeTable’s
version option. The CROSS APPLY calls the table-valued function for every row in the
outer query:
SELECT d.DepartmentID, CT.SYS_CHANGE_VERSION
FROM HumanResources.Department d
CROSS APPLY ChangeTable
(Version HumanResources.Department, (DepartmentID),
(d.DepartmentID)) as CT ORDER BY d.DepartmentID;
Result (abbreviated):
DepartmentID Sys_Change_Version
------------  ------------------
To find the last synchronized version per row since a specific version, use ChangeTable with the
Changes option. In this example, row 17 was last updated with version 2, so requesting the most
recent versions since version 2 returns a NULL for row 17:
SELECT d.DepartmentID, CT.SYS_CHANGE_VERSION
FROM HumanResources.Department d
LEFT OUTER JOIN
ChangeTable (Changes HumanResources.Department, 2) as CT
ON d.DepartmentID = CT.DepartmentID
ORDER BY d.DepartmentID;
Result (abbreviated):
DepartmentID Sys_Change_Version
------------  ------------------
Capturing application context
It’s possible to pass information about the DML’s context to Change Tracking. Typically the context
could be the username, application, or workstation name. The context is passed as a varbinary data
type. Adding context to Change Tracking opens the door for Change Tracking to be used to gather
OLTP audit trail data.
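As a hedged illustration of the idea, the WITH CHANGE_TRACKING_CONTEXT clause attaches a varbinary(128) value to a DML statement, and the value is later returned in the SYS_CHANGE_CONTEXT column of ChangeTable. In this sketch the context string 'NightlyImport', the new GroupName value, and the starting version 0 are hypothetical choices made only for the example.

```sql
-- Sketch: tag an update with an application context, then read it back.
DECLARE @AppContext VARBINARY(128);
SET @AppContext = CAST('NightlyImport' AS VARBINARY(128));

-- the statement before WITH must be terminated with a semicolon
WITH CHANGE_TRACKING_CONTEXT (@AppContext)
UPDATE HumanResources.Department
  SET GroupName = 'Context Test'
  WHERE Name = 'Changed Name';

-- convert the stored varbinary context back to readable text
SELECT CT.DepartmentID, CT.SYS_CHANGE_OPERATION,
       CAST(CT.SYS_CHANGE_CONTEXT AS VARCHAR(128)) AS AppContext
  FROM ChangeTable (Changes HumanResources.Department, 0) as CT;
```

Rows changed without the clause return NULL for the context, so a convention for the context value needs to be applied consistently by every writer for it to be useful as an audit trail.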