SQL Server 2008 are the Change Tracking and Change Data Capture features which, as their names imply, automatically track which rows have been changed, making selecting from the source database much easier.

Now that we’ve looked at an incremental load using T-SQL, let’s consider how SQL Server Integration Services can accomplish the same task without all the hand-coding.

Incremental loads in SSIS
SQL Server Integration Services (SSIS) is Microsoft’s application bundled with SQL Server that simplifies data integration and transformations, and in this case, incremental loads. For this example, we’ll use SSIS to execute the lookup transformation (for the join functionality) combined with the conditional split (for the WHERE clause conditions) transformations.

Before we begin, let’s reset our database tables to their original state using the T-SQL code in listing 8.
    USE SSISIncrementalLoad_Source
    GO

    TRUNCATE TABLE dbo.tblSource

    -- Insert an unchanged row
    INSERT INTO dbo.tblSource (ColID, ColA, ColB, ColC)
    VALUES (0, 'A', '1/1/2007 12:01 AM', -1)

    -- Insert a changed row
    INSERT INTO dbo.tblSource (ColID, ColA, ColB, ColC)
    VALUES (1, 'B', '1/1/2007 12:02 AM', -2)

    -- Insert a new row
    INSERT INTO dbo.tblSource (ColID, ColA, ColB, ColC)
    VALUES (2, 'N', '1/1/2007 12:03 AM', -3)

    USE SSISIncrementalLoad_Dest
    GO

    TRUNCATE TABLE dbo.tblDest

    -- Insert an unchanged row
    INSERT INTO dbo.tblDest (ColID, ColA, ColB, ColC)
    VALUES (0, 'A', '1/1/2007 12:01 AM', -1)

    -- Insert a changed row
    INSERT INTO dbo.tblDest (ColID, ColA, ColB, ColC)
    VALUES (1, 'C', '1/1/2007 12:02 AM', -2)

With the tables back in their original state, we’ll create a new project using Business Intelligence Development Studio (BIDS).
Creating the new BIDS project
To follow along with this example, first open BIDS and create a new project. We’ll name the project SSISIncrementalLoad, as shown in figure 1. Once the project loads, open Solution Explorer, right-click the package, and rename Package1.dtsx to SSISIncrementalLoad.dtsx.
Listing 8 Resetting the tables
Callouts, in listing order: insert unchanged row, insert changed row, insert new row (dbo.tblSource); insert unchanged row, insert changed row (dbo.tblDest).
When prompted to rename the package object, click the Yes button. From here, follow this straightforward series:
1 From the toolbox, drag a data flow onto the Control Flow canvas. Double-click the data flow task to edit it.
2 From the toolbox, drag and drop an OLE DB source onto the Data Flow canvas. Double-click the OLE DB Source connection adapter to edit it.
3 Click the New button beside the OLE DB Connection Manager drop-down. Click the New button here to create a new data connection. Enter or select your server name. Connect to the SSISIncrementalLoad_Source database you created earlier. Click the OK button to return to the Connection Manager configuration dialog box.
4 Click the OK button to accept your newly created data connection as the connection manager you want to define. Select dbo.tblSource from the Table drop-down.
5 Click the OK button to complete defining the OLE DB source adapter.
Defining the lookup transformation
Now that the source adapter is defined, let’s move on to the lookup transformation that’ll join the data from our two tables. Again, there’s a standard series of steps in SSIS:
1 Drag and drop a lookup transformation from the toolbox onto the Data Flow canvas.
2 Connect the OLE DB connection adapter to the lookup transformation by clicking on the OLE DB Source, dragging the green arrow over the lookup, and dropping it.
3 Right-click the lookup transformation and click Edit (or double-click the lookup transformation) to edit. You should now see something like the example shown in figure 2.

Figure 1 Creating a new BIDS project named SSISIncrementalLoad
When the editor opens, click the New button beside the OLE DB Connection Manager drop-down (as you did earlier for the OLE DB source adapter). Define a new data connection, this time to the SSISIncrementalLoad_Dest database. After setting up the new data connection and connection manager, configure the lookup transformation to connect to dbo.tblDest. Click the Columns tab. On the left side are the columns currently in the SSIS data flow pipeline (from SSISIncrementalLoad_Source.dbo.tblSource). On the right side are columns available from the lookup destination you just configured (from SSISIncrementalLoad_Dest.dbo.tblDest).

We’ll need all the columns returned from the destination table, so check all the check boxes beside the columns in the destination. We need these columns for our WHERE clauses and our JOIN ON clauses.

We don’t want to map all the columns between the source and destination, only the columns named ColID between the database tables. The mappings drawn between the Available Input columns and Available Lookup columns define the JOIN ON clause. Multi-select the mappings between ColA, ColB, and ColC by clicking on them while holding the Ctrl key. Right-click any of them and click Delete Selected Mappings to delete these columns from our JOIN ON clause, as shown in figure 3.
Figure 2 Using SSIS to edit the lookup transformation
Add the text Dest_ to each column’s output alias. These columns are being appended to the data flow pipeline, and the prefix lets us distinguish between source and destination columns farther down the pipeline.

Setting the lookup transformation behavior
Next we need to modify our lookup transformation behavior. By default, the lookup operates similar to an INNER JOIN, but we need a LEFT (OUTER) JOIN. Click the Configure Error Output button to open the Configure Error Output screen. On the Lookup Output row, change the Error column from Fail Component to Ignore Failure. This tells the lookup transformation that if it doesn’t find an INNER JOIN match in the destination table for the source table’s ColID value, it shouldn’t fail. This also effectively tells the lookup to behave like a LEFT JOIN instead of an INNER JOIN. Click OK to complete the lookup transformation configuration.
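To make the INNER JOIN versus LEFT JOIN distinction concrete, here is a minimal sketch that replays the listing 8 data in Python’s built-in sqlite3 module. It is an analogy only: SQLite stands in for SQL Server, both tables live in one in-memory database rather than two, and the query text is our own illustration, not something SSIS generates.

```python
import sqlite3

# In-memory stand-ins for the chapter's source and destination tables
# (assumption: one SQLite database instead of two SQL Server databases).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblSource (ColID INTEGER, ColA TEXT, ColB TEXT, ColC INTEGER);
    CREATE TABLE tblDest   (ColID INTEGER, ColA TEXT, ColB TEXT, ColC INTEGER);
    INSERT INTO tblSource VALUES (0, 'A', '1/1/2007 12:01 AM', -1);
    INSERT INTO tblSource VALUES (1, 'B', '1/1/2007 12:02 AM', -2);
    INSERT INTO tblSource VALUES (2, 'N', '1/1/2007 12:03 AM', -3);
    INSERT INTO tblDest   VALUES (0, 'A', '1/1/2007 12:01 AM', -1);
    INSERT INTO tblDest   VALUES (1, 'C', '1/1/2007 12:02 AM', -2);
""")

# Join on ColID only, and alias the destination columns with the Dest_ prefix.
JOIN_TEMPLATE = """
    SELECT s.ColID, s.ColA, d.ColID AS Dest_ColID, d.ColA AS Dest_ColA
    FROM tblSource AS s
    {join_type} JOIN tblDest AS d ON d.ColID = s.ColID
    ORDER BY s.ColID
"""

# INNER JOIN: only source rows with a destination match are returned.
inner_rows = conn.execute(JOIN_TEMPLATE.format(join_type="INNER")).fetchall()

# LEFT JOIN: unmatched source rows are kept, with NULL (None) Dest_ columns.
left_rows = conn.execute(JOIN_TEMPLATE.format(join_type="LEFT")).fetchall()

for row in left_rows:
    print(row)
```

The row with ColID 2 appears only in the LEFT JOIN result, with empty Dest_ columns. That is the row the conditional split can later recognize as new.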
From the toolbox, drag and drop a conditional split transformation onto the Data Flow canvas. Connect the lookup to the conditional split as shown in figure 4. Right-click the conditional split and click Edit to open the Conditional Split Transformation Editor. The Editor is divided into three sections. The upper-left section contains a list of available variables and columns. The upper-right section contains a list of available operations you may perform on values in the conditional expression. The lower section contains a list of the outputs you can define using SSIS Expression Language.

Figure 3 Using the Lookup Transformation Editor to establish the correct mappings
Expand the NULL Functions folder in the upper-right section of the Conditional Split Transformation Editor, and expand the Columns folder in the upper-left section. Click in the Output Name column and enter New Rows as the name of the first output. From the NULL Functions folder, drag and drop the ISNULL( <<expression>> ) function to the Condition column of the New Rows condition. Next, drag Dest_ColID from the Columns folder and drop it onto the <<expression>> text in the Condition column. New rows should now be defined by the condition ISNULL( [Dest_ColID] ). This defines the WHERE clause for new rows, setting it to WHERE Dest_ColID IS NULL.

Type Changed Rows into a second output name column. Add the expression (ColA != Dest_ColA) || (ColB != Dest_ColB) || (ColC != Dest_ColC) to the Condition column for the Changed Rows output. This defines our WHERE clause for detecting changed rows, setting it to WHERE ((Dest_ColA != ColA) OR (Dest_ColB != ColB) OR (Dest_ColC != ColC)). Note that || is the expression for OR in SSIS expressions. Change the default output name from Conditional Split Default Output to Unchanged Rows.
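The two conditions and the default output can be sketched in plain Python. This is only a rough model of the logic, not SSIS Expression Language: the classify function and the joined_rows list are hypothetical names, Python’s None stands in for a database NULL, and the sample rows are the listing 8 data as they might look after the lookup.

```python
def classify(row):
    """Return the conditional split output name for one joined row."""
    # ISNULL( [Dest_ColID] ): the lookup found no match, so the row is new.
    if row["Dest_ColID"] is None:
        return "New Rows"
    # (ColA != Dest_ColA) || (ColB != Dest_ColB) || (ColC != Dest_ColC)
    if (row["ColA"] != row["Dest_ColA"]
            or row["ColB"] != row["Dest_ColB"]
            or row["ColC"] != row["Dest_ColC"]):
        return "Changed Rows"
    # Default output, renamed from Conditional Split Default Output.
    return "Unchanged Rows"


# Hypothetical rows as they might leave the lookup for the listing 8 data.
joined_rows = [
    {"ColID": 0, "ColA": "A", "ColB": "1/1/2007 12:01 AM", "ColC": -1,
     "Dest_ColID": 0, "Dest_ColA": "A", "Dest_ColB": "1/1/2007 12:01 AM", "Dest_ColC": -1},
    {"ColID": 1, "ColA": "B", "ColB": "1/1/2007 12:02 AM", "ColC": -2,
     "Dest_ColID": 1, "Dest_ColA": "C", "Dest_ColB": "1/1/2007 12:02 AM", "Dest_ColC": -2},
    {"ColID": 2, "ColA": "N", "ColB": "1/1/2007 12:03 AM", "ColC": -3,
     "Dest_ColID": None, "Dest_ColA": None, "Dest_ColB": None, "Dest_ColC": None},
]

# Map each ColID to the output its row would take.
outputs = {row["ColID"]: classify(row) for row in joined_rows}
print(outputs)
```

With this data, one row lands in each output, which matches what the package run reports later in the chapter.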
It’s important to note here that the data flow task acts on rows. It can be used to manipulate (transform, create, or delete) data in columns in a row, but the sources, destinations, and transformations in the data flow task act on rows.

In a conditional split transformation, rows are sent to the output when the SSIS Expression Language condition for that output evaluates as true. A conditional split transformation behaves like a Switch statement in C# or Select Case in Visual Basic, in that the rows are sent to the first output for which the condition evaluates as true. This means that if two or more conditions are true for a given row, the row will be sent to the first output in the list for which the condition is true, and that the row will never be checked to see whether it meets the second condition. Click the OK button to complete configuration of the conditional split transformation.
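The first-match behavior described above can be illustrated with a small Python router. This is a sketch under stated assumptions: conditional_split, is_new, and is_changed are hypothetical names, only ColA is compared to keep the example short, and Python compares None differently from the way SSIS treats NULL, so treat it as a model of the ordering rule only.

```python
def conditional_split(rows, outputs, default_name):
    """Send each row to the first output whose condition is true."""
    routed = {name: [] for name, _ in outputs}
    routed[default_name] = []
    for row in rows:
        for name, condition in outputs:
            if condition(row):
                routed[name].append(row)
                break  # later conditions are never checked for this row
        else:
            # No condition matched: the row takes the default output.
            routed[default_name].append(row)
    return routed


def is_new(row):
    return row["Dest_ColID"] is None


def is_changed(row):
    return row["ColA"] != row["Dest_ColA"]


rows = [
    {"ColID": 0, "ColA": "A", "Dest_ColID": 0, "Dest_ColA": "A"},
    {"ColID": 1, "ColA": "B", "Dest_ColID": 1, "Dest_ColA": "C"},
    {"ColID": 2, "ColA": "N", "Dest_ColID": None, "Dest_ColA": None},
]

# The order used in the chapter: New Rows is tested before Changed Rows.
in_order = conditional_split(
    rows, [("New Rows", is_new), ("Changed Rows", is_changed)], "Unchanged Rows")

# Reversing the order changes where the unmatched row ends up.
reversed_order = conditional_split(
    rows, [("Changed Rows", is_changed), ("New Rows", is_new)], "Unchanged Rows")

print([r["ColID"] for r in in_order["New Rows"]])
print([r["ColID"] for r in reversed_order["Changed Rows"]])
```

In this model the unmatched row satisfies both conditions, so the order of the outputs decides its route. Listing New Rows first keeps it out of the update path.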
Drag and drop an OLE DB destination connection adapter and an OLE DB command transformation onto the Data Flow canvas. Click on the conditional split and connect it to the OLE DB destination. A dialog box will display prompting you to select a conditional split output (those outputs you defined in the last step). Select the New Rows output. Next connect the OLE DB command transformation to the conditional split’s Changed Rows output. Your Data Flow canvas should appear similar to the example in figure 4.
Configure the OLE DB destination by aiming at the SSISIncrementalLoad_Dest.dbo.tblDest table. Click the Mappings item in the list to the left. Make sure the ColID, ColA, ColB, and ColC source columns are mapped to their matching destination columns (aren’t you glad we prepended Dest_ to the destination columns?). Click the OK button to complete configuring the OLE DB destination connection adapter.

Double-click the OLE DB command to open the Advanced Editor for the OLE DB Command dialog box. Set the Connection Manager column to your SSISIncrementalLoad_Dest connection manager. Click on the Component Properties tab. Click the ellipsis (...) beside the SQLCommand property. The String Value Editor displays. Enter the following parameterized T-SQL statement into the String Value text box:
    UPDATE dbo.tblDest
    SET
       ColA = ?
      ,ColB = ?
      ,ColC = ?
    WHERE ColID = ?

The question marks in the previous parameterized T-SQL statement map by ordinal to columns named Param_0 through Param_3. Map them as shown here, effectively altering the UPDATE statement for each row:

    UPDATE SSISIncrementalLoad_Dest.dbo.tblDest
    SET
       ColA = SSISIncrementalLoad_Source.dbo.ColA
      ,ColB = SSISIncrementalLoad_Source.dbo.ColB
      ,ColC = SSISIncrementalLoad_Source.dbo.ColC
    WHERE ColID = SSISIncrementalLoad_Source.dbo.ColID
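Ordinal parameter binding is easy to try outside SSIS, because Python’s sqlite3 module also uses positional question-mark placeholders. The sketch below is an analogy, not the OLE DB command itself: SQLite stands in for SQL Server, and changed_rows is a hypothetical stand-in for the rows arriving on the Changed Rows output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblDest (ColID INTEGER, ColA TEXT, ColB TEXT, ColC INTEGER)")
conn.execute("INSERT INTO tblDest VALUES (0, 'A', '1/1/2007 12:01 AM', -1)")
conn.execute("INSERT INTO tblDest VALUES (1, 'C', '1/1/2007 12:02 AM', -2)")

UPDATE_SQL = "UPDATE tblDest SET ColA = ?, ColB = ?, ColC = ? WHERE ColID = ?"

# Hypothetical rows on the Changed Rows output, carrying the source values.
changed_rows = [
    {"ColID": 1, "ColA": "B", "ColB": "1/1/2007 12:02 AM", "ColC": -2},
]

for row in changed_rows:
    # Placeholders bind by position, so the tuple order must follow the
    # statement: ColA, ColB, ColC, then ColID (Param_0 through Param_3).
    params = (row["ColA"], row["ColB"], row["ColC"], row["ColID"])
    conn.execute(UPDATE_SQL, params)  # one statement executed per row

updated = conn.execute("SELECT ColA FROM tblDest WHERE ColID = 1").fetchone()
untouched = conn.execute("SELECT ColA FROM tblDest WHERE ColID = 0").fetchone()
print(updated, untouched)
```

Because the binding is positional, listing the values in the wrong order would silently update the wrong columns, which is why the parameter mapping step deserves a careful check.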
As you can see in figure 5, the query is executed on a row-by-row basis. For performance with large amounts of data, you’ll want to employ set-based updates instead.

Figure 4 The Data Flow canvas shows a graphical view of the transformation.

Click the OK button when mapping is completed. If you execute the package with debugging (press F5), the package should succeed.
Note that one row takes the New Rows output from the conditional split, and one row takes the Changed Rows output from the conditional split transformation. Although not visible, the remaining source row is unchanged and would be sent to the Unchanged Rows output, which is the default Conditional Split output renamed. Any row that doesn’t meet any of the predefined conditions in the conditional split is sent to the default output.
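The earlier remark about set-based updates can be sketched as follows. One common approach, assumed here and not shown in the chapter, is to land the changed rows in a staging table and then apply them with a single UPDATE. The staging table name stgUpdates is hypothetical, and SQLite again stands in for SQL Server. T-SQL would more typically express the same idea with UPDATE ... FROM and a join; the correlated subqueries below are used only so the sketch runs on any SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblDest    (ColID INTEGER PRIMARY KEY, ColA TEXT, ColB TEXT, ColC INTEGER);
    CREATE TABLE stgUpdates (ColID INTEGER PRIMARY KEY, ColA TEXT, ColB TEXT, ColC INTEGER);
    INSERT INTO tblDest VALUES (0, 'A', '1/1/2007 12:01 AM', -1);
    INSERT INTO tblDest VALUES (1, 'C', '1/1/2007 12:02 AM', -2);
""")

# Step 1: bulk-load the changed rows into the staging table.
changed_rows = [(1, "B", "1/1/2007 12:02 AM", -2)]
conn.executemany("INSERT INTO stgUpdates VALUES (?, ?, ?, ?)", changed_rows)

# Step 2: apply every staged change with one statement instead of one per row.
cursor = conn.execute("""
    UPDATE tblDest
    SET ColA = (SELECT s.ColA FROM stgUpdates AS s WHERE s.ColID = tblDest.ColID),
        ColB = (SELECT s.ColB FROM stgUpdates AS s WHERE s.ColID = tblDest.ColID),
        ColC = (SELECT s.ColC FROM stgUpdates AS s WHERE s.ColID = tblDest.ColID)
    WHERE ColID IN (SELECT ColID FROM stgUpdates)
""")
rows_updated = cursor.rowcount

final_values = conn.execute("SELECT ColID, ColA FROM tblDest ORDER BY ColID").fetchall()
print(rows_updated, final_values)
```

The design trade-off is an extra table and an extra step in exchange for a single statement, however many rows have changed.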
Summary
The incremental load design pattern is a powerful way to leverage the strengths of the SSIS 2005 data flow task to transport data from a source to a destination. By using this method, you only insert or update rows that are new or have changed.
Figure 5 The Advanced Editor shows a representation of the data flow prior to execution.
About the author

Andy Leonard is an architect with Unisys corporation, SQL Server database and integration services developer, SQL Server MVP, PASS regional mentor (Southeast US), and engineer. He’s a coauthor of several books on SQL Server topics. Andy founded and manages VSTeamSystemCentral.com and maintains several blogs there (Applied Team System, Applied Database Development, and Applied Business Intelligence) and also blogs for SQLBlog.com. Andy’s background includes web application architecture and development, VB, and ASP; SQL Server Integration Services (SSIS); data warehouse development using SQL Server 2000, 2005, and 2008; and test-driven database development.
ACLs See access control lists
ACS See Audit Collection
Ser-viceaction groups 673
AUDIT_CHANGE_GROUP673
DBCC_GROUP 673SCHEMA_OBJECT_CHANGE_GROUP 673Action on Audit Failure 365
Action property 355
actions 621–623, 625, 673
binding 626T-SQL stack 625types 625Active Directory 345, 498, 500,
517configuring 503domains 500–501, 503Domains and Trust 502–503forest 501, 503
requirements 503, 508trees 501
User and Computers 504Active Domain authentication
credentials 270
active queries 390ActiveSync-connected device 301
ad hoc full backup 447
ad hoc queries 399, 598, 600largest 598
ad hoc reports 687
ad hoc SQL 7, 217, 554
ad hoc workloads 452AddDevice method 355administrative
considerations 258ADO.NET 213, 227, 262, 264–267, 273, 299, 307,
346, 351ADO.NET 2.0 259, 263, 274ADO.NET 3.5 210, 226, 268ADO NET 3.5 SP1 210code 210, 228
connection manager 715conversions 227
data providers 302data table 208factory-based objects 228factory-method-based objects 227objects 226OLE DB data provider 663provider 303
provider model 210SqlClient data provider 272ADO.NET Entity
Framework 210, 216, 219Advanced Schedule
Options 468
AdventureWorks 30, 47, 87,
111, 178, 182, 185, 188, 541–542, 547, 585–586SQL Server 2005 version 541AdventureWorks2008 database 111–112, 177,
189, 201, 376, 646AdventureWorksDW database 235AdventureWorksDW2008 691AdventureWorksLT2008 database 225, 228, 230ADW
aggregations 705designing aggregations 702query patterns 701
Specify Object Counts 703AES 298
AFTER triggers 20, 558, 561Age in Cache column 593Age of Empires 269Agent jobs 330–331, 338, 342Agent schedule 331
Aggregate component
See asynchronous
compo-nentsaggregate queries 601aggregated levels 700aggregation
candidates 702–703aggregation design 700, 702–703, 705, 707high-performance cubes 708
index
aggregation design (continued)
influencing 702
partition level 707
Aggregation Design
Wizard 701performance benefit 701
aggregations 247, 558, 639,
700Aggregation Manager 706
Analysis Services 701
cache 700
comparing with
indexes 701cost of designing 707
Amazon.com 634
AMO See Analysis
Manage-ment ObjectsAnalysis Management
Objects 353
Analysis Services 260, 636, 638,
645, 692, 700ADW 701aggregations 701attribute relationships 704candidate attributes for aggregation 702designing hierarchies 707fact table 708
hierarchy types 708leveraging aggregations 707query log 705
user-defined hierarchy 704Analysis Services 2005 703, 706
Aggregation Management tool 707
Analysis Services 2005 cubesaggregations 706
Analysis Services 2008 703aggregations 706core concept 704usage-based optimization 707Analysis Services ADW 707Analysis Services cubes 634,
637, 639Analysis Services engineUBO 706
Analysis Services UDM cubes 698
anonymous rowsets 216, 218anonymous types 213, 216, 218
ANSI SQL standards 217ranking functions 217windowing functions 217ANSI standard
specification 45windowing extensions 45ApexSQL Audit 672application architecture 276application code 25
application connection strings 461
See also connection strings
application data 359application developers 25application ecosystem 323application integration 322, 326
Application Log 673, 675Application Log Server Audit 374
application profile 4application server 222application testing 460application virtualization 519
ApplicationLog server audit 372
appname 553architects 6–7architectural considerations 258architecture 298archiving 322–323, 480ARM 298
array index 122arrays 221artificial intelligence 688ASCII format 662ASP 263, 642, 654, 658ASP applications 263ASP.NET 509, 512applications 263, 265, 267–268, 270–271, 273service 270
ASP.NET site, average tions used 274
connec-ASP-type application 263connection 263Assembly object 354associated entities 216Association Rules 690associative database tables 170associative entities 155, 163,
167, 169asynchronous 457audit 371bucketizing target 625components 745database mirroring 462file target 625
mode See high-performance
modeoperations 265, 267ring buffer target 625targets 629
Attribute Discrimination viewer 695
attribute relationships704–705, 708designing 705attribute-by-attribute basis 31attributes 153, 688
combinations of dependencies 40functional dependencies 28functionally dependent attributes 34
many-attribute relation 39audit action groups 366, 372–373, 376, 379audit actions 366audit application 498Audit Collection Service 370
audit events 365–366,
370–371, 374viewing 374audit files 371, 374
information 469audits 350, 673
authentication 269
credentials 270methods 384, 510protocol 498Auto Close 559
auto create statistics 564
management 478automatic redundancy 462
Automatically Detect Settings
option 506automation engine 344, 347
auto-parameterization
211–212Auto-Regression Trees 691
Auto-Regressive Integrated
Moving Average 691auxiliary table 283
average page density 617
average page space 404
native backup compression 453BACKUP DATABASE 384, 436
backup database 331–332, 338
backup device 339backup drives 451backup file name 356backup files 451, 453removal 331backup history 331Backup object 355backup routine 451backup script 465backup strategy 356backup type 338BackupDeviceItem 358BackupDevices 350BackupDirectory property 355backups 322, 326–327, 330, 355–356, 432, 450–451,
461, 518backup files 355backup sizes 432backup time 432compressing 432disaster recovery testing 358eliminating index data 432, 447
energy consumption 432energy-efficient 432filegroup backups 432frequency 467full backup 453, 454full database backups 433, 435
restoration 433, 447restore 357
routine backups 432BackupSetDescription property 355BackupSetName property 355bad indexes 601
base tables 213, 242, 424, 681baseline value 679
BASIC 7basic form 46concurrency 46exponential performance scaling 46
locking 46performance 46transaction log 46batch commands, replicating 490batch file 468creating 466location 464batch updates 485, 487, 746network latency 491batch-by-batch basis 219BatchCommitSize setting 486BatchCommitThreshold setting 486
batches 74, 214, 244, 488, 490, 576
batch size 247deletes 486DML 487variables 247batching 246BCP 662data 477files 235, 477process 480BCP partitioning 480benefits 480bcp utility 104command line 104command window 104error 104
query window 104xp_cmdshell 104before and after values 685benchmark results 544benchmarking tool 609Bertrand, Aaron 404best practices 323, 359
BI 3application 635developer 741functionality 645suite 698tools 645
See also business intelligence
suiteBI-based applications 328bi-directional link 262BIDS 693, 696, 698, 703, 713,
717, 754
See also Business Intelligence
Development StudioBIDS 2008, Attribute Relation-ships tab 705
BIDS project, creating 754
big bitmask 253
Big Four PerfMon
counters 577CPU 577
buffering 626buffers 383, 626BufferSizeTuning 749Build 3215 459built-in functions 201bulk copy methods 12bulk import 104bulk import file 104bulk insert 103, 109BULK INSERT 106, 108Bulk Insert Task 108Bulk Insert Task Editor 109bulk loading 685
bulk copying 477bulk-logged recovery 108, 480business cycle 394
business data 634, 641business domain 163Business Intelligencedevelopment toolkit 645business intelligence 322, 328, 633–634, 687, 709, 726applications 322difference with legacy OLTP applications 641
project 636reporting 633
RS reports 687specialist 328terminology 634traditional OLTP approach 640Business Intelligence Develop-ment Studio 109, 692,
701, 713, 754Business Intelligence Projects 645business intelligence solution 635back-end tools 640dimensional model 636front-end analytics 640overall strategy 636relational data warehouse store 640
subject areas 636tools 640business intelligence suite 687business logic 6
business problems 689Business Scorecard Manager 640
315, 321, 344, 346, 348, 758C++ 299
C2 Audit Mode 670C2 Auditing feature 673CAB files 300
CAB-based installations 298cache, multiple plans 211cache hit 624
Cache Hit Ratios 385cacheability difference 212cached stored procedures 593caching application 677calendar table 283calling program 80candidate key attributes 39Candidate Key profile 712canonical problem 222capacity planning 323capture host name 685capture instance table 682captured data 684cardinality 17, 19cascading operations 20, 21case 688
attributes 688table 690variables 688CASE statement 251catalog 177, 181, 199name 179offline 177tables 644views 232, 375, 382, 620Catalog Name 194catalogs 192–193advanced queries 194queries to retrieve 192catastrophic cluster 462catastrophic loss 525
CD 527CDC 685functions 682instance 682table 683
See also change data capture
Central Management Server 450certificate store 345Change Data Capture 378, 754
change data capture 402, 670, 681
function 683
AUTO 179, 181
data 685
DISABLE 181
enabling 677
MANUAL 179
mode 180, 192
NO POPULATION 179OFF 179
CHANGE_RETENTION 677
CHANGE_TRACKING
AUTO 179MANUAL 180CHANGE_TRACKING_
CURRENT_VERSION678
change_tracking_state_desc 1
92–193changed data 681
updating 753changed rows, isolating 753
Changed Rows output 758,
760CHANGETABLE 678
function 680char 176
checkpoints 594, 609–610,
613checkpoint I/O requests 613
synchronization providers 306, 308tables 306
client-side executable 263clock drift 518, 525clogged network 273CLR 23, 239, 244, 253, 327aggregate function 282code 235
executables 645language 244
See also Common Language
Runtimecluster failover 462cluster service 450clustered index 15, 213, 247,
407, 411, 421, 546, 583–585, 588clustered indexes 5, 445,
541, 600, 614correlation 583distribution statistics 588key 425
pages 584scan 571, 573, 586–587storage 437
query plan 587clustered instance 461clustered key lookups 544clustered remote
distributor 492clustered servers 275clustered tables 407, 411clustered unique index 16ClusteredIndexes
property 363Clustering 690clustering 324, 328clustering index 542, 544cmdlets 345, 348
SQL Server 2008 350COALESCE 112COBOL 7, 45Codd 10code errors 75code module 221–224, 226, 233
code modules 206CodePlex 225, 364, 376, 706coding practices 20coding-standards 89cold cache 583collation 201, 237collection objects 716Column Length Distribution profile 711, 714, 723column list 281
column mappings 756
Column Null Ratio profile
711, 714, 719, 723Column Pattern checking 723Column Pattern profile 711,
713, 715, 720, 722flattened structure 722hierarchical structure 722column statistics 336Column Statistics Only 336Column Statistics
profile 712Column Value Distribution profile 711
ColumnMappings 307columns, full-text indexed 193COM legacy 266COM-based ADO 262, 266command object 228comma-separated lists 243CommitBatchSize 493–495CommitBatchThreshold493–495
commodity hardware 461commodity level servers 456commodity-level hardware 449common code 288
state tax rates 288StateTaxRates table 288Common Language Runtime 235, 327, 401common table expressions 64,
146, 214, 240, 247queries 560common version store 558commutative 87
compatibility level 459compatibility mode 459compile errors 74compiled assemblies 244compiler 204
Component Properties 758composite foreign keys 5composite indexes 569, 574composite primary keys 5compressed backup 458computed columns 223COM-style registration 300concurrency 9, 298, 436issues 541
concurrent snapshots 480conditional split 757–758, 760Conditional Split Default Output 758
Conditional Split Editor757
Conditional Split output760
conditional split output 758
Conditional Split
transformation 723, 735Conditional Split Transforma-
tion Editor 757conditional split
transformations 754, 758configuration 758
releasing resources 275
Connection Pooling tab 663
connection pools, client 274
Connection Reset 268, 275
connection strategy 263, 268, 276
connect-query-disconnect263
just-in-time connection strategy 276
connection strings 268, 269,
270, 274, 455–456, 461,
643, 662failover partner 455modified connection string 455
Connection switch 716Connection Timeout 267connection-scoped server state 264
ConnectionString property
267, 646, 716, 737connectivity 263, 455connect-query-disconnect 263consonants 200, 206
Constant Scan operator 426, 429
constrained delegation 502,
504, 508ConstraintOptions property731
constraints 18, 44constraint violations 163pitfalls in finding 32container objects 232containers 726–727CONTAINS 181–184, 200, 202
contains function 126CONTAINS
INFLECTIONAL203
CONTAINSTABLE 183–184, 203
RANK column 184continuous variables 690–691control flow 726, 743
components 729, 734configuration 730logic 730
performance 743XML task 720Control Flow elements 743Controller method 261conventional disk drive 606CONVERT function 137–139Convert-UrnToPath 350copy 453, 459
Copy Column transformation744
SSIS variable 744
core database engine 299corporate datacenters 526corporate environment 526corporate policy 465correlated subqueries 23, 218correlation 583–584
determining 588high correlation 585, 588–589
low correlation 585, 587when to expect
correlation 588corruption problem 333COUNT_ROWS 386covered query 543covering indexes 213, 541–544, 545definition 213disadvantages 541modifications 545performance 213update performance 546CPU 383–384, 451, 480, 525, 576–577, 594
cores 563, 602cycles 262, 543resource availability 523resources 523, 524time 398
usage 523utilization 564, 592CPU pressure 591–593cached stored procedures 593DMV queries 592how to detect 592performance issue 592runnable tasks count 593CPU queries
expensive 594crash recovery 607CREATE AGGREGATE 245CREATE DATABASE 433Create Database Audit Specification 379CREATE FULLTEXT INDEX 178CREATE INDEX 435, 585DDL 399
WITH DROP_
EXISTING 438–439Create Server Audit Specification 373CREATE TABLE 404CreateNewTableOrFail 309Credentials 350
credentials 502
See also user credentials
CRISP-DM 689
See also Cross-Industry
Standard Process for Data Mining
Crivat, Bogdan 698
CRM 688
customers relationship management 688CROSS 86
APPLY 238JOIN 88cross-database references 21
cross-database referential
integrity 20Cross-Industry Standard Pro-
cess for Data Mining 689, 698
Cross-platform 298
CRUD 307
CSV 644, 653
format 662, 666CTEs 66, 71, 240, 247
See also common table
expressioncube
data 639design 636, 707designer 703loading 634partitions 640processing 639store 635cube-processing
overhead 701time 707cubes 638, 700, 703
aggregation 700analyzing disk space 641data store 639
data warehouse 639large cubes 707larger cubes 701smaller cubes 701T-SQL 638validating 707cumulative update 462
cumulative waits 592
current baseline version 678
current context node 126
current database 280
Current Disk Queue
Length 608current_tasks_count 603
current-next pairs 63
cursor-based code 45
cursors 3, 7–8, 20, 45, 65, 69,
558keyset-driven 560
overhead 65static-driven 560custom data arrays 654custom error messages 76custom error table 79custom keyboard shortcuts
277, 279, 281–282productivity 282custom log shipping 473custom logging 385custom objects 743custom profiles 490custom scripts 743custom shortcuts 282custom stoplist 191custom stored procedures 279custom sync objects 492custom thesaurus 186, 188custom update stored procedure 494customers relationship management 688
D
DAC 476remote access 476
See also Dedicated
Adminis-trator ConnectionDAC connection 476
DAI See data access interfaces
data access, regulations 670developers 265
mode 745providers 263, 265stacks 210technologies 455data access interfaces 262–263, 266Data Access Mode property736
data acquisition performance744
Copy Column transformation 744Row Count
transformation 744data adapter 307data architect 12data archiving 322, 421–422data attributes 30
data availability 421data backup strategy 526data cache server 671data caches 303, 676data centers 297migration 325
data collectionasynchronous mode 293frequency 293
synchronous mode 293data connection 755–756data containers 221data definition language 265, 421
data design 28data destination 743aggregate 745flat file 746multiple 745SQL Server destination 747data distribution 44
data domain 156data dumps 5, 642data element 156data encryption 326data exclusion 719data exploration 710data exports 171, 326data extraction 436data extraction tool 662data files 451–452data flow 722, 726, 741, 744, 748
Column Pattern profile 722component properties 734data acquisition
performance 744data acquisition rate 744destination loading 744destination updating 744lookups 744
performance 748source acquisition 744transforming data 744XML source 720–721Data Flow elements 743data flow performanceBLOB data 749BLOBTempStoragePath property 749
BufferSizeTuning event 749DefaultBufferMaxRows property 748DefaultBufferSize property 748data flow pipeline 737, 757data flow records 735data flow task 726, 729, 734,
738, 740, 743, 744, 758average baseline 744destinations 758executables 744execution trees 744
data flow task (continued)
data manipulation
language 146, 422Data Manipulation Language
commands 485data marts 633
See also data warehouse
statistics 688
Data Mining
Designer 693–694, 696–697
Data Mining Extensions
language 694data mining models 689, 697
data mining project 689
Data Mining Query task 697
Data Mining Query
transformation 697data mining techniques
directed algorithms 688
undirected algorithms 688
Data Mining Viewers 693Data Mining Wizard 692data model 30, 155, 157, 636data store 160
natural keys 155primary key constraints 155surrogate keys 155
unique constraints 155data modeler ratio 5data modelers 5, 28, 30domain experts 28data modeling 28–29, 36, 323missing attributes 36data objects 270data overview 689, 698pivot graphs 690pivot tables 690data pages 405, 407–411, 574, 584–585, 587
space used 405data patterns 709data pipeline 726data points 639data preparation 689, 698Data Processing
Extension 644data processing extension 644data profile 709
information 714results 715data profile output 715, 721, 723
filtering 721Data Profile Viewer 709, 714–715
XML 715data profiles 711–712, 715types 711
data profiling 709–710, 719tools 709
Data Profiling task 709, 711–714, 719ADO.NET connection manager 715applying XML 718connection managers 716constraints 715
Data Profiling Task Editor 716data quality 719data sources types 715dynamic 716
ETL flow 719script tasks 723SQL Server 2000 715transforming output with XSLT 721
XML output 720
Data Profiling Task Editor 713, 716, 718Data Profiling task output 721data protection 15, 26data providers 654data purging 322, 421–422data quality 12–13, 712decisions 719data queries 654data records 305data redirecting 20data redundancy 155, 157data retrieval, optimizing 215data sampling 716
data size 322Data Source 267Data Source key 269Data Source View Wizard 690, 692
data sources 644, 649, 651,
663, 709–710, 735, 743data storage 327
data store 293data structure allocations 439data synchronization 298, 305, 316
data transformation 745best practices 745Lookup transformation 747performance 745
single Aggregate transformation 745data trends 671data type casting 745data types 12, 24, 26, 156data validation 24, 636data visualization 650data warehouse 326, 328, 636,
638, 671, 687, 720administrators 633cube partitions 640design 635
field 751incremental loads 636initial loading 636modeling 635project 641quality 637relational 637tables 634data warehousing 322, 633, 635–636, 709
dimensional modeling 633methodologies 633relational design 633data-access layer 6database administration 633
database administrators 192,
355, 421–422, 633, 670database application
development 323database applications 9, 576
Database Audit Specification
object 365Database Audit
Specifications 379database audit
specifications 370, 376,
378, 673multiple 376database audits 378
database availability 449
database backups 331, 451, 453
database best practices 323
  performance 661
Database Engine Tuning Advisor 385
database failover 455
database files 458
  free space 334
database free space 421
database integrity 331
database level 376, 449
  audit 365
Database Mail 471
database metadata 217
database migration 458
database mirroring 324, 367, 401, 433, 449, 459, 559, 671
  advantages 449
  automatic failover 449
  best practices 450
  business constraints 459
  case study 459
  connection strings 455
  down time 457
  failover 456
  fast failover 449
  high-performance mode 454
  high-safety mode 454
  monitoring 454
  moving data seamlessly 459
  multiple mirrored databases 456
  non-traditional uses 462
  orphaned mirror 457
  page repair 451
  performance 455
  preparing the mirror 453
  prerequisites 450
  routine maintenance 456
  session 455, 459
  superhero solution 460
Database Mirroring Monitor 454–456, 461
  alerts 455
  database status 455
database mirroring session 455
  trial run 455
database mirrors 460
database model 219
database network traffic 214
database normalization 626
database objects 230, 280–282, 359, 439
  scripting 438
database operations 323
database optimization 219
database options 559
database patterns 29, 31
database performance 214, 325, 590
  problems 576
database programmer 213
database programming 545
database protection 24
database recovery model 356
database restores 452
  Windows Instant File Initialization 452
database schemas 7, 323, 327
database server 222, 229, 456, 497, 594
database snapshots 559, 671
database source name 664
database status 455
  alerts 455
  Database Mirroring Monitor 455
  Principal 455
  Suspended 455
  Synchronized 455
database transaction logging 609
database tuning techniques 594
database updates 178
database users 271
DATABASE_MIRRORING 455
DATABASEPROPERTY counter 292
DATABASEPROPERTY function 292
DATABASEPROPERTYEX function 292
Databases 350
data-binding 213
datacenter license 518
data-centric 726
DataContext 216
data-driven database 6
DataLoadOptions 216
data-quality checking 723
data-quality issue 723
DataReader object 264
DataSet object 221, 264, 649
datasets 297
DataTable 221, 231
  object 227, 654
data-tier 456
date boundaries 429
DATE data type 429
Date dimension 637
date format 289
date formatting function 289
DATE_CORRELATION_OPTIMIZATION 583
DATEADD function 63–65, 67, 284
  negative values 285
  past dates 285
DateCreated attribute 471
DATEDIFF function 63–65
dates result set 285
datetime data type 429
Davidson, Tom 382
DB2 745

  disaster recovery 323
  hardware configuration 323
  managing any applications 324
  managing test environments 324
  meetings 322
  specialist 328
DBA database architect 327
DBA database designer 327
DBA developers 327
DBA manager 327
DBA project manager 327
DBA report writer 328
DBA system administrators 327
DBCC CHECKDB 333, 341, 559–560
DBCC CHECKDB operations 558
DBCC CLEANTABLE 410–411
DBCC commands 381, 402, 410, 619
DBCC DBREINDEX 439
DBCC DROPCLEANBUFFERS 236, 612
DBCC INPUTBUFFER 566
DBCC PAGE 554, 555
DBCC SHOWCONTIG 386, 616
DBCC SHRINKDATABASE 334
DBCC SHRINKFILE 445, 562

  priority 550
  processes 552
  resources 552
  victim 550
deadlock graph 550–552, 554–556, 619
  graphical view 551
  Management Studio 551
  Profiler 551
  XML format 551
deadlocks 25, 541, 628
debugging information 490
decision support system 452
Decision Trees 691, 694
  model 696
declarative code 44
  development cost 44
  maintenance 44
declarative coding 43–44
  amount of data 45
  benefits 44
  drawbacks 44
  learning curve 44
  limitations 43
  performance 45
declarative constraints 20
declarative languages 218–219
  optimizing 219
declarative query 43
declarative referential integrity 481
declarative solutions 43
Decode-SqlName 350
decryption 564
Dedicated Administrator Connection 437, 439, 441
dedicated administrator connection 476
dedicated filegroup 442, 445
  disk space 445
dedicated SQL Server login 604
dedicated system 269, 643
default catalog 178–179
default database 267
default filegroup 360
default instance 257, 268–269
default partition 414
default table 433
default trace 619
DefaultBufferMaxRows 748–749
DefaultBufferSize 748–749
deferred analysis 680
deferred loading 216
DeferredLoadingEnabled property 216
defragmentation 330–331, 334
  indexes 335
degree of parallelism 383
Delaney, Kalen 26
DelegConfig 512
DELETE 242, 727
  triggers 23
DeleteCommand 307
deleted tables 21, 23
delimited string 221
delimiter, pipe symbol 287
demo database 238
denormalization 3
denormalized data 4, 21
DENSE_RANK function 64, 68
dependencies 277
  combinations 40
  See also functional dependencies
Dependency Network viewer 695
dependency tracking 387
dependent attributes 37–38
dependent column 712
dependent entities 160
dependent services 261
deployment 299
deprecated features 291
  benefits of tracking 291
deprecated functionality 291
Deprecation Announcement 293
  event class 291
deprecation events 294
Deprecation Final Support 293
  event class 291
deprecation policy 291
DEPRECATION_ANNOUNCEMENT 294
DEPRECATION_FINAL_SUPPORT 294
Derived Column transformation 723, 735
derived tables 63, 67, 92
derived variables 690
Description 307
Designer 332, 338, 341–342
desktop platforms 297
destination table 751, 756–757
destinations 726
  See also data destination
destructive loads 751
  advantage 751
determinant columns 712
Developer Edition
  CDC 681
developer-centric 316
development environment 456
development tools 727

  aggregations 701
dimension key 704
dimension level 707
dimension structures 640
dimension tables 634, 637, 747
dimensional designs 636–637
dimensional model 636, 638, 641
  Analysis Services 636
dimensional modeling 634, 637, 641
dimensions 634, 637–639
  Analysis Services 636
directed algorithms 688

156
  data-centric solution 150
disconnected data problem 305
discrete variables 690
disk access speed 607
disk contention 523
disk failures 355
disk I/O 217, 574, 576, 606
  activities 581, 608
  counters 606
  latency counters 608
  performance 520, 521, 523, 607
  performance counters 606, 607, 608, 609, 616, 618
  requests 610
  subsystem 609
  throughput 613, 617
Disk I/O latency 608
Disk I/O size 608
disk read throughput 615
disk speed 179
disk storage subsystem 613
disk usage 640
DISKPERF 578
dispatchers 623
dispatching 622
Display Term 197
DISTINCT 238, 244
distinct values 712
Distributed Management Objects 353
distributed transactions 685
Distribution Agent 486–490, 493–495
  command line 490
distribution agent 480–481, 485
Distribution Agent metrics 485
Distribution Agents 495
  tuning 492
distribution database 486–488, 492
distribution statistics 588
distributor 492
  hardware 485, 492
.DLL files 300
DLLs 354
DM model 687–688
DM techniques 688
DM See data mining
DMFs 196, 382, 402
  table-valued DMF 386
  See also Dynamic Management Functions
DML 214–215, 422, 450
  activities 377
  events 376
  operation 21
  queries 674
  See also data manipulation language
DMO
  scripting 362
  See also Distributed Management Objects
DMV categories 401
DMV measurements 592
DMV metrics 592
DMV queries 405–406, 408, 410–411, 590–591, 597, 600
  average worker time 594
  indexes 600
  performance problems 604
  security 590
  See also Dynamic Management View queries
DMV structure 382
DMVs 382–383, 390, 402, 404, 575
  missing index DMVs 399
  See also Dynamic Management Views
DMVStats CodePlex project 382
DMX language 697
  See also Data Mining Extensions language
DMX prediction query 697–698
DMX query 698
DNS 504, 668
  See also Windows Directory Name Service
DNS name 468
DNS service 268
Document Count 197
Document ID 198
document order 129
document types 195
document_type 195
documentation
  maintaining 324
domain 33, 235, 464
  account 464, 473
  expert 28–34, 36, 38–40
  group 269
  user 663
DOP 383
  See also degree of parallelism
dormant connection 274–275
Dorr, Bob 609
double insert 4
double update 4