Performance and Tuning Design Guidelines
We outline some of the major performance and tuning design guidelines here. There are, of course, many more, but if you at least consider and apply the ones outlined here, you should end up with a decently performing SQL Server implementation. As we have described previously, performance and tuning should first be “designed in” to your SQL Server implementation. Many of the guidelines discussed here can be adopted easily in this way. However, when you put off performance and tuning until later, you have fewer options to apply and see less performance improvement when you do make changes.
Remember, addressing performance and tuning is like peeling an onion. And, for this reason, we present our guidelines in that way—layer by layer. This approach helps provide you with a great reference point for each layer and a list you can check off as you develop your SQL Server–based implementation. Just ask yourself whether you have considered the specific layer guidelines when you are dealing with that layer. Also, several chapters take you through the full breadth and depth of options and techniques introduced in many of these guidelines. We point you to those chapters as we outline the guidelines.
Hardware and Operating System Guidelines
Let’s start with the salient hardware and operating system guidelines that you should be considering:
Hardware/Physical Server:
Server sizing/CPUs—Physical (or virtual) servers that will host a SQL Server instance should be roughly sized to handle the maximum processing load plus 35% more CPUs (and you should always round up). As an example, for a workload that you anticipate may be fully handled by a four-CPU server configuration, we recommend automatically increasing the number of CPUs to six. We also always leave at least one CPU for the operating system. So, if six CPUs are on the server, you should allocate only five to SQL Server to use. You can find details on configuring CPUs in Chapter 55, “Configuring, Tuning, and Optimizing SQL Server Options,” and details on monitoring CPU utilization in Chapter 39, “Monitoring SQL Server Performance.”
Memory—The amount of memory you might need is often directly related to the amount of data you need to be in the cache to achieve 100% or near-100% cache hit ratios. This, of course, yields higher overall performance. We don’t believe there is such a thing as too much memory for SQL Server, but we do recognize that some memory must be left to the operating system to handle OS-level processing, connections, and so on. So, in general, you should make 90% of memory available to SQL Server and 10% to the OS. You can find details on configuring memory in Chapter 49 and details on monitoring memory utilization in Chapter 39.
Disk/SAN/NAS/RAID—Your disk subsystem can be a major contributor to performance degradation if not handled properly. We recognize that there are many different options available here. We generally try to have some separate devices on different I/O channels so that disk I/O isolation techniques can be used. This means that you isolate heavy I/O away from other heavy I/O activity; otherwise, disk head contention causes massive slowdowns in physical I/O. When you use SAN/NAS storage, much of the storage is just logical drives that are heavily cached. This type of situation limits the opportunity to spread out heavy I/O, but the caching layers often alleviate that problem. In general, RAID 10 is great for high update activity, and RAID 5 is great for mostly read-only activity. You can find more information on RAID and storage options in Chapter 38, “Database Design and Performance.”
Operating System:
Page file location—When physical memory is exceeded, paging occurs to the page file. You need to make sure that the page file is not located on one of your database disk locations; otherwise, performance of the whole server degrades rapidly.
Processes’ priority—You should never lower the SQL Server processes in priority or set them to run in the background. You should always have them set as high as possible.
Memory—As mentioned earlier, you should make sure that at least 10% of memory is available to the OS for all its housekeeping, connection handling, process threads, and so on.
OS version—You should make sure you are using the most recent version of the operating system you can and have updated it with the latest patches or service packs. Also, often you must remove other software on your server, such as specialized virus protection. We have lost track of the number of SQL Server implementations we have found that had some third-party virus software installed (and enabled) on them, with all files and communication to the server being intercepted by the virus scans. Rely on Microsoft Windows and your firewalls for this protection rather than a third-party virus solution that gets in the way of SQL Server. If your organization requires some type of virus protection on the server, at least disable scanning of the database device files.
Network:
Packet sizes/traffic—With broader bandwidth and faster network adapters (typically at least 1Gbps now), we recommend you utilize larger packet sizes to accommodate your heavier-traffic SQL Server instances. Packets of 8KB and larger are easily handled now. Information on configuring the SQL Server packet size is available in Chapter 49.
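A larger packet size can be set through sp_configure; the sketch below raises it to 8KB. The specific value is illustrative, and you should verify the benefit against your own workload before changing a production instance:

```sql
-- Raise the network packet size (in bytes) for heavy-traffic instances;
-- requires the advanced options to be visible first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'network packet size', 8192;
RECONFIGURE;
```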
Routers/switches/balancers—Depending on whether you are using SQL clustering or have multitiered application servers, you likely should utilize some type of load balancing at the network level to spread out connections from the network and avoid bottlenecks.
SQL Server Instance Guidelines
Next comes the SQL Server instance itself and the critical items that must be considered:
SQL Server configuration—We do not list many of the SQL Server instance options here, but many of the default options are more than sufficient to deal with most SQL Server implementations. See Chapter 49 for information on all the available options.
SQL Server device allocations—Devices should be treated with care and not overallocated. SQL databases utilize files and devices as their underlying allocation from the operating system. You do not want dozens and dozens of smaller files or devices for each database. Having all these files or devices makes them harder to administer, move, and manipulate. We often come into a SQL Server implementation and simplify the device allocations before we do any other work on the database. At a minimum, you should create data devices and log devices so that you can easily isolate (separate) them.
tempdb database—Perhaps the most misunderstood SQL Server shared resource is tempdb. The general guideline for tempdb is to minimize explicit usage (overusage) of it by limiting temp table creation, sorts, queries using the DISTINCT clause, and so on. Otherwise, you are creating a hot spot in your SQL Server instance that is mostly not in your control. You might find it hard to believe, but indexing, table design, and even not executing certain SQL statements can have a huge impact on what gets done in tempdb and have a huge effect on performance. And, of course, you need to isolate tempdb away from all other databases. For additional information on placing and monitoring tempdb, see Chapters 38 and 39.
master database—There is one simple guideline here: protect the master database at all costs. This means frequent backups and isolation of master away from all other databases.
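The frequent-backup part of this guideline is a one-liner; the backup path below is illustrative:

```sql
-- Back up master regularly; the disk path is an assumption for illustration
BACKUP DATABASE master
TO DISK = 'E:\SQLBackups\master_full.bak'
WITH INIT, CHECKSUM;
```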
model database—It seems harmless enough, but all databases in SQL Server utilize the model database as their base allocation template. We recommend you tailor this for your particular environment.
Memory—The best way to utilize and allocate memory to SQL Server depends on a number of factors. One is how many other SQL Server instances are running on the same physical server. Another is what type of SQL Server–based application it is: heavy update versus heavy reads. And yet another is how much of your application has been written with stored procedures, triggers, and so on. In general, you want to give as much of the OS memory to SQL Server as you can. But this amount should never exceed 90% of the available memory at the OS level. You don’t want SQL Server or the OS to start thrashing via the page file or competing against each other for memory. Also, when more than one SQL Server instance is on the same physical server, you need to divide the memory correctly for each. Don’t pit them against each other. More information on configuring and monitoring SQL Server memory is available in Chapters 39 and 49.
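Dividing memory between instances is done by capping each one with the max server memory option. The sketch below assumes a 64GB server running two instances, with each capped so that together they leave roughly 10% for the OS; the values are illustrative only:

```sql
-- Run on each instance with a cap sized for that instance's workload
-- (28GB here; a second instance might get the rest, leaving ~10% for the OS)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 28672;
RECONFIGURE;
```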
Database-Level Guidelines
Database allocations—When more than one database is being managed by a single SQL Server instance, we like to use an approach of putting database files for heavily used databases on the same drives as lightly used databases. In other words, pair big with small, not big with big. This approach is termed reciprocal database pairing. You should also not have too many databases on a single SQL Server instance. If the server fails, so do all the applications that were using the databases managed by this one SQL Server instance. It’s all about risk mitigation. Remember the adage “never put all your eggs in one basket.”
Databases have two primary file allocations: one for their data portion and the other for their transaction log portion. You should always isolate these file allocations from each other onto separate disk subsystems with separate I/O channels if possible. The transaction log is a hot spot for highly volatile applications (those that have frequent update activity). Isolate, isolate, and isolate some more. There is also a notion of something called reciprocal database device location. More information is available on this issue in Chapters 38 and 39.
You need to size your database files large enough to avoid database file fragmentation. Heavily fragmented database files can lead to excessive file I/O within the operating system and poor I/O performance. For example, if you know your database is going to grow to 500GB, size your database files at 500GB from the start so that the operating system can allocate a contiguous 500GB file. In addition, be sure to disable the Auto-Shrink database option. Allowing your database files to continuously grow and shrink also leads to excessive file fragmentation as file space is allocated and deallocated in small chunks.
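The data/log isolation and pre-sizing guidance above can be sketched in a single CREATE DATABASE statement. The database name, drive letters, and sizes are illustrative:

```sql
-- Data and log pre-sized on separate drives (separate I/O channels assumed),
-- with Auto-Shrink disabled to avoid grow/shrink file fragmentation
CREATE DATABASE Sales
ON PRIMARY
    (NAME = Sales_data, FILENAME = 'E:\SQLData\Sales_data.mdf',
     SIZE = 500GB, FILEGROWTH = 10GB)
LOG ON
    (NAME = Sales_log, FILENAME = 'F:\SQLLogs\Sales_log.ldf',
     SIZE = 50GB, FILEGROWTH = 5GB);

ALTER DATABASE Sales SET AUTO_SHRINK OFF;
```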
Database backup/recovery/administration—You should create a database backup and recovery schedule that matches the database’s update volatility and recovery point objective. All too often a set schedule is used when, in fact, it is not the schedule that should drive how often you do backups or how fast you must recover from failure.
Table Design Guidelines
Table designs—Given the massively increased CPU, memory, and disk I/O speeds that now exist, you should use a general guideline to create as “normalized” a table design as is humanly possible. No longer is it necessary to massively denormalize for performance. Most normalized table designs are easily supported by SQL Server. Normalized table designs ensure that data has high integrity and low overall redundant data maintenance. See Dr. E. F. Codd’s original work on relational database design (The Relational Model for Database Management: Version 2, Addison-Wesley, 1990). Denormalize for performance as a last resort! For more information on normalization and denormalization techniques, see Chapter 38.
NOTE
Too often, we have seen attempts by developers and database designers to guess at the performance problems they expect to encounter, denormalizing the database design before any real performance testing has even been done. This, more often than not, results in an unnecessarily, and sometimes excessively, denormalized database design. Overly denormalized databases require creating additional code to maintain the denormalized data, and this often ends up creating more performance problems than it attempts to solve, not to mention the greater potential for data integrity issues when data is heavily denormalized. It is always best to start with as normalized a database as possible, and begin testing early in the development process with real data volumes to identify potential areas where denormalization may be necessary for performance reasons. Then, and only when absolutely necessary, you can begin to look at areas in your table design where denormalization may provide a performance benefit.
Data types—You must be consistent! In other words, you need to take the time to make sure you have the same data type definitions for columns that will be joined and/or come from the same data domain—int to int, and so on. Often, the use of user-defined data types goes a long way toward standardizing the underlying data types across tables and databases. This is a very strong method of ensuring consistency.
Defaults—Defaults can help greatly in providing valid data values in columns that are common or that have been specified as mandatory (NOT NULL). Defaults are tied to the column and are consistently applied, regardless of the application that touches the table.
Check constraints—Check constraints can also be useful if you need to have checks of data values as part of your table definition. Again, this is a consistency capability at the column level that guarantees that only correct data ends up in the column. Let us add a word of warning, though: you have to be aware of the insert and update errors that can occur in your application from invalid data values that don’t meet the check constraints.
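The default and check-constraint guidelines above can be combined in one table definition. The table and constraint names here are illustrative:

```sql
-- A column default and a check constraint enforced at the column level,
-- consistently applied no matter which application touches the table
CREATE TABLE dbo.Orders (
    OrderID   int IDENTITY PRIMARY KEY,
    OrderDate datetime NOT NULL DEFAULT (getdate()),
    Quantity  int NOT NULL CONSTRAINT CK_Orders_Qty CHECK (Quantity > 0)
);
-- An INSERT with Quantity = 0 fails the check constraint with an error
-- the application must be prepared to handle
```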
Triggers—Often, triggers are used to maintain denormalized data, custom audit logs, and referential integrity. Triggers are often used when you want certain behavior to occur when updates, inserts, and deletes occur, regardless of where they are initiated from. Triggers can result in cascading changes to related (dependent) tables or failures to perform modifications because of restrictions. Keep in mind that triggers add overhead to even the simplest of data modification operations in your database and are a classic item to look at for performance issues. You should implement triggers sparingly and implement only triggers that are “appropriate” for the level of integrity or activity required by your applications, and no more than is necessary. Also, you need to be careful to keep the code within your triggers as efficient as possible so the impact on your data modifications is kept to a minimum. For more information on coding and using triggers, see Chapter 30, “Creating and Managing Triggers.”
Primary keys/foreign keys—For OLTP and normalized table designs, you need to utilize explicit primary key and foreign key constraints where possible. For many read-only tables, you may not even have to specify a primary key or foreign key at all. In fact, you will often be penalized with poorer load times or bulk update performance on tables that are used mostly as lookup tables. SQL Server must invoke and enforce integrity constraints if they are defined. If you don’t absolutely need them (such as with read-only tables), don’t specify them.
Table allocations—When creating tables, you should consider using the fill factor (free space) options (when you have a clustered index) to correspond to the volatility of the updates, inserts, and deletes that will be occurring in the table. Fill factor leaves free space in the index and data pages, allowing room for subsequent inserts without incurring a page split. You should avoid page splits as much as possible because they increase the I/O cost of insert and update operations. For more information on fill factor and page splits, see Chapter 34, “Data Structures, Indexes, and Performance.”
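A fill factor is supplied when the index is built; the table name and the 80% value below are illustrative and should be tuned to the table's actual update rate:

```sql
-- Leave 20% free space in the clustered index pages of a volatile table
-- to absorb subsequent inserts without page splits
CREATE CLUSTERED INDEX CIX_Orders_OrderDate
ON dbo.Orders (OrderDate)
WITH (FILLFACTOR = 80);
```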
Table partitioning—It can be extremely powerful to segregate a table’s data into physical partitions that are accessed via some natural subsetting such as date or key range. Queries that can take advantage of partitions can help reduce I/O by searching only the appropriate partitions rather than the entire table. For more information on table partitioning, see Chapters 24, “Creating and Managing Tables,” and 34.
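Partitioning by a date range takes three steps: a partition function defining the boundaries, a partition scheme mapping partitions to filegroups, and a table created on the scheme. The boundary dates, names, and the single-filegroup mapping below are all illustrative:

```sql
-- Yearly boundaries; RANGE RIGHT puts each boundary date in the later partition
CREATE PARTITION FUNCTION pfOrderYear (datetime)
AS RANGE RIGHT FOR VALUES ('2011-01-01', '2012-01-01', '2013-01-01');

-- All partitions on PRIMARY here for simplicity; production schemes
-- typically spread partitions across filegroups
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

-- Date-range queries against this table can touch a single partition
CREATE TABLE dbo.OrderHistory (
    OrderID   int NOT NULL,
    OrderDate datetime NOT NULL
) ON psOrderYear (OrderDate);
```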
Purge/archive strategy—You should anticipate the growth of your tables and determine whether a purge/archive strategy will be needed. If you need to archive or purge data from large tables that are expected to continue to grow, it is best to plan for archiving and purging from the beginning. Many times, your archive/purge method may require modifications to your table design to support an efficient archive/purge method. In addition, if you are archiving data to improve performance of your OLTP applications, but the historical data needs to be maintained for reporting purposes, this also often requires incorporating the historical data into your database and application design. It is much easier to build an archive/purge method into your database and application from the start than to have to retrofit something back into an existing system. Performance of the archive/purge process often is better when it’s planned from the beginning as well.
Indexing Guidelines
In general, you need to be sure not to overindex your tables, especially tables that require good performance for data modifications! Common mistakes include creating redundant indexes on primary keys that already have primary key constraints defined or creating multiple indexes with the same set of leading columns. You should understand when an index is required based on need, not just the desire to have an index. Also, you should make sure that the indexes you define have sufficient cardinality to be useful for your queries. In most performance and tuning engagements that we do, we spend a good portion of our time removing indexes or redefining them correctly to better support the queries being executed against the tables. For more information on defining useful indexes and how queries are optimized, see Chapters 34 and 35, “Understanding Query Optimization.”
Following are some indexing guidelines:
Have an indexing strategy that matches the database/table usage; this is paramount. Do not index OLTP tables with a DSS indexing strategy, and vice versa.
For composite indexes, try to keep the more selective columns leftmost in the index.
Be sure to index columns used in joins. Joins are processed inefficiently if no index exists on the columns specified in the join.
Tailor your indexes for your most critical queries and transactions. You cannot index for every possible query that might be run against your tables. However, your applications will perform better if you can identify your critical and most frequently executed queries and design indexes to support them.
Avoid indexes on columns that have poor selectivity. The Query Optimizer is not likely to use them, so they would simply take up space and add unnecessary overhead during inserts, updates, and deletes.
Use clustered indexes when you need to keep your data rows physically sorted in a specific column order. If your data is growing sequentially or is primarily accessed in a particular order (such as range retrievals by date), a clustered index allows you to achieve this more efficiently.
Use nonclustered indexes to provide quicker direct access to data rows than a table scan when searching for data values not defined in your clustered index. Create nonclustered indexes wisely. You can often add a few other data columns to the nonclustered index (at the end of the index definition) to help satisfy SQL queries completely in the index (and not have to read the data page and incur extra I/O). This is termed “covering your query”: all query columns can be satisfied from the index structure.
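The INCLUDE clause is one way to build such a covering index; the table and column names below are illustrative:

```sql
-- The INCLUDE columns are stored only at the leaf level, so the query
-- below can be answered entirely from the index, with no data-page reads
CREATE NONCLUSTERED INDEX IX_Orders_CustID_Cover
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount);

SELECT CustomerID, OrderDate, TotalAmount
FROM dbo.Orders
WHERE CustomerID = 42;
```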
Consider specifying a clustered index fill factor (free space) value to minimize page splits for volatile tables. Keep in mind, however, that the fill factor is lost over time as rows are added to the table and pages fill up. You might need to implement a database maintenance job that runs periodically to rebuild your indexes and reapply the fill factor to the data and index pages.
Be extremely aware of the table/index statistics that the optimizer has available to it. When your table has changed by more than 20% from updates, inserts, or deletes, the data distribution can be affected quite a bit, and the optimizer’s decisions can change greatly. You’ll often want to ensure that the Auto-Update Statistics option is enabled for your databases to help ensure that index statistics are kept up-to-date as your data changes.
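Both the automatic option and a manual refresh are one-liners; the database and table names below are illustrative:

```sql
-- Let SQL Server refresh statistics automatically as data changes
ALTER DATABASE Sales SET AUTO_UPDATE_STATISTICS ON;

-- Or refresh a table's statistics explicitly, such as after a large bulk load
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
```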
View Design Guidelines
In general, you can have as many views as you want. Views are not tables and do not take up any storage space (unless you create an index on the view). They are merely an abstraction for convenience. Except for indexed views, views do not store any data; the results of a view are materialized at the time the query is run against the view, and the data is retrieved from the underlying tables. Views can be used to hide complex queries, can be used to control data access, and can be used in the same place as a table in the FROM clause of any SQL statement.
Following are some view design guidelines:
Use views to hide tables that change their structure often. By using views to provide a stable data access view to your application, you can greatly reduce programming changes.
Utilize views to control security and control access to table data at the data value level.
Be careful of overusing views containing complex multitable queries, especially code that joins such views together. When the query is materialized, what may appear as a simple join between two or three views can result in an expensive join between numerous tables, sometimes including joins to a single table multiple times.
Use indexed views to dramatically improve performance for data accesses done via views. Essentially, SQL Server creates an indexed lookup via the view to the underlying table’s data. There is storage and overhead associated with these views, so be careful when you utilize this performance feature. Although indexed views can help improve the performance of SELECT statements, they add overhead to INSERT, UPDATE, and DELETE statements because the rows in the indexed view need to be maintained as data rows are modified, similar to the maintenance overhead of indexes.
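An indexed view is created with SCHEMABINDING and then materialized by a unique clustered index. The table and columns below are illustrative; note that indexed views carry additional requirements (two-part names, COUNT_BIG(*) alongside aggregates, and specific SET options at creation time):

```sql
-- Materialize a per-customer sales aggregate as an indexed view
CREATE VIEW dbo.vSalesByCustomer
WITH SCHEMABINDING
AS
SELECT CustomerID,
       SUM(TotalAmount) AS TotalSales,
       COUNT_BIG(*)     AS OrderCount   -- required when the view aggregates
FROM dbo.Orders
GROUP BY CustomerID;
GO

-- The unique clustered index is what physically stores the view's rows
CREATE UNIQUE CLUSTERED INDEX IXV_SalesByCustomer
ON dbo.vSalesByCustomer (CustomerID);
```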
For more information on creating and using views, see Chapter 27, “Creating and
Managing Views.”
Transact-SQL Guidelines
Overall, how you write your Transact-SQL (T-SQL) code can have one of the greatest impacts on your SQL Server performance. Regardless of how well you’ve optimized your server configuration and database design, poorly written and inefficient SQL code still results in poor performance. The following sections list some general guidelines to help you write efficient, faster-performing code.
General T-SQL Coding Guidelines
Use IF EXISTS instead of SELECT COUNT(*) when checking only for the existence of any matching data values. IF EXISTS stops the processing of the SELECT query as soon as the first matching row is found, whereas SELECT COUNT(*) continues searching until all matches are found, wasting I/O and CPU cycles.
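The two forms side by side (table and predicate are illustrative):

```sql
-- Stops at the first matching row
IF EXISTS (SELECT * FROM dbo.Orders WHERE CustomerID = 42)
    PRINT 'Customer has orders';

-- Counts every matching row just to test for existence
IF (SELECT COUNT(*) FROM dbo.Orders WHERE CustomerID = 42) > 0
    PRINT 'Customer has orders';
```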
Using EXISTS/NOT EXISTS in a subquery is preferable to IN/NOT IN for sets that are queried. As the potential target size of the set used in the IN gets larger, the performance benefit increases.
Avoid unnecessary ORDER BY or DISTINCT clauses. Unless the Query Optimizer determines that the rows will be returned in sorted order or all rows are unique, these operations require a worktable for processing the results, which incurs extra overhead and I/O. Avoid these operations if it is not imperative for the rows to be returned in a specific order or if it’s not necessary to eliminate duplicate rows.
Use UNION ALL instead of UNION if you do not need to eliminate duplicate result rows from the result sets being combined with the UNION operator. The UNION statement has to combine the result sets into a worktable to remove any duplicate rows from the result set. UNION ALL simply concatenates the result sets together, without the overhead of putting them into a worktable to remove duplicate rows.
Use table variables instead of temporary tables whenever possible or feasible. Table variables are memory resident and do not incur the I/O overhead and system table and I/O contention that can occur in tempdb with normal temporary tables.
If you need to use temporary tables, keep them as small as possible so they are created and populated more quickly, use less memory, and incur less I/O. Select only the required columns rather than using SELECT *, and retrieve only the rows from the base table that you actually need to reference. The smaller the temporary table, the faster it is to create and access.
If a temporary table is of sufficient size and will be accessed multiple times, it is often cost effective to create an index on it on the column(s) that will be referenced in the search arguments (SARGs) of queries against the temporary table. Do this only if the time it takes to create the index plus the time the queries take to run using the index is less than the total time it takes the queries against the temporary table to run without the index.
Avoid unnecessary function executions. If you call a SQL Server function (for example, getdate()) repeatedly within T-SQL code, consider using a local variable to hold the value returned by the function and use the local variable repeatedly throughout your SQL statements rather than repeatedly executing the SQL Server function. This saves CPU cycles within your T-SQL code.
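A minimal sketch of this pattern (the tables are illustrative):

```sql
-- Call getdate() once and reuse the captured value
DECLARE @now datetime;
SET @now = getdate();

UPDATE dbo.Orders     SET ModifiedDate = @now WHERE OrderID = 1001;
UPDATE dbo.OrderAudit SET AuditDate    = @now WHERE OrderID = 1001;
```

Reusing the variable also guarantees every statement sees the same timestamp, which repeated getdate() calls do not.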
Try to use set-oriented operations instead of cursor operations whenever possible and feasible. SQL Server is optimized for set-oriented operations, so they are almost always faster than cursor operations performing the same task. However, one potential exception to this rule is when performing a large set-oriented operation leads to locking concurrency issues. Even though a single update runs faster than a cursor, while it is running, the single update might end up locking the entire table, or large portions of the table, for an extended period of time. This would prevent other users from accessing the table during the update. If concurrent access to the table is more important than the time it takes for the update itself to complete, you might want to consider using a cursor.
Consider using the MERGE statement introduced in SQL Server 2008 when you need to perform multiple modifications against a table (UPDATE, INSERT, or DELETE) because it enables you to perform these operations in a single pass of the table rather than performing a separate pass for each operation.
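A typical insert-or-update ("upsert") in one pass, with illustrative table and column names:

```sql
-- One scan of the target handles both the updates and the inserts
MERGE dbo.Customers AS target
USING dbo.CustomerStaging AS source
    ON target.CustomerID = source.CustomerID
WHEN MATCHED THEN
    UPDATE SET target.Name = source.Name
WHEN NOT MATCHED THEN
    INSERT (CustomerID, Name)
    VALUES (source.CustomerID, source.Name);   -- MERGE requires a terminating semicolon
```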
Consider using the OUTPUT clause to return results from INSERT, UPDATE, or DELETE statements rather than having to perform a separate lookup against the table.
Use search arguments that can be effectively optimized by the Query Optimizer. Try to avoid using any negative logic in your SARGs (for example, !=, <>, NOT IN) or performing operations on, or applying functions to, the columns in the SARG. Avoid using expressions in your SARGs where the search value cannot be evaluated until runtime (such as local variables, functions, and aggregations in subqueries) because the optimizer cannot accurately determine the number of matching rows; it doesn’t have a value to compare against the histogram values during query optimization. Consider putting such queries into stored procedures and passing in the value of the expression as a parameter, because SQL Server evaluates the value of a parameter prior to optimizing the stored procedure.
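A common case of a function applied to a SARG column, and its sargable rewrite (table and dates illustrative):

```sql
-- Non-sargable: the function on the column prevents an index seek on OrderDate
SELECT OrderID FROM dbo.Orders
WHERE YEAR(OrderDate) = 2012;

-- Sargable rewrite: the bare column can be matched against an index
SELECT OrderID FROM dbo.Orders
WHERE OrderDate >= '2012-01-01' AND OrderDate < '2013-01-01';
```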
Avoid data type mismatches on join columns.
Avoid writing large complex queries whenever possible. Complex queries with a large number of tables and join conditions can take a long time to optimize. It may not be possible for the Query Optimizer to analyze the entire set of plan alternatives, and it is possible that a suboptimal query plan could be chosen. Typically, if a query involves more than 12 tables, it is likely that the Query Optimizer will have to rely on heuristics and shortcuts to generate a query plan and may miss some optimal strategies.
For more tips and information on coding effective and efficient queries, see Chapters 43, “Transact-SQL Programming Guidelines, Tips, and Tricks,” and 35.
Stored Procedure Guidelines
Use stored procedures for SQL execution from your applications. Stored procedure execution can be more efficient than ad hoc SQL due to reduced network traffic and query plan caching for stored procedures.
Use stored procedures to make your database something of a “black box” as far as your application code is concerned. If all database access is managed through stored procedures, the applications are shielded from possible changes to the underlying database structures. You can simply modify the existing stored procedures to reflect changes to the database structures without requiring any changes to the front-end application code.
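A thin procedure of this kind might look as follows; the names and columns are illustrative:

```sql
-- The application calls this procedure and never references dbo.Orders directly
CREATE PROCEDURE dbo.GetCustomerOrders
    @CustomerID int   -- matches the CustomerID column's data type exactly
AS
BEGIN
    SET NOCOUNT ON;
    SELECT OrderID, OrderDate, TotalAmount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
END;
```

If the underlying table is later restructured, only the procedure body changes; every caller keeps working unchanged.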
Ensure that your parameter data types match the column data types they are being compared against to avoid data type mismatches and poor query optimization.