CHAPTER 34 Data Structures, Indexes, and Performance
Table 34.11 details some other differences between the missing indexes feature and
Database Engine Tuning Advisor.
TABLE 34.11 Differences Between Missing Index Features and Database Engine Tuning Advisor
Comparison Point | Missing Indexes Feature | Database Engine Tuning Advisor
Execution method | Server side, always on | Client-side, standalone application
Scope of analysis | Quick, ad hoc analysis, providing limited information about missing indexes only | Thorough workload analysis, providing full recommendation report about the best physical database design configuration in the context of a submitted workload
Statements analyzed | SELECT statements only | SELECT, UPDATE, INSERT, and DELETE
Available disk storage space | Not factored into analysis | Factored into analysis; recommendations are not provided if they would exceed available storage space
Columns ordering | Recommended index column order not provided | Optimal index column order determined based on query execution cost
Index type | Nonclustered only | Both clustered and nonclustered index recommendations provided
Indexed views recommendations | Not provided | Recommended in supported editions
Partitioning recommendations | Not provided | Recommended in supported editions
Impact analysis | An approximate impact of adding a missing index is reported via the sys.dm_db_missing_index_group_stats dynamic management view (DMV) | Up to 15 different analysis reports generated to provide information about the impact of implementing recommendations
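As a rough illustration of the impact information mentioned in the last row of Table 34.11, the following sketch joins sys.dm_db_missing_index_group_stats to the related missing index DMVs for the current database. The ordering expression is only one possible heuristic for ranking suggestions and is an assumption of this example, not a documented formula.

```sql
-- Sketch: list missing index suggestions for the current database,
-- ranked by a simple (assumed) heuristic built from the reported statistics.
SELECT TOP 20
       d.statement          AS TableName,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.user_scans,
       s.avg_total_user_cost,
       s.avg_user_impact     -- approximate percentage benefit reported by SQL Server
FROM sys.dm_db_missing_index_group_stats AS s
     JOIN sys.dm_db_missing_index_groups AS g
         ON s.group_handle = g.index_group_handle
     JOIN sys.dm_db_missing_index_details AS d
         ON g.index_handle = d.index_handle
WHERE d.database_id = DB_ID()
ORDER BY s.avg_total_user_cost * s.avg_user_impact * (s.user_seeks + s.user_scans) DESC;
```

Because the feature does not recommend a column order, the equality and inequality column lists should be treated as candidates to evaluate rather than as a finished index definition.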
Identifying Unused Indexes
As mentioned previously in this chapter, each index on a table adds additional overhead
for data modifications because the indexes also need to be maintained as changes are made
to index key columns. In an OLTP environment, excessive indexes on your tables can be
almost as much of a performance issue as missing indexes. To improve OLTP performance,
you should limit the number of indexes on your tables to only those absolutely needed;
you definitely should eliminate any unnecessary and unused indexes that may be defined
on your tables to eliminate the overhead they introduce.
Fortunately, SQL Server provides a DMV that you can use to identify which indexes in
your database are not being used: sys.dm_db_index_usage_stats. The columns in the
sys.dm_db_index_usage_stats DMV are shown in Table 34.12.
TABLE 34.12 Columns in the sys.dm_db_index_usage_stats DMV
Column Name Description
database_id ID of the database on which the table or view is defined
object_id ID of the table or view on which the index is defined
index_id ID of the index
user_seeks Number of seeks by user queries
user_scans Number of scans by user queries
user_lookups Number of bookmark lookups by user queries
user_updates Number of updates by user queries
last_user_seek Time of last user seek
last_user_scan Time of last user scan
last_user_lookup Time of last user lookup
last_user_update Time of last user update
system_seeks Number of seeks by system queries
system_scans Number of scans by system queries
system_lookups Number of lookups by system queries
system_updates Number of updates by system queries
last_system_seek Time of last system seek
last_system_scan Time of last system scan
Every individual seek, scan, lookup, or update on an index by a query execution is
counted as a use of that index, and the corresponding counter in the view is incremented.
Thus, you can run a query against this DMV to see whether there are any indexes that
your queries are not using, that is, indexes that either have no rows in the DMV or have 0
values in the user_seeks, user_scans, and user_lookups columns (or the time values of the
last_user_* columns are significantly in the past). You especially need to focus on any
indexes that don’t show any user query activity but do have a high value in the
user_updates column. This indicates an index that’s adding significant update
overhead but not being used by any queries for locating data rows.
For example, the query shown in Listing 34.9 returns all indexes in the current database
that have not been accessed since SQL Server was last restarted; that is, they have no
rows at all in the sys.dm_db_index_usage_stats DMV.
LISTING 34.9 A Query for Unused Indexes
SELECT convert(varchar(12), OBJECT_SCHEMA_NAME(I.OBJECT_ID)) AS SchemaName,
       convert(varchar(20), OBJECT_NAME(I.OBJECT_ID)) AS ObjectName,
       convert(varchar(30), I.NAME) AS IndexName
FROM sys.indexes I
WHERE -- only get indexes for user created tables
      OBJECTPROPERTY(I.OBJECT_ID, 'IsUserTable') = 1
      -- ignore heaps
      and I.index_id > 0
      -- find all indexes that exist but are NOT used
      AND NOT EXISTS (
          SELECT index_id
          FROM sys.dm_db_index_usage_stats
          WHERE OBJECT_ID = I.OBJECT_ID
            AND I.index_id = index_id
            AND database_id = DB_ID())
ORDER BY SchemaName, ObjectName, IndexName
Also, you should be aware that the information is reported in the DMV both for
operations caused by user-submitted queries and for operations caused by internally generated
queries, such as scans for gathering statistics. If you run UPDATE STATISTICS on a table, the
sys.dm_db_index_usage_stats DMV will have a row for each index for the system scan
performed by the UPDATE STATISTICS command. However, the index may still be unused
by any queries in your applications. Consequently, you might want to modify the
previous query to look for indexes with no values (NULL) in the last_user_* columns instead
of indexes with no row at all in the DMV. Listing 34.10 provides an alternative query.
LISTING 34.10 A Query for Indexes Unused by Application Queries
SELECT convert(varchar(12), OBJECT_SCHEMA_NAME(I.OBJECT_ID)) AS SchemaName,
       convert(varchar(20), OBJECT_NAME(I.OBJECT_ID)) AS ObjectName,
       convert(varchar(30), I.NAME) AS IndexName
FROM sys.indexes I
     LEFT OUTER JOIN
     sys.dm_db_index_usage_stats u
     -- match on both object and index; index_id alone is not unique across tables
     on I.object_id = u.object_id
     and I.index_id = u.index_id
     and u.database_id = DB_ID()
WHERE -- only get indexes for user created tables
      OBJECTPROPERTY(I.OBJECT_ID, 'IsUserTable') = 1
      -- ignore heaps
      and I.index_id > 0
      -- find all indexes that exist but are NOT used by user queries
      and isnull(u.last_user_seek, 0) = 0
      and isnull(u.last_user_scan, 0) = 0
      and isnull(u.last_user_lookup, 0) = 0
ORDER BY SchemaName, ObjectName, IndexName
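The earlier point about indexes that incur update overhead without serving any reads can be sketched as a separate query. This is an illustrative variation rather than one of the chapter's listings; it uses only columns described in Table 34.12 and orders the candidates by how many updates each index has absorbed.

```sql
-- Sketch: indexes that have been maintained by user updates
-- but never used by a user seek, scan, or lookup since the DMV was last cleared.
SELECT OBJECT_SCHEMA_NAME(i.object_id) AS SchemaName,
       OBJECT_NAME(i.object_id)        AS ObjectName,
       i.name                          AS IndexName,
       u.user_updates,
       u.last_user_update
FROM sys.indexes AS i
     JOIN sys.dm_db_index_usage_stats AS u
         ON u.object_id = i.object_id
        AND u.index_id = i.index_id
        AND u.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.index_id > 0                                   -- ignore heaps
  AND u.user_seeks + u.user_scans + u.user_lookups = 0 -- no read activity
  AND u.user_updates > 0                               -- but maintenance cost
ORDER BY u.user_updates DESC;
```

Indexes that enforce primary key or unique constraints can appear in this result even though they are still needed for data integrity, so the output is a list of candidates to review, not a drop list.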
Note that the information returned by sys.dm_db_index_usage_stats is useful only if
your server has been running long enough and has processed a sufficient amount of your
standard and peak workflow. Also, you should be aware that the data in the DMV is
cleared each time SQL Server is restarted, or if a database is detached and reattached. To
prevent losing useful information, you might want to create a scheduled job that
periodically queries the DMVs and saves the information to your own tables so you can track the
information over time for more thorough and complete analysis.
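One possible shape for such a job step is sketched below. The table name dbo.index_usage_history and its column list are hypothetical choices for this example; any schema that records a capture time alongside the DMV columns of interest would serve the same purpose.

```sql
-- Hypothetical history table (name and columns are assumptions for illustration)
IF OBJECT_ID('dbo.index_usage_history', 'U') IS NULL
    CREATE TABLE dbo.index_usage_history (
        capture_time     datetime NOT NULL,
        database_id      smallint NOT NULL,
        object_id        int      NOT NULL,
        index_id         int      NOT NULL,
        user_seeks       bigint   NOT NULL,
        user_scans       bigint   NOT NULL,
        user_lookups     bigint   NOT NULL,
        user_updates     bigint   NOT NULL,
        last_user_seek   datetime NULL,
        last_user_scan   datetime NULL,
        last_user_lookup datetime NULL,
        last_user_update datetime NULL
    );
GO

-- Statement to run on a schedule (for example, from a SQL Server Agent job step):
-- copy the current counters for this database, stamped with the capture time.
INSERT INTO dbo.index_usage_history
       (capture_time, database_id, object_id, index_id,
        user_seeks, user_scans, user_lookups, user_updates,
        last_user_seek, last_user_scan, last_user_lookup, last_user_update)
SELECT GETDATE(), database_id, object_id, index_id,
       user_seeks, user_scans, user_lookups, user_updates,
       last_user_seek, last_user_scan, last_user_lookup, last_user_update
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID();
```

The counters in the DMV are cumulative since the data was last cleared, so usage for a given period is the difference between two snapshots; a drop in a counter between snapshots suggests a restart occurred in between.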
Summary
One of the most important aspects of improving SQL Server performance is proper table
and index design. Choosing the appropriate indexes for SQL Server to use to process
queries involves thoroughly understanding the queries and transactions being run against
the database, understanding the bias of the data, understanding how SQL Server uses
indexes, and staying aware of the performance implications of overindexing tables in an
OLTP environment. In general, you should consider using clustered indexes to support
range retrievals or when data needs to be sorted in clustered index order; you should use
nonclustered indexes for single- or discrete-row retrievals or when you can take advantage
of index covering.
To really make good index design choices, you should have an understanding of the SQL
Server Query Optimizer to know how it uses indexes and index statistics to develop query
plans. This would be a good time to continue on and read Chapter 35.
Understanding Query Optimization
What’s New in Query Optimization
What Is the Query Optimizer?
Query Compilation and Optimization
Query Analysis
Row Estimation and Index Selection
Join Selection
Execution Plan Selection
Query Plan Caching
Other Query Processing Strategies
Parallel Query Processing
Common Query Optimization Problems
Managing the Optimizer
Query optimization is the process SQL Server goes
through to analyze individual queries and determine the
best way to process them. To achieve this end, SQL Server
uses a cost-based Query Optimizer. As a cost-based
optimizer, its purpose is to determine
the query plan that will access the data with the least
amount of processing time in terms of CPU and logical and
physical I/O. The Query Optimizer examines the parsed
SQL queries and, based on information about the objects
involved (for example, number of pages in the table, types
of indexes defined, index statistics), generates a query plan.
The query plan is the set of steps to be carried out to
execute the query.
To allow the Query Optimizer to do its job properly, you
need to have a good understanding of how the Query
Optimizer determines query plans for queries. This
knowledge will help you to understand what types of queries can
be optimized effectively and to learn techniques to help the
Query Optimizer choose the best query plan. It will also
help you write better queries, choose better
indexes, and detect potential performance problems.
NOTE
To better understand the concepts presented in this
chapter, you should have a reasonable understanding
of how data structures and indexes affect
performance. If you haven’t already read Chapter 34, “Data
Structures, Indexes, and Performance,” it is recommended that you review it before continuing.
CHAPTER 35 Understanding Query Optimization
NOTE
Occasionally throughout this chapter, graphical execution plans are used to illustrate
some of the principles discussed. Chapter 36, “Query Analysis,” provides a more
detailed discussion of the graphical execution plan output and describes the
information contained in the execution plans and how to interpret it. In this chapter, the
execution plans are provided primarily to give you an idea of what you can expect to see for
the different types of queries presented when you are doing your own query analysis.
What’s New in Query Optimization
SQL Server 2008 introduces a few new features and capabilities related to query
optimization and query performance in an attempt to deliver on the theme of “predictable
performance.” The primary new features and enhancements are as follows:
• An enhancement has been added to the OPTIMIZE FOR query hint option to include
  a new UNKNOWN option, which specifies that the Database Engine use statistical data
  to determine the values for one or more local variables during query optimization,
  instead of the initial values.
• Table hints can now be specified as query hints in the context of plan guides to
  provide advanced query performance tuning options.
• A new FORCESEEK table hint has been added. This hint specifies that the query
  optimizer should use only an index seek operation as the access path to the data
  referenced in the query.
• Hash values are available for finding and tuning similar queries. The
  sys.dm_exec_query_stats and sys.dm_exec_requests dynamic management views provide
  query hash and query plan hash values that you can use to help determine the aggregate
  resource usage for similar queries and similar query execution plans. This can help
  you find and tune similar queries that individually consume minimal system
  resources but collectively consume significant system resources.
• The new filtered indexes feature in SQL Server 2008 is considered for estimating
  index usefulness.
• Parallel query processing on partitioned objects has been improved.
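A few of the features in the list above can be sketched in T-SQL. The first two statements assume that bigpubs2008 follows the familiar pubs layout, with a sales table containing stor_id, ord_date, and qty columns and an index that supports a seek on stor_id; those names and the literal store ID are assumptions of this example. The third statement uses only columns of sys.dm_exec_query_stats.

```sql
-- OPTIMIZE FOR UNKNOWN: optimize using statistical data
-- rather than the initial value of the local variable.
DECLARE @stor_id char(4);
SET @stor_id = '7066';           -- illustrative value only

SELECT stor_id, ord_date, qty
FROM dbo.sales                   -- assumed table and columns
WHERE stor_id = @stor_id
OPTION (OPTIMIZE FOR (@stor_id UNKNOWN));

-- FORCESEEK: restrict the access path for this table to an index seek.
-- The statement fails if no index on the table can support a seek.
SELECT stor_id, ord_date, qty
FROM dbo.sales WITH (FORCESEEK)
WHERE stor_id = '7066';

-- Query hash: aggregate resource usage across similar cached statements.
SELECT TOP 10
       query_hash,
       COUNT(*)                 AS cached_statements,
       SUM(execution_count)     AS total_executions,
       SUM(total_worker_time)   AS total_cpu_time,      -- reported in microseconds
       SUM(total_logical_reads) AS total_logical_reads
FROM sys.dm_exec_query_stats
GROUP BY query_hash
ORDER BY SUM(total_worker_time) DESC;
```

Grouping by query_plan_hash instead of query_hash gives the corresponding view for statements that share a similar execution plan.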
One of the key improvements in SQL Server 2008 is the simplification of the creation and
use of plan guides:
• The sp_create_plan_guide stored procedure now accepts XML execution plan
  output directly via the @hints parameter instead of having to embed the output in
  the USE PLAN hint.
• A new stored procedure, sp_create_plan_guide_from_handle, allows you to create
  one or more plan guides from an existing query plan in the plan cache.
• A new system function, sys.fn_validate_plan_guide, enables you to validate a
  plan guide.
• New SQL Profiler event classes, Plan Guide Successful and Plan Guide Unsuccessful,
  enable you to verify whether plan guides are being used by the Query Optimizer.
• New Performance Monitor counters in the SQL Server: SQL Statistics object, Guided
  Plan Executions/sec and Misguided Plan Executions/sec, can be used to monitor the
  number of plan executions in which the query plan has been successfully or
  unsuccessfully generated by using a plan guide.
• Built-in support is now available for creating, deleting, enabling, disabling, or
  scripting plan guides in SQL Server Management Studio (SSMS). Plan guides now are
  located in the Programmability folder in Object Explorer.
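The plan guide procedures and function named above might be used together roughly as follows. The query text filter and the plan guide name SalesByStoreGuide are hypothetical values chosen for this sketch; the procedure, function, and DMV names are the ones SQL Server 2008 provides.

```sql
-- Locate a cached plan for a statement of interest.
-- The LIKE pattern is a hypothetical example; adjust it to match your own query text.
DECLARE @plan_handle varbinary(64);
DECLARE @offset int;

SELECT TOP (1)
       @plan_handle = qs.plan_handle,
       @offset = qs.statement_start_offset
FROM sys.dm_exec_query_stats AS qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'SELECT stor_id, ord_date, qty%';

-- Create a plan guide from that cached plan, if one was found.
IF @plan_handle IS NOT NULL
    EXECUTE sp_create_plan_guide_from_handle
        @name = N'SalesByStoreGuide',        -- hypothetical plan guide name
        @plan_handle = @plan_handle,
        @statement_start_offset = @offset;

-- Validate all plan guides in the current database.
-- Rows are returned only for plan guides that fail validation.
SELECT pg.plan_guide_id, pg.name, v.msgnum, v.severity, v.state, v.message
FROM sys.plan_guides AS pg
     CROSS APPLY sys.fn_validate_plan_guide(pg.plan_guide_id) AS v;
```

Running the validation query after schema changes, such as dropping an index that a guided plan depends on, is one way to find plan guides that can no longer be honored.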
NOTE
Many of the internals of the Query Optimizer and its costing algorithms are considered
proprietary and have not been made public. Much of the information provided here is
based on analysis and observation of query plans generated for various queries and
search values.
The intent of this chapter is therefore not so much to describe the specific steps,
algorithms, and calculations implemented by the Query Optimizer, but rather to provide a
general overview of the query optimization process in SQL Server 2008 and what goes
into estimating and determining an efficient query plan. Also, there are a number of
possible ways SQL Server can optimize and process queries. The examples presented
in this chapter focus on some of the more common optimization strategies.
What Is the Query Optimizer?
For any given SQL statement, the source tables can be accessed in many ways to return
the desired result set. The Query Optimizer analyzes the possible ways the result set can
be generated and chooses the most appropriate method, called the query plan or execution
plan. Because SQL Server uses a cost-based Query Optimizer, it assigns a cost to
each candidate execution plan it considers in terms of CPU resource usage and page I/O. The
Query Optimizer then chooses the execution plan with the lowest associated cost.
Thus, the primary goal of the Query Optimizer is to find the least expensive execution
plan that minimizes the total time required to process a query. Because I/O is the most
significant factor in query processing time, the Query Optimizer analyzes the query and
primarily searches for access paths and techniques to minimize the number of logical and
physical page accesses as much as possible. The lower the number of logical and physical
currently employs. This chapter is intended to help you better understand some of the
concepts related to how the Query Optimizer chooses an execution strategy and provide
an overview of the query optimization strategies employed to improve query processing
performance.
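To observe the cost estimates and page accesses discussed in this section, you can ask SQL Server to display them for a query. The sketch below assumes the pubs-style sales table used elsewhere in these examples (the table, columns, and store ID are assumptions); the SET options themselves are standard. GO is the batch separator recognized by tools such as SSMS and sqlcmd, and SET SHOWPLAN_ALL must be the only statement in its batch.

```sql
-- Display the estimated plan, including the TotalSubtreeCost column,
-- without actually executing the query.
SET SHOWPLAN_ALL ON;
GO
SELECT stor_id, ord_date, qty
FROM dbo.sales                -- assumed table and columns
WHERE stor_id = '7066';       -- illustrative value only
GO
SET SHOWPLAN_ALL OFF;
GO

-- Execute the query and report logical and physical reads per table.
SET STATISTICS IO ON;
SELECT stor_id, ord_date, qty
FROM dbo.sales
WHERE stor_id = '7066';
SET STATISTICS IO OFF;
GO
```

Comparing the logical reads reported for two formulations of the same query is a simple way to see which access path the Query Optimizer found cheaper.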
Query Compilation and Optimization
Query compilation is the complete process from the submission of a query to its actual
execution. There are many steps involved in query compilation, one of which is
optimization. All T-SQL statements are compiled, but not all are optimized. Primarily, only the
standard SQL Data Manipulation Language (DML) statements (SELECT, INSERT, UPDATE,
and DELETE) require optimization. The other procedural constructs in T-SQL (IF, WHILE,
local variables, and so on) are compiled as procedural logic but do not require
optimization. DML statements are set-oriented requests that the Query Optimizer must translate
into procedural code that can be executed efficiently to return the desired results.
NOTE
SQL Server also optimizes some Data Definition Language (DDL) statements, such as
CREATE INDEX or ALTER TABLE, against the data tables. For example, a displayed
query plan for the creation of an index shows optimization steps for accessing the
table, sorting data, and inserting into the index tree. However, the focus in this chapter
is on optimization of DML statements.
Compiling DML Statements
When SQL Server compiles an execution plan for a DML statement, it performs the
following basic steps:
1. The query is parsed and checked for proper syntax, and the T-SQL statements are
   parsed into keywords, expressions, operators, and identifiers to generate a query tree.
   The query tree (sometimes referred to as the sequence tree) is an internal format of
   the query that SQL Server can operate on. It is essentially the logical steps needed to
   transform the query into the desired result.
2. The query tree is then normalized and simplified. During normalization, the tables
   and columns are verified, and the metadata (data types, null properties, index
   statistics, and so on) about them is retrieved. In addition, any views are resolved to
   their underlying tables, and implicit conversions are performed (for example, an
   integer compared with a float value). Also during this phase, any redundant
   operations (for example, unnecessary or redundant joins) are removed, and the query
   tree is simplified.
3. The Query Optimizer analyzes the different ways the source tables can be accessed
   and selects the series of steps that return the results fastest while typically using the
   fewest resources. The query tree is updated with the optimized series of steps, and an
   optimized execution plan is generated.
4. After the optimized execution plan is generated, SQL Server stores the optimized
   plan in the procedure cache.
5. SQL Server reads the execution plan from the procedure cache and executes the
   query plan, returning the result set (if any) to the client.
The optimized execution plan is then left in the procedure cache. If the same query or
stored procedure is executed again and the plan is still available in the procedure cache,
the steps to optimize and generate the execution plan are skipped, and the stored query
execution plan is reused to execute the query or stored procedure.
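Plan reuse of this kind can be observed directly in the plan cache. The following sketch lists the cached plans that have been reused most often; it relies only on the sys.dm_exec_cached_plans DMV and the sys.dm_exec_sql_text function.

```sql
-- Sketch: show the most frequently reused plans in the procedure (plan) cache.
SELECT TOP 20
       cp.usecounts,        -- number of times the cached plan has been used
       cp.cacheobjtype,     -- for example, Compiled Plan
       cp.objtype,          -- for example, Adhoc, Prepared, or Proc
       st.text              -- the T-SQL text the plan was compiled for
FROM sys.dm_exec_cached_plans AS cp
     CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;
```

Running the same query or stored procedure twice and watching usecounts increase is a quick way to confirm that the optimization steps were skipped on the second execution.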
Optimization Steps
When the query tree is passed to the Query Optimizer, the Query Optimizer performs a
series of steps to break down the query into its component pieces for analysis to generate
an optimal execution plan:
1. Query analysis: The query is analyzed to determine search arguments and join
   clauses. A search argument is defined as a WHERE clause that compares a column to a
   constant. A join clause is a WHERE clause that compares a column from one table to a
   column from another table.
2. Row estimation and index selection: Indexes are selected based on search
   arguments and join clauses (if any exist). Indexes are evaluated based on their
   distribution statistics and are assigned a cost.
3. Join selection: The join order is evaluated to determine the most appropriate order
   in which to access tables. In addition, the Query Optimizer evaluates the most
   appropriate join algorithm to match the data.
4. Execution plan selection: Execution costs are evaluated, and a query execution plan
   is created that represents the most efficient solution found by the optimizer.
The next four sections of this chapter examine each of these steps in more detail
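As a small illustration of the terms used in the query analysis step, the following query contains one join clause and one search argument. It assumes that bigpubs2008 keeps the pubs-style titles and sales tables with the columns shown; those names are assumptions of this sketch.

```sql
-- Assumed schema: titles(title_id, title, price) and sales(title_id, qty)
SELECT t.title, s.qty
FROM dbo.titles AS t, dbo.sales AS s
WHERE t.title_id = s.title_id   -- join clause: column compared to a column in another table
  AND t.price > 20;             -- search argument: column compared to a constant
```

The same join condition written in an ON clause with the JOIN keyword is analyzed in the same way; the comma-style form is used here only to mirror the definitions given in step 1.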
NOTE
Unless stated otherwise, the examples presented in this chapter operate on the tables
in the bigpubs2008 database. A copy of the bigpubs2008 database is available on the
CD included with this book. Instructions on how to install the database are presented
in the Introduction.
Query Analysis