FIGURE 25.2 A simplified diagram of a nonclustered index (levels shown: root page, intermediate page, leaf page, and data page)
A nonclustered index is also structured as a B-tree. Figure 25.2 shows a simplified diagram
of a nonclustered index defined on a first name column.
As with a clustered index, in a nonclustered index, all index key values are stored in the
nonclustered index levels in sorted order, based on the index key(s). This sort order is
typically different from the sort order of the table itself. The main difference between a
nonclustered index and a clustered index is that the leaf row of a nonclustered index is
independent of the data rows in the table. The leaf level of a nonclustered index contains
a row for every data row in the table, along with a pointer to locate the data row. This
pointer is either the clustered index key for the data row, if the table has a clustered index
on it, or the data page ID and row ID of the data row if the table is stored as a heap
structure (that is, if the table has no clustered index defined on it).
To locate a data row via a nonclustered index, SQL Server starts at the root node and
navigates through the appropriate index pages in the intermediate levels of the index until it
reaches the leaf page, which should contain the index key for the desired data row. It then
scans the keys on the leaf page until it locates the desired index key value. SQL Server
then uses the pointer to the data row stored with the index key to retrieve the
corresponding data row.
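Which kind of pointer a nonclustered index stores depends on whether the table is a heap or has a clustered index. As a minimal sketch, assuming the AdventureWorks2008 sample database and its Person.Person table, the following query lists a table's indexes from the sys.indexes catalog view. An index_id of 0 indicates a heap, and an index_id of 1 indicates a clustered index:

```sql
-- Assumes the AdventureWorks2008 sample database; substitute any table name.
SELECT i.index_id,
       i.name      AS index_name,
       i.type_desc AS index_type,   -- HEAP, CLUSTERED, or NONCLUSTERED
       i.is_unique
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'Person.Person')
ORDER BY i.index_id;
```

If the result contains a CLUSTERED row, the nonclustered indexes on that table use the clustered index key as the row pointer. If it contains a HEAP row instead, they use the data page ID and row ID.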
NOTE
For a more detailed discussion of clustered tables versus heap tables (that is, tables
with no clustered indexes) and more detailed descriptions of clustered and
nonclustered index key structures and index key rows, as well as how SQL Server internally
maintains indexes, see Chapter 34.
The efficiency of the index lookup and the types of lookups should drive the selection of
nonclustered indexes. In the book index example, a single page reference is a very simple
lookup for the book reader and requires little work. If, however, many pages are referenced
in the index, and those pages are spread throughout the book, the lookup is no longer
simple, and much more work is required to get all the information.
You should choose your nonclustered indexes with the book index example in mind. You
should consider using nonclustered indexes for the following:
Queries that do not return large result sets
Columns that are frequently used in the WHERE clause that return exact matches
Columns that have many distinct values (that is, high cardinality)
All columns referenced in a critical query (a special nonclustered index called a
covering index that eliminates the need to go to the underlying data pages)
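One way to judge whether a column has many distinct values is to compare its distinct count with the table's row count. The following is a rough sketch, assuming the Person.Person table from the AdventureWorks2008 sample database; the column aliases are illustrative only:

```sql
-- Assumes AdventureWorks2008; replace the table and column with your own.
SELECT COUNT(DISTINCT LastName) AS distinct_values,
       COUNT(*)                 AS total_rows,
       -- A ratio near 1 suggests high cardinality; a ratio near 0 suggests many duplicates.
       CAST(COUNT(DISTINCT LastName) AS decimal(18, 4))
           / NULLIF(COUNT(*), 0) AS distinct_ratio
FROM Person.Person;
```

The ratio is only a starting point. The queries that actually run against the table should still drive the final choice of index columns.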
Having a good understanding of your data access is essential to creating nonclustered
indexes. Fortunately, SQL Server comes with tools such as the SQL Server Profiler and
Database Engine Tuning Advisor that can help you evaluate your data access paths and
determine which columns are the best candidates. SQL Server Profiler is discussed in more detail
in Chapter 6, “SQL Server Profiler.” In addition, Chapter 34 discusses the use of the SQL
Server Profiler and Database Engine Tuning Advisor to assist in developing an optimal
indexing strategy.
Creating Indexes
The following sections examine the most common means for creating indexes in SQL
Server. Microsoft provides several different methods for creating indexes, each of which
has advantages. The method used is often a matter of personal preference, but there are
situations in which a given method has distinct advantages.
Creating Indexes with T-SQL
Transact-SQL (T-SQL) is the most fundamental means for creating an index. This method
was available in all previous versions of SQL Server. It is a very powerful option for
creating indexes because the T-SQL statements that create indexes can be stored in a file and
run as part of a database installation or upgrade. In addition, T-SQL scripts that were used
in prior SQL Server versions to create indexes can be reused with very little change.
You can create indexes by using the T-SQL CREATE INDEX command. Listing 25.1 shows
the basic CREATE INDEX syntax. Refer to SQL Server 2008 Books Online for the full syntax.
LISTING 25.1 CREATE INDEX Syntax
CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name
ON <object> ( column [ ASC | DESC ] [ ,...n ] )
[ INCLUDE ( column_name [ ,...n ] ) ]
[ WHERE <filter_predicate> ]
[ WITH ( <relational_index_option> [ ,...n ] ) ]
Table 25.1 lists the CREATE INDEX arguments.
TABLE 25.1 Arguments for CREATE INDEX
Argument                             Explanation
UNIQUE                               Indicates that no two rows in the index can have the same index key values. Inserts into a table with a UNIQUE index fail if a row with the same value already exists in the table.
CLUSTERED | NONCLUSTERED             Defines the index as clustered or nonclustered. NONCLUSTERED is the default. Only one clustered index is allowed per table.
index_name                           Specifies the name of the index to be created.
object                               Specifies the name of the table or view to be indexed.
column                               Specifies the column or columns that are to be indexed.
ASC | DESC                           Specifies the sort direction for the particular index column. ASC creates an ascending sort order and is the default. The DESC option causes the index to be created in descending order.
INCLUDE ( column_name [ ,...n ] )    Allows a column to be added to the leaf level of an index without being part of the index key. This argument was introduced in SQL Server 2005.
WHERE <filter_predicate>             This argument, new to SQL Server 2008, is used to create a filtered index. The filter_predicate contains a WHERE clause that limits the number of rows in the table that are included in the index.
relational_index_option              Specifies the index option to use when creating the index.
Following is a simple example using the basic syntax of the CREATE INDEX command:
CREATE NONCLUSTERED INDEX [NC_Person_LastName]
ON [Person].[Person]
(
[LastName] ASC
)
This example creates a nonclustered index on the Person.Person table, based on the
LastName column. The NONCLUSTERED and ASC keywords are not necessary because they
are the defaults. Because the UNIQUE keyword is not specified, duplicates are allowed in the
index (that is, multiple rows in the table can have the same LastName).
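Because NONCLUSTERED and ASC are the defaults, the same index can be written more compactly. The following sketch assumes the same AdventureWorks2008 table and reuses the index name from the example above, so it would be run in place of that statement rather than in addition to it:

```sql
-- Equivalent definition that relies on the defaults (NONCLUSTERED, ASC).
-- Run this instead of the longer form; an index name can be used only once per table.
CREATE INDEX NC_Person_LastName
ON Person.Person (LastName);
```

Spelling out the defaults, as the longer form does, is still a reasonable choice in build scripts, where being explicit makes the intent easier to review.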
Unique indexes are more involved because they serve two roles: they provide fast access to
the data via the index’s columns, and they also serve as a constraint by allowing only one
row to exist in a table for the combination of column values in the index. They can be
clustered or nonclustered. Unique indexes are also created on a table whenever you define
a unique or primary key constraint on the table. The following example shows the creation
of a nonclustered unique index:
CREATE UNIQUE NONCLUSTERED INDEX [AK_CreditCard_CardNumber]
ON [Sales].[CreditCard]
(
[CardNumber] ASC
)
This example creates a unique nonclustered index named AK_CreditCard_CardNumber on the
Sales.CreditCard table. This index is based on a single column in the table. After it is
created, this index prevents credit card rows with the same credit card number from being
inserted into the CreditCard table.
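The constraint behavior of a unique index can be observed without touching the sample database. The following is a minimal sketch that uses a hypothetical temporary table (#CardDemo) and a hypothetical index name; neither comes from AdventureWorks. The second INSERT attempts a duplicate value, and the CATCH block reports the resulting error:

```sql
-- Hypothetical scratch table and index names, used only for illustration.
CREATE TABLE #CardDemo (CardNumber nvarchar(25) NOT NULL);

CREATE UNIQUE NONCLUSTERED INDEX AK_CardDemo_CardNumber
ON #CardDemo (CardNumber);

INSERT INTO #CardDemo (CardNumber) VALUES (N'11111111111111');

BEGIN TRY
    -- Inserting the same value again violates the unique index.
    INSERT INTO #CardDemo (CardNumber) VALUES (N'11111111111111');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()  AS error_number,
           ERROR_MESSAGE() AS error_message;
END CATCH;

-- Only the first row was stored.
SELECT COUNT(*) AS rows_in_table FROM #CardDemo;

DROP TABLE #CardDemo;
```

The failed INSERT leaves the table unchanged, so the final count reflects only the first row.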
The relational index options listed in Table 25.2 allow you to define more sophisticated
indexes or specify how an index is to be created.
TABLE 25.2 Relational Index Options for CREATE INDEX
Option                                 Explanation
PAD_INDEX = {ON | OFF}                 Determines whether free space is allocated to the non-leaf-level pages of an index. The percentage of free space is determined by FILLFACTOR.
FILLFACTOR = fillfactor                Determines how full the leaf level of each index page is made when the index is created, and therefore how much free space is left. The fillfactor value is the percentage of each page to fill, from 0 to 100. The default value is 0. If fillfactor is 0 or 100, the index leaf-level pages are filled to capacity, leaving only enough space for at least one more row to be inserted.
SORT_IN_TEMPDB = {ON | OFF}            Specifies whether intermediate sort results that are used to create the index are stored in tempdb. Setting this option to ON can speed up the creation of the index (if tempdb is on a separate disk), but it requires more disk space.
IGNORE_DUP_KEY = {ON | OFF}            Determines whether multirow inserts will fail when duplicate rows in the insert violate a unique index. When this option is set to ON, duplicate key values are ignored, and the rest of the multirow insert succeeds. When it is OFF (the default), the entire multirow insert fails if a duplicate is encountered.
STATISTICS_NORECOMPUTE = {ON | OFF}    Determines whether distribution statistics used by the Query Optimizer are recomputed. When ON, the statistics are not automatically recomputed.
DROP_EXISTING = {ON | OFF}             Determines whether an index with the same name is dropped prior to re-creation. This can provide some performance benefits over dropping the existing index first and then creating it. Clustered indexes see the most benefit.
ONLINE = {ON | OFF}                    Determines whether the index is built such that the underlying table is still available for queries and data modification during the index creation. This new feature is discussed in more detail in the “Online Indexing Operations” section, later in this chapter.
ALLOW_ROW_LOCKS = {ON | OFF}           Determines whether row locks are allowed when accessing the index. The default for this new feature is ON.
ALLOW_PAGE_LOCKS = {ON | OFF}          Determines whether page locks are allowed when accessing the index. The default for this new feature is ON.
MAXDOP = number_of_processors          Determines the number of processors that can be used during index operations. The default for this new feature is 0, which causes an index operation to use the actual number of processors or fewer, depending on the workload on the system. This can be a useful option for index operations on large tables that may impact performance during the operation. For example, if you have four processors, you can specify MAXDOP = 2 to limit the index operation to use only two of the four processors.
DATA_COMPRESSION = { NONE | ROW | PAGE } [ ON PARTITIONS ( { <partition_number_expression> | <range> } [ ,...n ] ) ]
                                       Determines whether data compression is used on the specified index. The compression can be done on the row or page level, and specific index partitions can be compressed if the index uses partitioning.
The following example creates a more complex index that utilizes several of the index
options described in Table 25.2:
CREATE NONCLUSTERED INDEX [IX_Person_LastName_FirstName_MiddleName]
ON [Person].[Person]
(
[LastName] ASC,
[FirstName] ASC,
[MiddleName] ASC
)
WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,
FILLFACTOR = 80)
This example creates a nonclustered composite index on the person’s last name
(LastName), first name (FirstName), and middle name (MiddleName). It utilizes some of the
commonly used options and demonstrates how multiple options can be used in a single
CREATE INDEX statement.
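The effect of an individual option is often easiest to see in isolation. The following sketch illustrates IGNORE_DUP_KEY = ON from Table 25.2, using a hypothetical temporary table (#OptionDemo) and index name that are not part of any sample database. A single multirow INSERT contains a duplicate key value:

```sql
-- Hypothetical scratch table and index names, used only for illustration.
CREATE TABLE #OptionDemo (Code int NOT NULL);

CREATE UNIQUE NONCLUSTERED INDEX UX_OptionDemo_Code
ON #OptionDemo (Code)
WITH (IGNORE_DUP_KEY = ON);

-- The value 2 appears twice; with IGNORE_DUP_KEY = ON the duplicate row is
-- discarded with a warning and the remaining rows are inserted.
INSERT INTO #OptionDemo (Code) VALUES (1), (2), (2), (3);

-- Returns three rows: 1, 2, and 3.
SELECT Code FROM #OptionDemo ORDER BY Code;

DROP TABLE #OptionDemo;
```

With IGNORE_DUP_KEY = OFF (the default), the same INSERT would fail as a whole and the table would remain empty.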
TIP
SQL Server Management Studio (SSMS) has several methods for generating the T-SQL
code that creates indexes. You therefore rarely need to type index CREATE statements
from scratch. Instead, you can use the friendly GUI screens that enable you to specify
the common index options, and then you can generate the T-SQL script that can be
executed to create the index.
Additional syntax options (not listed here) relate to backward compatibility and the
creation of indexes on XML columns. Refer to Chapter 47, “Using XML in SQL Server
2008,” and the SQL Server Books Online documentation for further details.
Creating Indexes with SSMS
SQL Server 2008 has many options for creating indexes within SSMS. You can create
indexes within SSMS via the Database Engine Tuning Advisor, database diagrams, the
Table Designer, and several places within the Object Explorer. The means available from
the Object Explorer are the simplest to use and are the focus of this section. The other
options are discussed in more detail in related chapters of this book.
Index creation in the Object Explorer is facilitated by the New Index screen. You can
launch this screen from SSMS by expanding the database tree in the Object Explorer and
navigating to the Indexes node of the table that you want to add the index to. Then you
right-click the Indexes node and select New Index. A screen like the one shown in Figure
25.3 is displayed.
The name and options that are populated in Figure 25.3 are based on the person index
created in the previous T-SQL section. The LastName, FirstName, and MiddleName columns
were selected and added as part of this new index by clicking the Add button, which
displays a screen with all the columns in the table that are available for the index. You
simply select the column(s) you want to include in the index. This populates the Index
Key Columns grid on the default General page.
You can select other options for an index by changing the Select a Page options available
on the top-left side of the New Index screen. The Options, Included Columns, Storage,
Spatial, and Filter pages each provide a series of options that relate to the corresponding
category and are utilized when creating the index.
Of particular interest is the Included Columns page. This page allows you to select
columns that you want to include in the leaf-level pages of the index but don’t need as
part of the index key. For example, you could consider using included columns if you
have a critical query that often selects last name, first name, and address from a table but
uses only the last name and first name as search arguments in the WHERE clause. This may
be a situation in which you would want to consider the use of a covering index that places
all the referenced columns from the query into a nonclustered index. In the case of our
critical query, the address column can be added to the index as an included column. It is
not included in the index key, but it is available in the leaf-level pages of the index so that
the query does not have to go to the underlying data pages to retrieve it.
FIGURE 25.3 Using Object Explorer to create indexes
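The included-columns scenario described above can also be expressed in T-SQL with the INCLUDE argument. The following is a sketch only: the dbo.CustomerDemo table, its columns, and the index name are hypothetical and were invented for this illustration.

```sql
-- Hypothetical table, column, and index names, used only for illustration.
CREATE TABLE dbo.CustomerDemo
(
    CustomerID  int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    LastName    nvarchar(50)  NOT NULL,
    FirstName   nvarchar(50)  NOT NULL,
    AddressLine nvarchar(100) NULL
);

-- LastName and FirstName form the index key;
-- AddressLine is stored only in the leaf-level pages.
CREATE NONCLUSTERED INDEX IX_CustomerDemo_LastName_FirstName
ON dbo.CustomerDemo (LastName, FirstName)
INCLUDE (AddressLine);

-- Every column this query references is present in the index,
-- so the index alone can satisfy (cover) the query.
SELECT LastName, FirstName, AddressLine
FROM dbo.CustomerDemo
WHERE LastName = N'Smith'
  AND FirstName = N'Anna';
```

Keeping AddressLine out of the key keeps the non-leaf levels of the index smaller than they would be if it were added as a third key column.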
The Spatial and Filter option pages are new to SQL Server 2008. The Spatial page can be
used to create spatial indexes on a column that is defined as a spatial data type; that is,
either type geometry or geography. If your table contains a column of one of these data types, you
can use the Index Type drop-down to change the index type to Spatial. After this is done,
you can add a column that is defined as a spatial data type to the index. Finally, you can
select the Spatial option page, shown in Figure 25.4, which allows you to fully define a
spatial index. The meaning of the parameters on this page is beyond the scope of this
chapter and is discussed in more detail in Chapter 34.
The Filter option page allows you to define a filtering criterion to limit the rows that are
included in the index. The page, shown in Figure 25.5, is relatively simple, with a single
input area that contains your filtering criterion. This criterion is basically the contents of a
WHERE clause that is similar to what you would use in a query window to filter the rows in
your result. The filter expression shown in Figure 25.5 was defined for an index on the
PersonType column, which is found in the Person.Person table of the
AdventureWorks2008 sample database. Many of the rows in this table have a PersonType
value equal to 'IN', so a filtered index that does not include rows with this value will
dramatically reduce the size of the index and make searches on values other than 'IN'
relatively fast.
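A filtered index of this kind can also be created in T-SQL with the WHERE argument of CREATE INDEX. The following is a sketch, assuming the AdventureWorks2008 Person.Person table; the index name is illustrative, and the predicate is one plausible way to express the filter described above rather than a copy of the expression in Figure 25.5:

```sql
-- Illustrative index name; assumes the AdventureWorks2008 sample database.
-- Rows with PersonType = 'IN' are left out of the index entirely.
CREATE NONCLUSTERED INDEX IX_Person_PersonType_Filtered
ON Person.Person (PersonType)
WHERE PersonType <> N'IN';

-- A search on a value other than 'IN' is a candidate for the filtered index.
SELECT BusinessEntityID, PersonType
FROM Person.Person
WHERE PersonType = N'EM';
```

Queries that search for 'IN' itself cannot use this index, because those rows are not stored in it.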
After selecting all the options you want for your index via the New Index screen, you have
several options for actually creating the index. You can script the index, schedule the
index creation for a later time, or simply click OK to allow the New Index screen to add
the index.
FIGURE 25.4 Spatial Index options page
FIGURE 25.5 Filter Index options page
You can use the New Index screen to specify the index options, and then you can click the Script button to generate
all the T-SQL statements needed to create the index. You can then save this script to a file
to be used for generating a database build script or for maintaining a record of the indexes
defined in a database.
Managing Indexes
There are two different aspects to index management. The first aspect is the management
of indexes by the SQL Server database engine. Fortunately, the engine does a good job of
managing the indexes internally so that limited manual intervention is required. This is
predicated on a well-designed database system and the use of SQL Server features, such as
automatic updates to distribution statistics.
The other aspect of index management typically comes into play when performance issues
arise. Index adjustments and maintenance of these indexes make up the bulk of this effort.
Managing Indexes with T-SQL
One of the T-SQL features available with SQL Server 2008 is the ALTER INDEX statement.
This statement simplifies many of the tasks associated with managing indexes. Index
operations such as index rebuilds and changes to fill factor that were previously handled with
DBCC commands are now available via the ALTER INDEX statement. The basic syntax for
ALTER INDEX is as follows:
ALTER INDEX {index_name | ALL}
ON [{database_name.[schema_name]. | schema_name.}]
{table_or_view_name}
{ REBUILD [WITH(<rebuild_index_option>[,...n])]
| REORGANIZE [ WITH( LOB_COMPACTION = {ON | OFF})]
| DISABLE
| SET (<set_index_option>[,...n]) }
Let’s look at a few examples that demonstrate the power of the ALTER INDEX statement.
The first example simply rebuilds the primary key index on the Production.Product table:
ALTER INDEX [PK_Product_ProductID] ON [Production].[Product] REBUILD
This offline operation is equivalent to the DBCC DBREINDEX command. The specified index
is dropped and re-created, removing all fragmentation from the index pages. This is done
dynamically, without the need to drop and re-create constraints that reference any of the
affected indexes. If it is run on a clustered index, the data pages of the table are
defragmented as well. If you specify the ALL option for the ALTER INDEX command, all indexes
as well as the data pages of the table (if the table has a clustered index) are defragmented.
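The other branches of the ALTER INDEX syntax follow the same pattern. The following sketch reuses the PK_Product_ProductID index on Production.Product from the example above and assumes the AdventureWorks2008 sample database. The option values were chosen only for illustration:

```sql
-- Rebuild a single index and apply a new fill factor at the same time.
ALTER INDEX PK_Product_ProductID ON Production.Product
REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON);

-- Defragment the leaf level in place instead of rebuilding the index.
ALTER INDEX PK_Product_ProductID ON Production.Product
REORGANIZE WITH (LOB_COMPACTION = ON);

-- Rebuild every index on the table with one statement.
ALTER INDEX ALL ON Production.Product
REBUILD;

-- Change an index option without rebuilding the index.
ALTER INDEX PK_Product_ProductID ON Production.Product
SET (ALLOW_PAGE_LOCKS = ON);
```

REORGANIZE is generally the lighter-weight choice for modest fragmentation, and REBUILD is the more thorough one. The statements are independent of one another and do not need to be run as a sequence.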