CHAPTER 38 Database Design and Performance
FIGURE 38.2 Vertical partitioning of data: the Authors table (Au_id, Au_lname, Au_fname, SocialSec#, BirthDate, Homephone, Workphone, Cellphone, Addr1, Addr2, City, Zip, State) is split into Author_primary (Au_id, Au_lname, Au_fname, SocialSec#) and Author_secondary (Au_id, BirthDate, Homephone, Workphone, Cellphone, Addr1, Addr2, City, Zip, State).
which in turn reduces the number of I/Os on the table. Vertical splitting is a method of reducing the width of a table by splitting the columns of the table into multiple tables. Usually, all frequently used columns are kept in one table, and the others are kept in a second table. This way, more records can be accommodated per page, fewer I/Os are generated, and more data can be cached in SQL Server memory. Figure 38.2 illustrates a vertically partitioned table. The frequently accessed columns of the authors table are stored in the author_primary table, whereas the less frequently used columns are stored in the author_secondary table.
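A sketch of how the split in Figure 38.2 might be declared (the column names come from the figure; the data types are assumptions, since the original table definitions are not shown in the text):

```sql
-- Hypothetical DDL for the vertical split in Figure 38.2.
-- Column names are from the figure; data types are assumptions.
create table author_primary (
    au_id      varchar(11) not null primary key,
    au_lname   varchar(40) not null,
    au_fname   varchar(20) not null,
    socialsec# char(11)    null
)

create table author_secondary (
    au_id     varchar(11) not null primary key
              references author_primary (au_id),
    birthdate date        null,
    homephone char(12)    null,
    workphone char(12)    null,
    cellphone char(12)    null,
    addr1     varchar(40) null,
    addr2     varchar(40) null,
    city      varchar(20) null,
    zip       char(5)     null,
    state     char(2)     null
)
```

Because both tables share au_id as their primary key, a one-to-one join can reassemble the full author row whenever all columns are needed.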
TIP
Make the decision to split data very carefully, especially when the system is already in production. Changing the data structure might have a system-wide impact on a large number of queries that reference the old definition of the object. In such cases, to minimize risks, you might want to use SQL Server views to hide the vertical partitioning of data. Also, if you find that users and developers are frequently joining between the vertically split tables because they need to pull data together from the two tables, you might want to reconsider the split point or the splitting of the table itself. Doing frequent joins between split tables with smaller rows requires more I/Os to retrieve the same data than if the data resided in a single table with wider rows.
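As the tip suggests, a view can hide the split from existing queries. A sketch, assuming the author_primary and author_secondary tables from Figure 38.2:

```sql
-- Hypothetical view presenting the two split tables under the
-- original authors name, so existing queries are unaffected.
create view authors
as
select p.au_id, p.au_lname, p.au_fname, p.socialsec#,
       s.birthdate, s.homephone, s.workphone, s.cellphone,
       s.addr1, s.addr2, s.city, s.zip, s.state
from author_primary as p
join author_secondary as s
  on s.au_id = p.au_id
```

Queries that select only the frequently used columns through the view still pay the cost of the join, which is one reason to choose the split point carefully.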
Performance Implications of Zero-to-One Relationships
Suppose that one of the development managers in your company, Bob, approaches you to discuss some database schema changes. He is one of several managers whose groups all use the central User table in your database. Bob's application makes use of about 5% of the users in the User table. Bob has a requirement to track five yes/no/undecided flags associated with those users. He would like you to add five one-character columns to the User table to track this information. What do you tell Bob?
Bob has a classic zero-to-one problem. He has some data he needs to track, but it applies to only a small subset of the data in the table. You can approach this problem in one of three ways:
Option 1: Add the columns to the User table—In this case, 95% of your users will have NULL values in those columns, and the table will become wider for everybody.
Option 2: Create a new table with a vertical partition of the User table—The new table will contain the User primary key and Bob's five flags. In this case, 95% of your users will still have NULL data in the new table, but the User table is protected against these effects. Because other groups don't need to use the new partition table, this is a nice compromise.
Option 3: Create a new vertically partitioned table as in Option 2, but populate it only with rows that have at least one non-NULL value for the columns in the new partition—This option is great for database performance, and searches in the new table will be wonderfully fast. The only drawback to this approach is that Bob's developers will have to add additional logic to their applications to determine whether a row exists during updates. Bob's folks will need to use an outer join to the table to cover the possibility that a row doesn't exist.
Depending on the goals of the project, any one of these options can be appropriate. Option 1 is simple and is the easiest to code for and understand. Option 2 is a good compromise between performance and simplicity. Option 3 gives the best performance in certain circumstances but impacts performance in certain other situations and definitely requires more coding work.
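Under Option 3, the outer join the text describes might look like the following (the UserFlags table and the column names are assumptions for illustration):

```sql
-- Hypothetical Option 3 query: UserFlags holds rows only for the
-- ~5% of users Bob tracks, so a left outer join is needed to return
-- every user, with NULL flags where no UserFlags row exists.
select u.UserId,
       u.UserName,
       f.Flag1, f.Flag2, f.Flag3, f.Flag4, f.Flag5
from [User] as u
left outer join UserFlags as f
  on f.UserId = u.UserId
```

An update from Bob's application must similarly check whether the UserFlags row exists and insert it if it does not, which is the extra coding work Option 3 requires.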
Database Filegroups and Performance
Filegroups allow you to decide where on disk a particular object should be placed. You can do this by defining a filegroup within a database, extending the database onto a different drive or set of drives, and then placing a database object on the new filegroup.
Every database, by default, has a primary filegroup that contains the primary data file. There can be only one primary filegroup. This primary filegroup contains all the pages assigned to system tables. It also contains any additional database files created without specifying a filegroup. Initially, the primary filegroup is also the default filegroup. There can be only one default filegroup, and indexes and tables that are created without specifying a filegroup are placed in the default filegroup. You can change the default filegroup to another filegroup after the database has been created.
In addition to the primary filegroup, you can add one or more user-defined filegroups to the database. Each of those filegroups can contain one or more files. The main purpose of using filegroups is to provide more control over the placement of files and data on the server. When you create a table or an index, you can map it to a specific filegroup, thus controlling the placement of data. A typical SQL Server database installation generally uses a single RAID array to spread I/O across disks and creates all files in the primary filegroup; more advanced installations, or installations with very large databases spread across multiple array sets, can benefit from the finer level of control over file and data placement afforded by additional filegroups.
For example, for a simple database such as AdventureWorks2008, you can create just one primary file that contains all data and objects and a log file that contains the transaction log information. For a larger and more complex database, such as a securities trading system, where large data volumes and strict performance criteria are the norm, you might create the database with one primary file and four secondary files. You can then set up filegroups so you can place the data and objects within the database across all five files. If you have a table that itself needs to be spread across multiple disk arrays for performance reasons, you can place multiple files in a filegroup, each of which resides on a different disk, and create the table on that filegroup. For example, you can create three files (Data1.ndf, Data2.ndf, and Data3.ndf) on three disk arrays and then assign them to a filegroup called spread_group. Your table can then be created specifically on the spread_group filegroup. Queries for data from the table are then spread across the three disk arrays, thereby improving I/O performance.
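The spread_group example might be set up as follows (the database name, table definition, drive letters, and file sizes are assumptions; the syntax mirrors the FG_INDEX example shown later in this section):

```sql
-- Hypothetical setup: three files on three disk arrays, all assigned
-- to the user-defined filegroup spread_group.
alter database Trading
add filegroup spread_group

alter database Trading
add file
    (NAME = Data1, FILENAME = 'e:\Data1.ndf', SIZE = 1024MB),
    (NAME = Data2, FILENAME = 'f:\Data2.ndf', SIZE = 1024MB),
    (NAME = Data3, FILENAME = 'g:\Data3.ndf', SIZE = 1024MB)
to filegroup spread_group

-- Creating the table on the filegroup spreads its data across the files
create table TradeHistory (
    TradeID int      not null,
    TradeDT datetime not null
) on spread_group
```

SQL Server fills the files in a filegroup using a proportional-fill algorithm, so data inserted into TradeHistory is distributed across all three files, and therefore across the three arrays.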
Filegroups are most often used in high-performance environments to isolate key tables or indexes on their own set of disks, which are in turn typically part of a high-performance RAID array. Assuming that you start with a database that has just a PRIMARY filegroup (the default), the following example shows how you would add an index filegroup on a new drive and move some nonclustered indexes to it:
-- add the filegroup
alter database Grocer
add filegroup FG_INDEX

-- create a new database file and add it to the FG_INDEX filegroup
alter database Grocer
add file (
    NAME = Grocer_Index,
    FILENAME = 'g:\Grocer_Index.ndf',
    SIZE = 2048MB,
    MAXSIZE = 8192MB,
    FILEGROWTH = 10%
) to filegroup FG_INDEX

-- create (or re-create, with DROP_EXISTING = ON, if the index
-- already exists) the nonclustered index on FG_INDEX
create nonclustered index xOrderDetail_ScanDT
on OrderDetail(ScanDT)
on FG_INDEX
Moving the indexes to a separate RAID array minimizes I/O contention by spreading out the I/O generated by updates to the data that affect data rows and require changes to index rows as well.
NOTE
Because the leaf level of a clustered index is the data page, if you create a clustered index on a filegroup, the entire table moves from the existing filegroup to the new filegroup. If you want to put indexes on a separate filegroup, you should reserve that space for nonclustered indexes only.
Having your indexes on a separate filegroup gives you the following advantages:
Index scans and index page reads come from a separate disk, so they need not compete with other database processes for disk time.
Inserts, updates, and deletes on the table are spread across two separate disk arrays.
The clustered index, including all the table data, is on a separate array from the nonclustered indexes.
You can target your budget dollars more precisely because faster disks improve system performance more if they are given to the index filegroup rather than the database as a whole.
The next section gives specific recommendations on how to architect a hardware solution based on using separate filegroups for data and indexes.
RAID Technology
Redundant array of inexpensive disks (RAID) is used to configure a disk subsystem to provide better performance and fault tolerance for an application. The basic idea behind using RAID is that you spread data across multiple disk drives so that I/Os are spread across those drives. RAID has special significance for database-related applications, where you want to spread random I/Os (data changes) and sequential I/Os (for the transaction log) across different disk subsystems to minimize disk head movement and maximize I/O performance.
The four levels of RAID implementation that are of most interest in database implementations are as follows:
RAID 0 is data striping with no redundancy or fault tolerance.
RAID 1 is mirroring, where every disk in the array has a mirror (copy).
RAID 5 is striping with parity, where parity information for data on one disk is spread across the other disks in the array. The contents of a single disk can be re-created from the parity information stored on the other disks in the array.
RAID 10, or 1+0, is a combination of RAID 1 and RAID 0. Data is striped across all drives in the array, and each disk has a mirrored duplicate, offering the fault tolerance of RAID 1 with the performance advantages of RAID 0.
RAID Level 0
RAID Level 0 provides the best I/O performance of all the RAID levels. A file has sequential segments striped across each drive in the array. Data is written in a round-robin fashion to ensure that data is evenly balanced across all drives in the array. However, if a media failure occurs, no fault tolerance is provided, and all data stored in the array is lost. RAID 0 should not be used for a production database where data loss or loss of system availability is unacceptable. RAID 0 is occasionally used for tempdb to provide the best possible read and (especially) write performance. RAID 0 is helpful for random read requirements, such as those that occur on tempdb and in data segments.
TIP
Although the data stored in tempdb is temporary and noncritical, failure of a RAID 0 stripe set containing tempdb results in loss of system availability because SQL Server requires a functioning tempdb to carry out many of its activities. If loss of system availability is not an option, you should not put tempdb on a RAID 0 array; you should use one of the RAID technologies that provides redundancy.
If momentary loss of system availability is acceptable in exchange for the improved I/O and reduced cost of RAID 0, recovery of tempdb is relatively simple. The tempdb database is re-created each time the SQL Server instance is restarted. If the disk that contained your tempdb were lost, you could replace the failed disk and restart SQL Server, and the files would automatically be re-created. This scenario is complicated if the failed disk with the tempdb file also contains your master database or other system databases. See Chapter 14, “Database Backup and Restore,” for a more detailed discussion of restoring system databases.
RAID 0 is the least expensive of the RAID configurations because 100% of the disks in the array are available for data, and none are used to provide fault tolerance. Performance is also the best of the RAID configurations because there is no overhead required to maintain redundant data.
Figure 38.3 depicts a RAID 0 disk array configuration.
RAID Level 1
With RAID 1, known as disk mirroring, every write to the primary disk is also written to the mirror. Either member of the set can satisfy a read request. RAID 1 devices provide excellent fault tolerance because in the event of a media failure, either on the primary disk or the mirrored disk, the system can still continue to run. Writes are much faster than with RAID 5 arrays because no parity information needs to be calculated first. The data is simply written twice.
RAID 1 arrays are best for transaction logs and index filegroups. RAID 1 provides the best fault tolerance and best write performance, which is critical to log and index performance. Because log writes are sequential write operations and not random access operations, they are best supported by a RAID 1 configuration.
RAID 1 arrays are the most expensive RAID configuration because only 50% of the total disk space is available for actual storage. The rest is used to provide fault tolerance.
Figure 38.4 shows a RAID 1 configuration.
FIGURE 38.3 RAID Level 0 (an array controller striping data blocks D1 through D12 round-robin across the disks in the array)
FIGURE 38.4 RAID Level 1 (an array controller writing each of four primary disks to a mirrored duplicate disk)
Because RAID 1 requires that the same data be written to two drives at the same time, write performance is slightly less than when writing data to a single drive because the write is not considered complete until both writes have been done. Using a disk controller with a battery-backed write cache can mitigate this write penalty because the write is considered complete when it reaches the battery-backed cache. The actual writes to the disks occur in the background.
RAID 1 read performance is often better than that of a single disk drive because most controllers now support split seeks. Split seeks allow each disk in the mirror set to be read independently of the other, thereby supporting concurrent reads.
RAID Level 10
RAID 10, or RAID 1+0, is a combination of mirroring and striping. It is implemented as a stripe of mirrored drives: the drives are mirrored first, and then a stripe is created across the mirrors to improve performance. This should not be confused with RAID 0+1, which is different and is implemented by first striping the disks and then mirroring the stripes.
Many businesses with high-volume OLTP applications opt for RAID 10 configurations. The shrinking cost of disk drives and the heavy database demands of today's business applications are making this a much more viable option. If you find that your transaction log or index segment is pegging your RAID 1 array at 100% usage, you can implement a RAID 10 array to get better performance. This type of RAID carries with it all the fault tolerance (and cost!) of a RAID 1 array, with all the performance benefits of RAID 0 striping.
RAID Level 5
RAID 5 is most commonly known as striping with parity. In this configuration, data is striped across multiple disks in large blocks. At the same time, parity bits are written across all the disks for a given block. Information is always stored in such a way that any one disk can be lost without any information in the array being lost. In the event of a disk failure, the system can still continue to run (at a reduced performance level) without downtime by using the parity information to reconstruct the data lost on the missing drive.
Some arrays provide "hot-standby" disks. The RAID controller uses the standby disk to rebuild a failed drive automatically, using the parity information stored on all the other drives in the array. During the rebuild process, performance is markedly worse.
The fault tolerance of RAID 5 is usually sufficient, but if more than one drive in the array fails, you lose the entire array. It is recommended that a spare drive be kept on hand in the event of a drive failure so that a failed drive can be replaced quickly before any other drives fail.
NOTE
Many of the RAID solutions available today support "hot-spare" drives. A hot-spare drive is connected to the array but doesn't store any data. When the RAID system detects a drive failure, the contents of the failed drive are re-created on the hot-spare drive, which is automatically swapped into the array in place of the failed drive. The failed drive can then be manually removed from the array and replaced with a working drive, which becomes the new hot spare.
RAID 5 provides excellent read performance but expensive write performance. A write operation on a RAID 5 array requires two writes: one to the data drive and one to the parity drive. After the writes are complete, the controller reads the data to ensure that the information matches (that is, that no hardware failure has occurred). A single write operation thus causes four I/Os on a RAID 5 array. For this reason, putting log files or tempdb on a RAID 5 array is not recommended. Index filegroups, which suffer more than data filegroups from poor write performance, are also poor candidates for RAID 5 arrays. Data filegroups where more than 10% of the I/Os are writes are also not good candidates for RAID 5 arrays.
Note that if write performance is not an issue in your environment—for example, in a DSS/data warehousing environment—you should, by all means, use RAID 5 for your data and index segments.
In any environment, you should avoid putting tempdb on a RAID 5 array. tempdb typically receives heavy write activity, and it performs better on a RAID 1 or RAID 0 array.
RAID 5 is a relatively economical means of providing fault tolerance. No matter how many drives are in the array, only the space equivalent to a single drive is used to support fault tolerance. This method becomes more economical with more drives in the array. You must have at least three drives in a RAID 5 array. Three drives would require that 33% of the available disk space be used for fault tolerance, four would require 25%, five would require 20%, and so on.
Figure 38.5 shows a RAID 5 configuration.
FIGURE 38.5 RAID Level 5 (an array controller distributing data blocks and rotating parity blocks across the disks in the array)
NOTE
Although the recommendations for using the various RAID levels presented here can help ensure that your database performance will be optimal, reality often dictates that your optimum disk configuration might not be available. You may be given a server with a single RAID 5 array and told to make it work. Although RAID 5 is not optimal for tempdb or transaction logs, the write penalty can be mitigated by using a controller with a battery-backed write cache.
If possible, you should also try to stripe database activity across multiple RAID 5 arrays rather than a single large RAID 5 array to avoid overdriving the disks in the array.
SQL Server and SAN Technology
With the increased use of storage area networks (SANs) in SQL Server environments, it is important to understand the design and performance implications of implementing SQL Server databases on SANs. SANs are becoming increasingly common in SQL Server environments these days for a number of reasons:
Increasing database sizes
The increasing prevalence of clustered environments
The performance advantages and storage efficiencies and flexibilities of SANs
The increasing need for recoverability and disaster recovery
Simplified disk administration
In large enterprises, a SAN can be used to connect multiple servers to a centralized pool of disk storage. Compared to managing hundreds of servers, each with its own separate disk arrays, SANs help simplify disk administration by treating all the company's storage as a single resource. Disk allocation, maintenance, and routine backups are easier to manage, schedule, and control. In some SANs, the disks themselves can copy data to other disks for backup without any processing overhead at the host computers.
What Is a SAN?
A SAN contains multiple high-performance hard drives coupled with high-performance caching controllers. The hard drives are often arranged in various RAID configurations. These drive configurations are virtualized so that the consumer does not know which hard drives a SQL Server instance or other device connected to the SAN will access. Essentially, the SAN presents blocks of storage to servers; a block can consist of a single hard drive, multiple hard drives, or portions of hard drives, addressed as a logical unit identified by a Logical Unit Number (LUN). Connection to a SAN is typically through Fibre Channel, a high-speed optical network.
SANs can provide advantages over locally attached storage. Most SANs provide features that allow you to clone, snapshot, or rapidly move (replicate) data from one location to another, much faster than file copies or data transfers over your network. This increases the usefulness of SANs for disaster recovery. SANs also provide a shared disk resource for building server clusters, even allowing a cluster or server to boot off a SAN.
Another reason for the increased use of SANs is that they offer better utilization of storage. With locally attached storage, large amounts of disk space can end up being wasted. With a SAN, you can expand or contract the amount of disk space allocated to a server or cluster as needed.
Due to their cost and complexity, however, SANs are not for everybody; they really make sense only in large enterprises. They are not a good choice for small environments with relatively small databases, for companies with limited budgets (SANs are expensive), or for companies that require disaster recovery on only one or a few SQL Servers.
SAN Considerations for SQL Server
Before you rush out and purchase a SAN or two for your SQL Server environments, there are some considerations to keep in mind when using SANs with SQL Server.