Microsoft SQL Server 2008 Learning Guide, part 146


Part IX Performance Tuning and Optimization

10. In the background, when a checkpoint occurs (a SQL Server internal event) or the lazy writer runs, SQL Server writes any dirty (modified) data pages to the data file. It tries to find sequential pages to improve the performance of the write. Even though I've listed it here as step 10, this can happen at nearly any point during the transaction or after it, depending on the amount of data being changed and the memory pressure on the system. SQL Server receives a "write complete" message from Windows.

11. At the conclusion of the background write operation, SQL Server marks the oldest open transaction in the transaction log. All older, committed transactions have been confirmed in the data file and are now confirmed in the transaction log. The DBCC OPENTRAN command reports the oldest open transaction (a brief example follows this sequence).

Transaction complete

The sequence comes full circle and returns the database to a consistent state.

12. The database finishes in a consistent state.
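A minimal illustration of the command mentioned in step 11; the database name is hypothetical:

-- Report the oldest active transaction in the named database.
DBCC OPENTRAN ('AdventureWorks2008');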

Transaction-log rollback

If the transaction is rolled back, the DML operations are reversed in memory, and a transaction-abort entry is made in the log. More often than not, the time taken to perform a rollback will be greater than the time taken to make the changes in the first place.

Transaction log recovery

The primary benefit of a write-ahead transaction log is that it maintains the atomic transactional property in the case of system failure.

If SQL Server should cease functioning (perhaps due to a power failure or physical disaster), the transaction log is automatically examined once it recovers, as follows:

■ If any entries are in the log as DML operations but are not committed, they are rolled back.

■ To test this feature you must be brave. Begin a transaction and shut down the SQL Server service before issuing a COMMIT TRANSACTION (using the Services applet). This does a shutdown with nowait. Simply closing Query Analyzer won't do it; Query Analyzer will request permission to commit the pending transactions and will roll back the transactions if permission isn't given. If SQL Server is shut down normally (this varies greatly, as there are many ways to stop, some of which gracefully shut down, while others don't), it will wait for any pending tasks to complete before stopping. (A minimal script for this test follows this list.)

■ If you have followed the steps outlined previously and you disable the system just before step 7, the transaction log entries will be identical to those shown later (refer to Figure 66-10).

■ Start SQL Server, and it will recover from the crash very nicely and roll back the unfinished transaction. This can be seen in the SQL Server ErrorLog.

■ If any entries are in the log as DML operations and committed but not marked as written to the data file, they are written to the data file. This feature is nearly impossible to demonstrate.
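A minimal sketch of that test in T-SQL, assuming a throwaway table named dbo.TranTest and permission to stop the instance; the table and column names are hypothetical:

BEGIN TRANSACTION;
UPDATE dbo.TranTest SET Col1 = 'changed' WHERE ID = 1;  -- leave the transaction open
-- Do not COMMIT. Stop the instance abruptly instead:
SHUTDOWN WITH NOWAIT;
-- After the service restarts, the SQL Server ErrorLog shows the database
-- being recovered and the uncommitted transaction rolled back.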

Transaction Performance Strategies

Transaction integrity theory can seem daunting at first, and SQL Server has numerous tools to control transaction isolation. If the database is low usage or primarily read-only, transaction locking and blocking won't be a problem. However, for heavy-usage OLTP databases, you'll want to apply the theory and working knowledge from this chapter using these strategies. Also, if you are mixing reporting and OLTP systems, you will face large blocking issues, as reporting systems generally place locks at the page or table level, which isn't good for your OLTP system that wants row-level locks. Because locking and blocking comprise the fourth optimization strategy, ensure that steps one through three are covered before tackling locking and blocking:

1. Begin with Smart Database Design: Start with a clean, simplified schema to reduce the number of unnecessary joins and reduce the amount of code used to shuttle data from bucket to bucket.

2. Use efficient set-based code, rather than painfully slow iterative cursors or loops. Large set-based operations can cause locking and blocking. Chapter 22, "Kill the Cursor!," explains how to break up large set-based operations into smaller batches to alleviate this problem.

3. Use a solid indexing strategy to eliminate unnecessary table scans and increase the speed of transactions.

To identify locking problems, use the Activity Monitor or SQL Profiler.
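Outside of those tools, current lock waits can also be listed with a dynamic management view; a minimal sketch, not taken from the original text:

-- Sessions currently waiting on a lock, with the resource type and requested lock mode.
SELECT request_session_id, resource_type, resource_database_id,
       request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_status = 'WAIT';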

To reduce the severity of a locking problem, do the following (a short configuration sketch follows this list):

■ Evaluate and test using the read committed snapshot isolation level. Depending on your error handling and hardware capabilities, snapshot isolation can significantly reduce concurrency contention.

■ Check the transaction isolation level and ensure that it's not any higher than required.

■ Make sure transactions begin and commit quickly. Redesign any transaction that includes a cursor that doesn't have to use a cursor. Move any code that isn't necessary to the transaction out of the transaction unless it is needed to ensure transactional consistency.

■ If two procedures are deadlocking, make sure they lock the resource in the same order.

■ Make sure client applications access the database through the data abstraction layer.

■ Consider forcing row locks with the (rowlock) hint to prevent the locks from escalating.
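A minimal sketch of the first, second, and last suggestions; the database, table, and column names are hypothetical:

-- Switch the database to read committed snapshot isolation.
-- The ALTER only completes when no other connections are active in the database.
ALTER DATABASE SalesDB
    SET READ_COMMITTED_SNAPSHOT ON;

-- Verify that the current session isn't running at a higher isolation level than required.
DBCC USEROPTIONS;  -- reports the session's isolation level, among other settings

-- Request row-level locks for a single statement to discourage lock escalation.
UPDATE od
    SET od.Quantity = od.Quantity + 1
    FROM dbo.OrderDetail AS od WITH (ROWLOCK)
    WHERE od.OrderID = 1001;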

Evaluating database concurrency performance

It's easy to build a database that doesn't exhibit lock contention and concurrency issues when tested with a handful of users. The real test occurs when several hundred users are all updating orders.

Concurrency testing requires a concerted effort. At one level, it can involve everyone available running the same front-end form concurrently. A .NET program that constantly simulates a user viewing data and updating data is also useful. A good test is to run 20 instances of a script that constantly pounds the database and then let the test crew use the application. Performance Monitor (covered in Chapter 55, "Performance Monitor") can watch the number of locks.


Best Practice

Multi-user concurrency should be tested during the development process several times. To quote the MCSE exam guide, "don't let the real test be your first test."

Summary

A transaction is a logical unit of work. Although the default SQL Server transaction isolation level works well for most applications, there are several means of manipulating and controlling the locks. To develop a serious SQL Server application, your understanding of the ACID database principles, SQL Server's transaction log, and locking will contribute to the quality, performance, and reliability of the database.

Major points from this chapter include the following:

■ Transactions must be ACID: atomic (all or nothing), consistent (before and after the transaction), isolated (not affected by another transaction), and durable (once committed, always committed).

■ SQL Server transactions are durable because of the write-ahead transaction log.

■ SQL Server transactions are isolated because of locks or snapshot isolation.

■ Using traditional transaction isolation, readers block writers, and writers block readers and other writers.

■ SQL Server offers four traditional transaction isolation levels: read uncommitted, read committed, repeatable read, and serializable. Read committed, the default transaction isolation level, is the right isolation for most OLTP databases.

■ Never ever use read uncommitted (or the NOLOCK hint).

■ Snapshot isolation means reading the before image of the transaction instead of waiting for the transaction to commit. Using snapshot isolation, readers don't block writers, and writers don't block readers; only writers block other writers.

The next chapter continues the optimization theme with one of my favorite new features — data compression. High-transaction databases always struggle with I/O performance, and data compression is the perfect solution for reducing I/O.


Data Compression

IN THIS CHAPTER

Understanding compression

Reducing I/O

Whole-database compression procedures

Compression strategies

Pushing a database into the tens of thousands of transactions per second requires massive amounts of raw I/O performance. At those rates, today's servers can supply the CPU and memory, but I/O struggles. By reducing the raw size of the data, data compression trades I/O for CPU, improving performance.

Data compression is easy — easy to enable, and easy to benefit from, so why a full chapter on data compression?

Data compression is the sleeper of the SQL Server 2008 new feature list. Like online indexing in SQL Server 2005, I believe that data compression will become the compelling reason to upgrade for many large SQL Server IT shops.

In other words, data compression doesn't warrant an entire chapter because of its complexity or length, but because of its value. Its impact is such that it deserves center stage, at least for this chapter.

Understanding Data Compression

Every IT professional is familiar with data compression, such as zip files and jpg compression, to name a couple of popular compression technologies.

But SQL Server data compression is specific to the SQL Server storage engine and has a few database-specific requirements. First, there has to be zero risk of loss of data fidelity. Second, it has to be completely transparent — enabled without any application code changes.


SQL Server data compression isn't like jpg compression, where you can choose the level of compression and more compression means more data loss. With SQL Server data compression, the data is transparently compressed by the storage engine, and every compressed data page retains every data value when decompressed.

Don’t confuse data compression with backup compression — the two technologies are completely independent.
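For contrast, backup compression is a backup-time option, independent of whether any tables use data compression; a minimal sketch with a hypothetical database name and path:

-- Compress only the backup stream, not the data pages themselves.
BACKUP DATABASE AdventureWorks2008
    TO DISK = 'D:\Backup\AdventureWorks2008.bak'
    WITH COMPRESSION;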

The following data objects may be compressed:

■ Entire heap

■ Entire clustered index

■ Entire non-clustered index

■ Entire indexed view (specifically, the materialized clustered index of an indexed view)

■ Single partition of partitioned table or index

While indexes can be compressed, they are not automatically compressed with the table's compression type. All objects, including indexes, must be individually, manually enabled for compression.

Data compression limitations:

■ Heaps or clustered indexes with sparse data may not be compressed

■ File stream data or LOB data is not compressed

■ Tables with rows that potentially exceed 8,060 bytes and use row overflow cannot be compressed

■ Data compression does not overcome the row limit. The data must always be able to be stored uncompressed.

Data compression pros and cons

Data compression offers several benefits and a few trade-offs, so while using data compression is probably a good thing, it's worth understanding the pros and cons.

The most obvious con is the financial cost. Data compression is only available with the Enterprise Edition. If you already are using Enterprise Edition, great; if not, then moving from Standard to Enterprise is a significant budget request.

Data compression uses CPU. If your server is CPU pressured, then turning on data compression will probably hurt performance. Depending on the data mix and the transaction rate, enabling data compression might slow down the application.

Not all tables and indexes compress well. In my testing, some objects will compress up to 70%, but many tables will see little compression, or even grow in size when compressed. Therefore, you shouldn't simply enable compression for every object; it takes some study and analysis to choose compression wisely.
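One tool for that analysis, not shown in this extract, is the sp_estimate_data_compression_savings system procedure that ships with SQL Server 2008; a minimal sketch with hypothetical schema and table names:

-- Estimate how much space PAGE compression would save for one table.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'Orders',
    @index_id         = NULL,   -- NULL = all indexes on the table
    @partition_number = NULL,   -- NULL = all partitions
    @data_compression = 'PAGE'; -- or 'ROW' or 'NONE'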

With these three possible drawbacks understood, there are plenty of reasons to enable data compression (assuming the data compresses well):

Trang 6

■ Data compression significantly reduces the I/O bottleneck for a high-transaction database.

■ Data compression significantly reduces the memory footprint of data, thus caching more data in memory and probably improving overall performance.

■ More rows on a page mean that scans and count(*)-type operations are faster.

■ Compressed data means SAN shadow copies are smaller.

■ Database snapshots are smaller and more efficient with data compression.

■ SANs and high-performance disks are expensive. Compressed data means less disk space is required, which means more money is left in the budget to attend a SQL Server conference in Maui.

■ Compressed data means backup duration and restore duration are reduced, and less storage space is used for backups.

There are hardware-based data compression solutions that compress data as it's written to disk. While these can reduce disk space and off-load the CPU overhead of compression, they fail to reduce the I/O load on SQL Server, or reduce the data's memory footprint within SQL Server.

There are two types, or levels, of data compression in SQL Server 2008: row level and page level. Each has a specific capability and purpose. So you can best understand how and when to employ data compression, the following sections describe how they work.

Row compression

Row compression converts the storage of every fixed-length data type column (both character and numeric data types) to a variable-length data type column. Row compression grew out of the vardecimal compression added with SQL Server 2005 SP2. Depending on the number of fixed-length columns and the actual length of the data, this level may, or may not, provide significant gain.

While you'll still see the columns as fixed length when viewing the database, under the covers the storage engine is actually writing the values as if the columns were variable length. A char(50) column is treated as if it's a varchar(50) column.

When row compression is enabled, SQL Server also uses a new variable-length row format that reduces the per-column metadata overhead from 2 bytes to 4 bits.

Row-level data compression is designed specifically for third-party databases that have several fixed-length columns but don't allow schema changes.
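Enabling row compression is a one-line rebuild; a minimal sketch with a hypothetical table name (the ALTER syntax is covered again under Applying Data Compression):

-- Rebuild the table's heap or clustered index with row compression.
ALTER TABLE dbo.Orders
    REBUILD WITH (DATA_COMPRESSION = ROW);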

Page compression

SQL Server page compression automatically includes row compression and takes compression two steps further, adding prefix compression and then dictionary compression. Page compression applies only to leaf-level pages (clustered indexes or heaps) and not to the b-tree root or intermediate pages.

Prefix compression may appear complex at first, but it's actually very simple and efficient. For prefix compression, the storage engine follows these steps for each column:


1. The storage engine examines all the values and selects the most common prefix value for the data in the column.

2. The longest actual value beginning with the prefix is then stored in the compression information (CI) structure.

3. If the prefix is present at the beginning of a data value, a number is inserted at the beginning of the value to indicate how many characters of the prefix it matches. The non-prefix portion of the value (the part to the right of the prefix) is left in place.

Prefix compression actually examines bytes, so it applies to both character and numeric data.

For example, assume the storage engine were applying prefix compression to the following data, which includes two columns, shown in Figure 67-1.

FIGURE 67-1

The sample data before page compression is enabled. The page holds two columns of raw data — Nielsen, Nelson, Nelsen, Nelsen in the first column and Paul, Joe, Joseph, Joshua in the second — and the compression information (CI) anchor row is still empty.

For the first column, the best prefix is Nels. The longest value beginning with the prefix is Nelson, so that's written to the CI structure, as shown in Figure 67-2. For the second column, the best prefix is Jos and the longest value is Joseph. The prefixes are written to an anchor row at the beginning of each page.


The values are then updated with the prefix (see Figure 67-2). The first value, Nielsen, begins with one letter of the prefix, so 1ielsen is written, which doesn't save any space. But the compression ratio is much better for values that include more of the prefix — for instance, Nelson is compressed into just the number 6 because it contains six characters of the prefix with nothing remaining. Nelsen is compressed into 4en, meaning that it begins with four letters of the prefix followed by en.

FIGURE 67-2

Prefix compression identifies the best prefix for each column and then stores the prefix character count in each row instead of the prefix characters. The compression information (CI) anchor row now holds Nelson and Joseph, and the data rows are stored as 1ielsen, 6, 4en, 4en in the first column and 0Paul, 2e, 6, 3hua in the second.

As demonstrated, depending on the commonality of the data set, prefix compression can significantly compress the data without any loss of data fidelity. In this simple example, prefix compression alone reduced the data from 42 bytes to 29 bytes, saving 30%.

Notice that in this example, one value, Paul, doesn't match the prefix at all. It's stored as 0Paul, which increases the length. If this is the case for most of the rows, and prefix compression offers no benefit for a given column, the storage engine will leave the anchor row null and not use prefix compression for that column. This is one reason why sometimes tables will actually grow when compressed.


Once the data is prefix compressed, the storage engine applies dictionary compression. Every value is scanned and any common values are replaced with a token that is stored in the compression information area of the page. Prefix compression occurs at the column level, while dictionary compression occurs across all columns at the page level.

Compression sequence

The cool thing about data compression is that it's completely handled by the storage engine and transparent to every process outside of the storage engine. This means that the data is compressed on the disk and is still compressed when it's read into memory. The storage engine decompresses the data as it's being passed from the storage engine to the query processor, as illustrated in Figure 67-3.

FIGURE 67-3

The storage engine compresses and decompresses data as it's written to and read from the buffer. (The diagram shows the relational database engine — query optimizer, query processor, and storage engine buffer — with data compression applied between the storage engine and the disk or SAN.)

If the object is row compressed, or page compressed (which automatically includes row compression), then row compression is always enabled for every page of the object. Page compression, however, is a different story:

■ The storage engine enables page compression on a page-by-page basis when there's a benefit for that page. When the storage engine creates a new page, it's initially uncompressed and remains uncompressed as rows are added to the page. Why compress a page that's only half full anyway? (A query after this list shows how to check how many pages of an object actually ended up page compressed.)

■ When the page is full but SQL Server wants to add another row to it, the storage engine tests the page for compression. If the page compresses enough to add the new rows, then the page is compressed.

■ Once the page is a compressed page, any new rows will be inserted compressed (but they won't trigger recalculation of the compression information, the prefix anchor row, or the dictionary tokens).


■ Pages might be recompressed (and the prefixes and dictionary tokens recalculated) when the row is updated, based on an algorithm that factors in the number of updates to a page, the number of rows on the page, the average row length, and the amount of space that can be saved by page compression for each page, or when the row would again need to be split.

■ Heaps are recompressed only by an index rebuild or bulk load.

■ In the case of a page split, both pages inherit the page compression information (compression status, prefixes, and dictionary tokens) of the old page.

■ During an index rebuild of an object with page compression, the point at which the page is considered full still considers the fill factor setting, so the free space is still guaranteed.

■ Row inserts, updates, and deletes are normally written to the transaction log in row compression format, but not in page compression format. An exception is when page splits are logged. Because they are a physical operation, only the page compression values are logged.
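To check how many leaf pages of an object actually ended up page compressed, the compressed_page_count column of sys.dm_db_index_physical_stats can be queried; a minimal sketch with a hypothetical table name:

-- DETAILED mode reads the leaf pages and reports how many are page compressed.
SELECT index_id, partition_number, page_count, compressed_page_count
FROM sys.dm_db_index_physical_stats(
        DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'DETAILED');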

Applying Data Compression

Although data compression is complicated, actually enabling data compression is a straightforward task using either the Data Compression Wizard or an ALTER command.

Determining the current compression setting

When working with compression, the first task is to confirm the current compression setting. Using the Management Studio UI, there are two ways to view the compression type for any single object:

■ The Table Properties or Index Properties Storage page displays the compression settings as a read-only value.

■ The Data Compression Wizard, found in Object Explorer (context menu ➪ Storage ➪ Manage Compression), opens with the current compression selected.

To see the current compression setting for every object in the database, run this query:

SELECT O.object_id, S.name AS [schema], O.name AS [Object],
       I.index_id AS Ix_id, I.name AS IxName, I.type_desc AS IxType,
       P.partition_number AS P_No, P.data_compression_desc AS Compression
  FROM sys.schemas AS S
    JOIN sys.objects AS O
      ON S.schema_id = O.schema_id
    JOIN sys.indexes AS I
      ON O.object_id = I.object_id
    JOIN sys.partitions AS P
      ON I.object_id = P.object_id
      AND I.index_id = P.index_id
  WHERE O.type = 'U'
  ORDER BY S.name, O.name, I.index_id;
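The ALTER form mentioned earlier is not shown in this extract; a minimal sketch, assuming a table dbo.Orders with a nonclustered index IX_Orders_CustomerID (both names are hypothetical):

-- Page-compress the table (its heap or clustered index).
ALTER TABLE dbo.Orders
    REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Indexes are not compressed automatically with the table; rebuild each one individually.
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
    REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Rebuilding with DATA_COMPRESSION = NONE removes compression again.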
