Use the following T-SQL statement to load the data from the data file into the new table:
BULK INSERT AdventureWorks2008.Person.PersonCopy FROM 'C:\bcp\Person.tsv'
WITH (DATAFILETYPE='widechar');
After running the statement, however, you get a simple report back from SQL Server:
(19972 row(s) affected)
Of course, you could use the same format files (either traditional or XML) that we discussed earlier; a short sketch follows below. So as you can see, the BULK INSERT statement is very similar in functionality to the BCP command-line utility.
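For instance, BULK INSERT can consume a format file through its FORMATFILE option. The following is only a sketch; the format file name and path are placeholders rather than a file created earlier in the chapter:
-- The format file describes the layout of the data file,
-- so DATAFILETYPE is not needed here.
BULK INSERT AdventureWorks2008.Person.PersonCopy
FROM 'C:\bcp\Person.tsv'
WITH (FORMATFILE = 'C:\bcp\Person.xml');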
From the previous two sections, you should have a pretty good idea about the mechanics of bulk inserting data. You may be wondering what all the parameters we haven't discussed are for. Mostly, they have to do with performance. In the next two sections, we'll discuss a few pointers on maximizing the performance of your bulk loads. We'll start by looking at how the transaction log is used during bulk operations. But first, get your hands dirty and try a BULK INSERT.
EXERCISE 8.2
Using BULK INSERT
In this exercise, you will import the data file that you created previously in Exercise 8.1 back into SQL Server. This exercise assumes that you have administrative privileges on the SQL Server instance you are working with, that you have the AdventureWorks2008 sample database installed on your SQL Server instance, and that you are running the exercise from the same computer where the SQL Server instance is installed.
1. Launch SQL Server Management Studio and open a new query window in the AdventureWorks2008 database.
2. Create the target table by running the following T-SQL statement:
SELECT TOP 0 * INTO AdventureWorks2008.Person.PersonCopy FROM AdventureWorks2008.Person.Person;
3. Use the following T-SQL statement to load the data from the data file into the new table:
BULK INSERT AdventureWorks2008.Person.PersonCopy FROM 'C:\bcp\Person.tsv'
WITH (DATAFILETYPE='widechar');
4. Run the following query to view the imported data:
SELECT * FROM AdventureWorks2008.Person.PersonCopy;
Recovery Model and Bulk Operations
Every SQL Server database has an option that determines its recovery model. The recovery model of the database determines how the transaction log can be used for backups, and how much detail is recorded in the live log for bulk operations. A database's recovery model can be set to FULL, BULK_LOGGED, or SIMPLE.
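Before changing anything, you can check which recovery model a database currently uses. The following is a minimal sketch that queries the sys.databases catalog view:
-- Returns FULL, BULK_LOGGED, or SIMPLE for the sample database
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'AdventureWorks2008';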
The FULL recovery model specifies that all transactions, including bulk operations, will be fully logged in the transaction log. The problem with having the FULL recovery model turned on when you are doing bulk operations is that every record that is inserted gets completely logged in the database's transaction log. If you are loading a large number of records, you can run into problems: the inserts can fill the database's transaction log, and the logging activity itself can slow down the bulk operation. The FULL recovery model does make it possible to do point-in-time restores, even partway through a bulk operation, using the transaction log in the event of a failure.
The BULK_LOGGED recovery model records all regular transactions fully, just like the FULL recovery model. Bulk operations are minimally logged, however. What does that mean? Rather than recording the details of every row that was written, the transaction log tracks only which data pages and extents were modified by the bulk operation. The upside is that you don't bloat the log with a large number of inserts, and because less I/O is being performed against the log, performance can increase. The downside is that the transaction log alone no longer has all the information required to recover the database to a consistent state.
When you back up a transaction log that contains information about bulk operations, the actual data extents that were modified by the bulk operation are included in the log backup. That sounds weird, but it's true. The log backup actually contains extents from the data files, thereby making it possible to restore the transaction log backup and get back all the data that the bulk operation inserted as well. You should also note that the live log can remain small (because it doesn't have to log every insert performed as part of the bulk load), but the log backup will be large because it contains the actual database extents that were modified.
However, when you are using the BULK_LOGGED recovery model, there is some exposure to loss. If a catastrophic failure were to occur after the bulk operation completed, but before you had a chance to back up the log or the database, you could lose the data that was loaded. This implies that when you are using the BULK_LOGGED recovery model, you must perform at least a transaction log backup of the database immediately after the bulk operation completes. A transaction log backup is enough, but it doesn't hurt to do full or differential database backups as well.
Regardless of whether you are using the FULL or BULK_LOGGED recovery model, SQL Server will keep all entries in the transaction log until they are backed up using a BACKUP LOG statement, thereby ensuring that you can back up a contiguous chain of all transactions that have occurred on your database and that you can then restore the database using the transaction log backups. This is true even with the BULK_LOGGED recovery model, as long as you back up the log immediately after a bulk operation occurs.
The SIMPLE recovery model is not typically recommended for production databases. The big reason is that SQL Server can clear entries from the log even though they may not have been backed up yet. However, as far as how the log works with bulk operations, SIMPLE behaves the same as BULK_LOGGED. After a bulk operation is performed, however, you do not have the option of doing a log backup; you must follow up with a full or differential database backup.
So what recovery model should you be using? SIMPLE isn't a viable option for critical production databases because it doesn't allow you to back up the transaction log. FULL is the best option in terms of recoverability because it allows you to back up the log, and the log contains all the details. BULK_LOGGED, however, can offer performance and maintenance benefits when doing bulk operations. The answer, then, is really a mixture of FULL and BULK_LOGGED. It is generally recommended that you leave your production databases with a FULL recovery model. When doing a bulk operation, you would first run a statement to change the recovery model to BULK_LOGGED, do the bulk load, run another statement to change the recovery model back to FULL, and then back up the transaction log.
A couple of other requirements must be met for minimal logging to occur. Minimal logging requires that the target table not be replicated and that a TABLOCK be placed on the table by the bulk operation. It also requires that the target table not have any indices on it, unless the table is empty. If the table already has data in it and it has one or more indices, it may be better to drop the indices before the load, and then rebuild them after. Of course, this should be tested in your own environment.
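One quick way to see whether the target table carries indices that could prevent minimal logging is to query the sys.indexes catalog view. This is only a sketch against the PersonCopy table created earlier; a table with no indices at all shows up as a single row with type_desc = 'HEAP':
-- List indices (if any) on the target table before the load
SELECT i.name, i.type_desc
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('Person.PersonCopy');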
The following sample code shows an example of a minimally logged
BULK INSERT:
ALTER DATABASE AdventureWorks2008 SET RECOVERY BULK_LOGGED;
BULK INSERT AdventureWorks2008.Person.PersonCopy
FROM 'C:\bcp\Person.tsv' WITH (DATAFILETYPE='widechar', TABLOCK);
ALTER DATABASE AdventureWorks2008 SET RECOVERY FULL;
BACKUP LOG AdventureWorks2008 TO DISK='C:\…\SomeFile.bak';
Note that the preceding code is only a sample. The AdventureWorks2008 database actually uses the SIMPLE recovery model by default. Although the code shown in this example would work, it assumes that a full database backup has already been performed; log backups can't be run unless a full backup has been performed. If you do try the preceding code, you might want to set the recovery model back to SIMPLE when you are done.
Using the right recovery model and BCP options to enable minimal logging can help improve performance by not writing as much detail to the live transaction log for a database. These steps reduce the amount of work the hard drives must do and can accelerate the performance of your bulk loading. They can also make the load more manageable by not bloating the transaction log with a large amount of data. This bloat alone could actually cause a bulk load to fail if the log filled to capacity.
Figure 8.1 shows a Performance Monitor chart of the Percent Log Used counter for the AdventureWorks2008 database. The chart shows the log utilization for two bulk loads. The first load was not minimally logged; the second load was. You can see the dramatic difference in log utilization between the two modes.
There are other ways to optimize performance, though. In the next section, we will cover some ways to optimize the performance of bulk load operations.
Optimizing Bulk Load Performance
The whole point of performing bulk loads is performance. Well, performance and convenience, but performance is probably the critical part. You want to get as much data into the server as fast as you can, and with as little impact on the server as possible. As we discussed in the previous topic, configuring your bulk loads to be minimally logged can significantly improve the performance and decrease the negative impact of bulk loads. However, you have other options that you can use to help manage bulk loads as well as improve their performance. These options include breaking the data into multiple batches, and presorting the data to match the clustered index on the target table.
Both BCP and BULK INSERT support breaking the load of large files down into smaller batches. The default behavior is that a single batch is used. Each batch equates to a transaction. Therefore, the default is that the bulk operation is performed as a single transaction. One big problem with this approach is that either the entire load succeeds or the entire load fails. It also means that the transaction log information that is maintained for the bulk load can't be cleared from the log until the bulk operation completes.
You can optimize the loading of your bulk data by breaking it down into smaller batches. This allows you to fail only the batch, rather than the whole load, if an error occurs. When you restart the process, you can restart with the specific batch (using the first-row options). It also allows the log to be cleared if backup operations run during the bulk load time frame. Finally, it allows you to break a larger data file into pieces and have it be run by multiple clients in parallel. Of course, if you didn't have a performance problem to start with, using batches can actually make things worse, so you really need to test with the options to find the optimal settings for your situation.
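As a rough sketch of the batching option (the batch size of 5,000 rows is arbitrary and chosen only for illustration), a batched BULK INSERT might look like the following. If the load fails partway through, the batches that already committed stay in the table, and the FIRSTROW option (or the -F switch of BCP) can be used to restart from the appropriate row:
-- Commit the load in batches of 5,000 rows, each as its own transaction
BULK INSERT AdventureWorks2008.Person.PersonCopy
FROM 'C:\bcp\Person.tsv'
WITH (DATAFILETYPE='widechar', BATCHSIZE=5000);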
You can also help improve the performance of your bulk loads by making sure that the data in the data file is sorted in the same order as the clustered index key on the target table. If you know this is the case, you can tell the bulk operation that the data is presorted by using the ORDER hint of the BCP utility or the BULK INSERT statement. This can improve the performance of bulk loads into tables with clustered indexes.
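The following sketch assumes the target table has a clustered index on BusinessEntityID, the clustered primary key column of Person.Person. (The PersonCopy table created earlier with SELECT INTO is a heap, so against that table the hint would simply be ignored; the example is purely illustrative.) With BCP, the equivalent is the -h "ORDER(BusinessEntityID ASC)" hint.
-- Tell BULK INSERT that the data file is already sorted by the clustered key
BULK INSERT AdventureWorks2008.Person.PersonCopy
FROM 'C:\bcp\Person.tsv'
WITH (DATAFILETYPE='widechar', ORDER (BusinessEntityID ASC), TABLOCK);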
In addition, it may be beneficial to drop nonclustered indices on the table before the load, and re-create them after the load. If the table is empty to start with, this may not help, but if the table has data in it before the load, then it could provide a performance improvement. Of course, you should test this with your own databases. A sketch of that approach is shown below; the index name IX_PersonCopy_LastName is hypothetical and used only for illustration.
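-- Drop the (hypothetical) nonclustered index before the load
DROP INDEX IX_PersonCopy_LastName ON Person.PersonCopy;

BULK INSERT AdventureWorks2008.Person.PersonCopy
FROM 'C:\bcp\Person.tsv'
WITH (DATAFILETYPE='widechar', TABLOCK);

-- Re-create the index once the load completes
CREATE NONCLUSTERED INDEX IX_PersonCopy_LastName
ON Person.PersonCopy (LastName);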