Figure 4.5: Physical Disk performance object in Perfmon
With all monitoring systems a go, I am ready to load up a heap table called book_list that I created in the All_Books_Ever_Read database. The Books-List.txt file has approximately 58 thousand records, so I'm going to use the BCP batch file technique (see Listing 3.3 in Chapter 3) to iterate through the file 50 times and load 2.9 million records into the database. Now it is time to begin the load. A quick peek at Perfmon, see Figure 4.6, shows the current absence of activity prior to executing a hefty query.
Figure 4.6: Perfmon low disk activity
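The batch file in Listing 3.3 drives bcp from the command line and is not reproduced here, but a rough, hypothetical T-SQL stand-in for the same idea would be a loop like the one below. The file path, delimiters, and table schema are assumptions, not the book's exact settings.

DECLARE @i INT;
SET @i = 1;
WHILE @i <= 50
BEGIN
    -- each pass loads the same ~58,000-row file into the heap table
    BULK INSERT dbo.Book_List
    FROM 'C:\Books-List.txt'            -- hypothetical path to the source file
    WITH (FIELDTERMINATOR = '\t',       -- assumed column delimiter
          ROWTERMINATOR = '\n',         -- assumed row delimiter
          BATCHSIZE = 50000);           -- commit in 50,000-row batches
    SET @i = @i + 1;
END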
Executing Load … now! Please don't turn (or create) the next page …!!
Sorry! I could not resist the Sesame Street reference to The Monster at the End of This Book. In fact, the load proceeds with little fanfare. Imagine this is being done in the middle of the afternoon, perhaps after a big lunch or, worse, early in the AM (DBA:M most likely), before your second sip of coffee, with you blissfully unaware of what's unfolding on one of your servers. Figure 4.7 shows the BCP bulk insert process running.
Figure 4.7: BCPing data into the All_Books_Ever_Read database
You can see that the batch process ran 50 times at an average of 2.5 seconds a run, with a total load time of roughly 2 minutes. Not bad for 2.9 million records. Now for the bad news: Figure 4.8 shows how much growth can be directly attributed to the load process.
Figure 4.8: Log file growth loading millions of records into table
NOTE
For comparison, in a test I ran without ever having backed up the database, the data file grew to over 3 GB, but the log file grew only to 150 MB.
Both the data file and the log file have grown to over 3 GB. The Profiler trace, as shown in Figure 4.9, reveals that a combined total of 3291 Auto Grow events took place during this data load. Notice also that the duration of these events, when combined, is not negligible.
Figure 4.9: Data and log file growth captured with Profiler
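If you would rather tally those Auto Grow events with a query than scroll through the trace by eye, something along these lines works against a saved trace file. The path is hypothetical; event classes 92 and 93 correspond to Data File Auto Grow and Log File Auto Grow, and trace durations are stored in microseconds.

SELECT te.name AS EventName,
       COUNT(*) AS Occurrences,
       SUM(t.Duration) / 1000 AS TotalDurationMs
FROM sys.fn_trace_gettable('C:\Traces\AutoGrow.trc', DEFAULT) AS t
JOIN sys.trace_events AS te
  ON te.trace_event_id = t.EventClass
WHERE t.EventClass IN (92, 93)   -- Data File Auto Grow, Log File Auto Grow
GROUP BY te.name;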
Finally, Figure 4.10 shows the Perfmon output during the load. As you can see, % Disk Time obviously took a hit at 44.192%. This is not horrible in and of itself; obviously I/O processes require disk reads and writes and, because "Avg Disk Queue Length" is healthily under 3, it means the disk is able to keep up with the demands. However, if the disk being monitored has a % Disk Time of 80% or more, coupled with a higher (>20) Avg Disk Queue Length, then there will be performance degradation because the disk cannot meet the demand. Inefficient queries or file growth may be the culprits.
Figure 4.10: Perfmon disk monitor
Average and Current Disk Queue Lengths are indicators of whether or not bottlenecks might exist in the disk subsystem. In this case, an Average Disk Queue Length of 1.768 is not intolerably high; it indicates that, on average, fewer than 2 requests were queued, waiting for I/O processes, either read or write, to complete on the disk.
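Perfmon is the tool of choice here, but the same I/O pressure is also visible from inside SQL Server. As a complementary sketch, not part of the original walkthrough, the virtual file stats DMV reports cumulative read and write counts and stall times per database file:

SELECT DB_NAME(vfs.database_id) AS DatabaseName,
       mf.name AS LogicalFileName,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('All_Books_Ever_Read'), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id;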
What this also tells me is that loading 2.9 million records into a heap table, batching or committing every 50,000 records, and using the defaults of the Model database, is going to cause significant I/O lag, resulting not just from loading the data, but also from the need to grow the data and log files a few thousand times. Furthermore, with so much activity, the database is susceptible to unabated log file growth, unless you perform regular log backups to remove inactive log entries from the log file. Many standard maintenance procedures implement full backups for newly created databases, but not all databases receive transaction log backups. This could come back to bite you, like the monster at the end of this chapter, if you forget to change the recovery model from Full to Simple, or if you restore a database from another system and unwittingly leave the database in Full recovery mode.
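A minimal sketch of those two safety valves, with the backup destination as an assumption:

-- regular log backups clear inactive entries and keep a Full recovery log in check
BACKUP LOG All_Books_Ever_Read
TO DISK = 'E:\SQLBackups\All_Books_Ever_Read_log.trn';   -- hypothetical path

-- alternatively, if point-in-time recovery is not needed for this database
ALTER DATABASE All_Books_Ever_Read SET RECOVERY SIMPLE;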
Appropriately sizing your data and log files
Having seen the dramatic impact of such bulk load operations on file size, what I really want to know now is how much I could reduce the I/O load, and therefore increase the speed of the load process, if the engine hadn't had to grow the files 3291 times, in 1 MB increments for the data file and 10% increments for the log file.
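One way to avoid thousands of tiny growth events on a new database is to pre-size the files and choose more sensible growth increments up front. A sketch of the syntax follows; the sizes, increments, and logical file names are assumptions, not values from this chapter.

ALTER DATABASE All_Books_Ever_Read
MODIFY FILE (NAME = All_Books_Ever_Read, SIZE = 3500MB, FILEGROWTH = 500MB);       -- data file

ALTER DATABASE All_Books_Ever_Read
MODIFY FILE (NAME = All_Books_Ever_Read_log, SIZE = 3500MB, FILEGROWTH = 500MB);   -- log file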
In order to find out, I need to repeat the load process, but with the data and log files already appropriately sized to handle it. I can achieve this by simply truncating the table and backing up the transaction log. This will not shrink the physical data or log files, but it will free up all of the space inside them. Before I do that, take a look at the sort of space allocation information that is provided by the sp_spaceused built-in stored procedure in Figure 4.11.
Figure 4.11: Output of sp_spaceused for the loaded Book_List table
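The call itself is a one-liner, run in the context of the All_Books_Ever_Read database:

EXEC sp_spaceused 'Book_List';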
As you can see, the Book_List table is using all 3.3 GB of the space allocated to the database for the 2.9 million records. Now simply issue the TRUNCATE command:

Truncate Table Book_List
And then rerun sp_spaceused. The results are shown in Figure 4.12.
Figure 4.12: sp_spaceused after truncation
You can verify that the data file, although now "empty", is still 3.3 GB in size using the Shrink File task in the SSMS GUI. Right-click on the database and select "Tasks | Shrink | Files". You can see in Figure 4.13 that the All_Books_Ever_Read.mdf file is still 3.3 GB in size but has 99% available free space.
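The same numbers are available without the GUI, from a query run inside the database; file sizes are reported in 8 KB pages, hence the division by 128 to convert to MB:

SELECT name,
       size / 128 AS FileSizeMB,
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS SpaceUsedMB,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS FreeSpaceMB
FROM sys.database_files;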
What this means to me as a DBA, knowing I am going to load the same 2.9 million records, is that I do not expect the data file to grow again. Figure 4.14 shows the command window after re-running the BCP bulk insert process, superimposed on the resulting Profiler trace.