In this chapter, I will describe how I use monitoring tools and techniques to make sure that my Blackberry will always buzz whenever a backup fails, a disk drive fills up, or a rogue pr
Trang 1CHAPTER 6: MONITORING AND
NOTIFICATIONS
As is probably clear by this stage, there are many potential monsters, lurking around corners, waiting to pounce upon the unwary DBA as he goes about his day-to-day duties Often, however, the biggest problem is not the monster itself but the fact that the DBA is unaware that it exists
Imagine a problem as trivial as a SQL Agent Service that fails to start; a very easy problem to fix once you know about it But what if you don't know about it and then suddenly find out that the backup process that this service was supposed to
be running has not been executed for over two weeks! The feeling at this moment for a DBA, or DBA manager, is one of frustration and disbelief These emotions are quickly displaced however, perhaps after a few minutes alone with the warm blankie and a soft floor, by an unswerving confidence This confidence derives from that fact that you know that positive steps will be taken to ensure that this never happens again
In this chapter, I will describe how I use monitoring tools and techniques to make
sure that my Blackberry will always buzz whenever a backup fails, a disk drive fills
up, or a rogue process is threatening the performance of a SQL Server
When the inevitable happens, and the e-mail notification hits your mobile device, probably at some awful hour of the morning, I'll show what you can do to easily ascertain the problem and be notified, using a mix of third party tools, such as Red Gate's SQL Response, and standard tools like Database Mail
Types of monitoring and notifications
The DBA's life is one of vigilantly overseeing not only the SQL Servers themselves, but all of the events that take place on the servers When I say events, I am not specifically referring to error events that cause entries to be added
to the Windows Event log or SQL Server Error log, though these are certainly included in an overall monitoring and notification strategy Here, I am referring to events such as SQL Server Agent job failure, or an abnormal SQL Server performance condition, or excessive resource (e.g disk space) utilization, or SQL Services availability
It is not possible for a DBA team, no matter how large, to keep this vigil by themselves They need automated notifications that will let them know when
Trang 2something goes awry, so that they can respond to the event and resolve any issues arising from it There are many ways that DBAs can set up such notifications, either using native SQL Server tools, such as Database Mail or SQL Agent Mail (two separate entities), or a third party monitoring and notification system There are quite a number of such applications on the market
In my career, I have generally employed a hybrid solution of both native and third-party monitoring, because of the benefits and limitations of each For example, a third-party application may not be able to retrieve the internal error messages of a failed SQL Agent database backup job Conversely, it would be impractical to set up performance monitoring, say for a sustained level of high CPU utilization, for each instance of a 200-strong SQL Server farm, when a centralized third party solution could easily be maintained
In this chapter, I will expound upon the types of events that would require such a monitoring and notification system to be in place and then give examples of solutions for each
SQL Agent Job failures
We all know that failures occur and that the reasons for the failures are many-fold SQL Agent Jobs, which kick off SSIS packages or maintenance tasks, such as database backups or integrity checks, are common points of failure It is your job
as DBA to respond to the failure, overcome it, and finally to understand why the failure occurred in the first place and make sure it does not happen again
In this chapter, I will cover only one such type of failure, but it is one close to the DBA's heart and that is database backup failure Regardless of whether you run your database and transaction log backup jobs via a third party tool or natively, perhaps using a Database Maintenance Plan, they are scheduled processes and if there is a failure you must be made aware of it I happen to use Red Gate's SQL Backup utility (not a plug, just reality), which has built-in notifications for backup failures However, being a duly cautious DBA, I do not rely on this notification mechanism being fail proof I will show how to setup notifications for failed backup jobs at several different points, so that if one notification fails, others will not
Adverse performance conditions
SQL Server is adept at managing memory, I/O requests and multi-threading across multiple CPUs Occasionally, however, a "rogue" query will push the SQL Server to its very limits and it becomes unresponsive, as if its feelings have been hurt by the indignity of it all
Trang 3By the time we enter the unresponsive phase, it may be too late to glean what nefarious query it was that caused the problem in the first place What is needed,
in order to ensure a DBA's restful sleep, is an application that can monitor the server, using a time-based algorithm, and fire a notification when a specific threshold is crossed For example, we may request a notification if CPU utilization exceed 85% for a set number of minutes
There are many such third-party applications that will handle this very well, Idera Diagnostic Manager, Argent Guardian and Microsoft Operations Manager (MOM) are a few that come to mind I will show how to use one such application, Red Gate's SQL Response, to trigger these performance alerts
Further, once notified of the performance issue at hand, I will demonstrate how to use two indispensable tools, Performance Monitor and SQL Profiler, to quickly and easily analyze and resolve the problem
Service availability
It should go without saying that a SQL service, such as SQL Server service (the database engine) or SQL Server Agent service, stopping is an event to which the sleepy DBA should be notified (I make it sound like these alerts always happen at night; they don't It's just the ones that do tend to stay with you)
So, how does SQL Server notify you that it is down? It doesn't It is down, and so cannot send a notification This, again, would be the work of a third party monitoring application Such monitoring solutions should also have the ability to take corrective action when encountering a stopped service, such as trying to restart the service
I'll show how to use a third party tool, such as SQL Response, to monitor these services, and also how to configure the SQL services to have some resilience when they stop unexpectedly
Disk space shortage
Chapter 4 discussed how to manage data growth, and the space issues that can be caused by bulk loads, errant indexing, abuse of TempDB, and so on I can state unequivocally that disk space alerts are the most common type of alert that the DBA will face
Disk space is not cheap, despite what you might be told by some IT managers SCSI and fiber channel drives, which are the core storage devices for most SANs, are still quite expensive on a "per Meg" basis, compared to the slower SATA or ATA drives that are commonly used in development, staging or QA SQL Server installations
Trang 4For production servers that require many hundreds of Gigs of storage, it is essential that you analyze growth trends to make sure you will not be caught naked
in the front yard when the application crashes because there is no more space (naked in the front yard? Forget that analogy) The bottom (ugg) line is that you need to be alerted not when you are completely out of disk space but when you have a specified percentage of remaining space, so that there is still time to act In this chapter I will cover how to be alerted to uncontrolled data growth
Enabling notifications
Enabling notifications in SQL Server is a straightforward process that simply entails setting up a mechanism by which to send the notification emails, and then defining who it is who should receive them
Setting up database mail
Database mail is an essential first component in enabling the delivery of notifications in SQL Server, and its set up is included in the Automated Configuration Script in Chapter 1
While there are other options for being notified of events, such as by Pager or Net Send, they are not really viable in today's world of mobile devices Thankfully, SQL Server 2005 and beyond offers an SMTP mail client for both SQL Server (database engine) and the SQL Agent (job scheduler)
If you have to set up mail for SQL 2000, which a lot of people still do due to the cost of upgrading and/or lack vendor support (yes, this still happens, even in 2009
as I write this very sentence) then my heart goes out to you Having to install and test a MAPI client like Outlook just to send mail from SQL 2000 is beyond frustrating
NOTE
A website still exists to assist you in your SQL Server 2000 mail woes and that
is http://www.sqldev.net/xp/xpsmtp.htm It offers an SMTP mail client for SQL Server 2000 It does not, sadly, address the SQL Agent mail, but it is worth a look
Setting up Database Mail in SQL Server 2005 or 2008 is very straightforward You just need to configure:
• The default profile that will be used to send mail
• An SMTP server address
• An account from which the mail will be sent
Trang 5Figure 6.1 shows the profile information from the Database Mail Configuration
Wizard, launched by double-clicking Database Mail under the Management tab
in SQL Server Management Studio
Figure 6.1: Profile for Database Mail in SQL Server 2005
Notice that the Profile name is "Notifications" and it is associated with the Account called "Standard SQL Mail Account" It is this profile and account association that allows Database Mail to send true SMTP mail, using a standard email stored procedure, sp_send_dbmail
The account information, which stores the SMTP server address, is set up separately from the profile The account properties, which are directly associated with a profile, can be seen in Figure 6.2