12 Click Next.
13 Under Select Tables to Tune, click Select All Tables.
14 Click Next; the Wizard will now start tuning your indexes.
15 You will be asked to accept the index recommendations; click Next.
16 If there were recommendations, you would be asked to schedule them for later or run them now, but because there are no recommendations for this workload file, you are taken directly to the final screen. Click Finish to complete the Wizard.
17 When you receive a message stating that the Wizard has completed, click OK.
18 Exit Profiler.
Tips and Techniques
If you want the best results from SQL Server’s monitoring tools, you need to know and use the proper techniques. If you don’t, the end result will not be what you are hoping for, or what you need.
Setting a Measurement Baseline
You will never know if your system is running slower than normal unless you know what normal is, which is what a measurement baseline does: It shows you the resources (memory, CPU, etc.) SQL Server consumes under normal circumstances. You create the measurement baseline before putting your system into production so that you have something to compare your readings to later on.
P A R T
VI
The first thing you need to create an accurate measurement baseline is a test network with just your SQL Server and one or two client machines. You limit the number of machines involved because all networks have broadcast traffic, which is processed by all the machines on the network. This broadcast traffic can throw your counts off, sometimes a little, sometimes quite a bit. If your budget does not allow for a test network, you may instead want to consider shutting down as many machines as possible and generating your baseline off-hours.
You can then start your baseline. The Windows NT counters mentioned at the outset of this chapter as well as the preset SQL Server counters should provide an accurate baseline with which you can compare future readings. Then you can move to the next technique.
Data Archiving and Trend Tracking
Although the consequences of throwing away your SQL Server monitoring records are not quite as severe as facing an IRS auditor without records and receipts, you still need to save, or archive, your records. One of the primary reasons to do so is to back up requests for additional equipment. For example, if you ask for funds to buy more memory for the SQL Server, but don’t bring any proof that the system needs the RAM, you are probably not going to get the money. If you bring a few months’ worth of reports, however, and say, “After tracking SQL Server for a time, we’ve found this…” management may be far more willing to give you the money you need. Using archived data in such fashion is known as trend tracking.
One of the most valuable functions of using your archived data for trend tracking is proactive troubleshooting, that is, anticipating and avoiding problems before they arise. Suppose you added 50 new users to your network about three months ago and are about to do it again. If you archived your data from that period, you would be able to recall what those 50 users did to the performance of the SQL Server, and you could compensate for it. On the other hand, if you threw that data away, you might be in for a nasty surprise when your system unexpectedly slows to a crawl.
Optimization Techniques
SQL Server can dynamically adjust most of its settings to compensate for problems. It can adjust memory use, threads spawned, and a host of other settings. In some cases, unfortunately, those dynamic adjustments may not be enough, and you may need to make some manual changes.
We’ll look at a few specific areas that may require your personal attention.
Queries and Stored Procedures
The first thing to ask yourself when you are getting slow response times is whether you could be using a stored procedure instead of a local query. Stored procedures are different from local code in two ways. First, they are stored on the SQL Server, so the query text does not need to be transmitted over the network, where it would add to congestion. Second, stored procedures are precompiled on the server; this saves system resources, because local code must be compiled once it gets to the system.
Overall, stored procedures are the way to go, but if you need to use local queries, you should consider how they are written, because poorly constructed queries can wreak havoc on your system. If, for example, you have a query that is returning every row of a table when only half of the rows are required, you should consider rewriting the query. Improper use of WHERE clauses can also slow your queries down. Make sure that your WHERE clauses reference indexed columns for optimal performance.
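As a rough sketch of both points, the following assumes the Northwind sample database that ships with SQL Server 2000. The procedure name usp_OrdersByCustomer is hypothetical, and the sketch assumes that the CustomerID column of the Orders table is indexed, as it is in the standard sample database.

```sql
USE Northwind
GO

-- Hypothetical procedure: the query text lives on the server, so clients
-- send only the procedure name and one parameter over the network.
CREATE PROCEDURE usp_OrdersByCustomer
    @CustomerID nchar(5)
AS
    -- Name only the columns that are needed instead of using SELECT *,
    -- and filter on CustomerID (assumed to be indexed) so that only the
    -- required rows are returned.
    SELECT OrderID, OrderDate, ShippedDate
    FROM Orders
    WHERE CustomerID = @CustomerID
GO

-- Example call from a client
EXEC usp_OrdersByCustomer @CustomerID = N'ALFKI'
GO
```

The same SELECT statement could be sent as a local query; wrapping it in a procedure is simply one way to keep the text and its compiled plan on the server.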
Tempdb
Is your tempdb big enough to handle the load that your queries put on it? Think of tempdb as a scratchpad for SQL Server; when queries are performed, SQL Server uses this scratchpad to make notes about the result set. If tempdb runs out of room to make these notes, system response time can slow down. Tempdb should be between 25 and 40% of the size of your largest database (for example, if your largest database is 100MB, tempdb should be 25 to 40MB).
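One possible way to check and apply this guideline is sketched below. It assumes a largest database of 100MB, as in the example above, and it assumes that the tempdb data file still has its default logical name, tempdev; both are assumptions to verify on your own server.

```sql
-- Report the size of every database, including tempdb, so that tempdb
-- can be compared with the largest user database.
EXEC sp_helpdb
GO

-- Assumption: the largest database is 100MB, so tempdb is grown to 40MB,
-- the top of the 25 to 40 percent guideline. tempdev is the default
-- logical name of the tempdb data file; confirm it by running
-- EXEC sp_helpfile in tempdb before using this statement.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 40MB)
GO
```

MODIFY FILE can only increase the size this way, so the statement fails if the file is already 40MB or larger.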
Query Governor
Right out of the box, SQL Server will run any query you tell it to, even if that query is poorly written. You can change that by using the Query Governor. This is not a separate tool, but is part of the database engine and is controlled by the Query Governor Cost Limit. This setting tells SQL Server not to run queries longer than x (where x is a value higher than zero). If, for example, the Query Governor Cost Limit is set to 2, any query that is estimated to take longer than 2 seconds would not be allowed to run. SQL Server can estimate the running time of a query because SQL Server keeps statistics about the number and composition of records in tables and indexes. The Query Governor Cost Limit can be set by using the command sp_configure 'query governor cost limit', '1' (the 1 in this code can be higher). The Cost Limit can also be set on the Server Settings tab of the Server Properties page in Enterprise Manager.
NOTE If the Query Governor Cost Limit is set to zero (the default), all queries will be allowed to run.
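A minimal sketch of setting the limit from Query Analyzer follows. The value 2 is taken from the example above; the first batch is included on the assumption that advanced options are not yet visible on your server.

```sql
-- query governor cost limit is an advanced option, so advanced options
-- must be visible before it can be changed.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO

-- Server-wide: refuse queries whose estimated cost is above 2.
EXEC sp_configure 'query governor cost limit', 2
RECONFIGURE
GO

-- Current connection only: overrides the server-wide value for this
-- session. A value of 0 would turn the limit off again.
SET QUERY_GOVERNOR_COST_LIMIT 2
GO
```

The connection-level SET statement may be the gentler choice when only one application is known to send expensive queries.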
Setting Trace Flags
A trace flag is used to temporarily alter a particular SQL Server behavior. Much like a light switch can be used to turn off a light and then turn it back on again, a trace flag can be used to turn off (or on) a behavior in SQL Server. Trace flags are enabled with DBCC TRACEON and turned off with DBCC TRACEOFF. The command to enable trace flag 1204 would look like this: DBCC TRACEON(1204). Table 26.3 lists some of the trace flags available to you.
TABLE 26.3: USES OF TRACE FLAGS
Trace Flag   Use
107          This instructs the server to interpret numbers with a decimal point as type float instead of decimal.
260          This trace flag prints version information for extended stored procedure Dynamic Link Libraries. If you write your own extended stored procedures, this trace flag will prove useful in troubleshooting.
1204         This will tell you what type of locks are involved in a deadlock and what commands are affected.
1205         This flag returns even more detailed information about the commands affected by a deadlock.
1704         This will print information when temporary tables are created or dropped.
2528         This trace flag disables parallel checking of objects by the DBCC CHECKDB, DBCC CHECKFILEGROUP, and DBCC CHECKTABLE commands. If you know that the server load is going to increase while these commands are running, you may want to turn this trace flag on so that SQL Server checks only a single object at a time and therefore places less load on the server. Under ordinary circumstances, though, you should let SQL Server decide on the degree of parallelism.
3205         This will turn off hardware compression for backups to tape drives.
3604         When turning on or off trace flags, this flag will send output to the client.
3605         When turning on or off trace flags, this flag will send output to the error log.
7505         This enables 6.x handling of return codes when a call to dbcursorfetchx causes the cursor position to follow the end of the cursor set.
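As an illustration of how the flags in Table 26.3 might be combined, the sketch below turns on deadlock reporting and routes the output to the error log. The pairing of 1204 with 3605 is only an example, not a recommendation from this chapter.

```sql
-- Turn on deadlock reporting (1204) and send the output to the
-- error log (3605).
DBCC TRACEON (1204, 3605)
GO

-- Show which trace flags are currently active.
DBCC TRACESTATUS (-1)
GO

-- Turn both flags off again when troubleshooting is finished.
DBCC TRACEOFF (1204, 3605)
GO
```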
Max Async I/O
It should go without saying that SQL Server needs to be able to write to disk, because that’s where the database files are stored. But is it writing to disk fast enough? If you have multiple hard disks connected to a single controller, multiple hard disks connected to multiple controllers, or a RAID system involving striping, the answer is probably no. The maximum number of asynchronous input/output (Max Async I/O) threads by default in SQL Server is 32. This means that SQL Server can have 32 outstanding read and 32 outstanding write requests at a time. Thus, if SQL Server needs to write some data to disk, SQL Server can send up to 32 small chunks of that data to disk at a time. If you have a powerful disk subsystem, you will want to increase the Max Async I/O setting.
The value to which you increase this setting depends on your hardware, so if you increase the setting, you must then monitor the server. Specifically, you will need to monitor the Physical Disk: Average Disk Queue Performance Monitor counter, which should be less than two (note that any queue should be less than two). If you adjust Max Async I/O and the Average Disk Queue counter goes above two, you have set Max Async I/O too high and will need to decrease it.
NOTE You will need to divide the Average Disk Queue counter by the number of physical drives to get an accurate count. That is, if you have three hard disks and a counter value of six, you would divide six by three, which tells you that the counter value for each disk is two.
LazyWriter
LazyWriter is a SQL Server process that moves information from the data cache in memory to a file on disk. If LazyWriter can’t keep enough free space in the data cache for new requests, performance slows down. To make sure this does not happen, monitor the SQL Server: Buffer Manager – Free Buffers Performance Monitor counter. LazyWriter tries to keep this counter level above zero; if it dips or hits zero, you have a problem, probably with your disk subsystem. To verify this, you need to check the Physical Disk: Average Disk Queue Performance Monitor counter and verify that it is not more than two per physical disk (see above). If the queue is too high, LazyWriter will not be able to move data efficiently from memory to disk, and the free buffers will drop.
RAID (Redundant Array of Inexpensive Disks) is used to protect your data and speed up your system. In a system without RAID, data that is written to disk is written to that one disk. In a system with RAID, that same data would be written across multiple disks, providing fault tolerance and improved I/O. Some forms of RAID can be implemented inexpensively in Windows NT, but this uses such system resources as processor and memory. If you have the budget for it, you might consider getting a separate RAID controller that will take the processing burden off Windows NT. RAID is discussed in detail in Chapter 4, but here is a quick refresher:
RAID 0 Stripe Set: This provides I/O improvement, but not fault tolerance.
RAID 1 Mirroring: This provides fault tolerance and read-time improvement. This can also be implemented as duplexing, which is a mirror that has separate controllers for each disk.
RAID 0+1 Mirrored Stripe Set: This is a stripe set without parity that is duplicated on another set of disks. This requires a third-party controller, because Windows NT does not support this level of RAID natively.
RAID 5 Stripe Set with Parity: This provides fault tolerance and improved I/O.
Adding Memory
SQL Server, like most BackOffice products, needs significant amounts of RAM. The more you put in, the happier SQL Server will be. There is one caveat about adding RAM, however: your level 2 cache. This is much faster (and more expensive) than standard RAM and is used by the processor for storing frequently used data. If you don’t have enough level 2 cache to support the amount of RAM in your system, your server may slow down rather than speed up. Microsoft tells you that the minimum amount of RAM that SQL Server needs is 32 to 64MB, but because SQL Server benefits greatly from added RAM, you should consider using 256MB of RAM, which requires 1MB of level 2 cache.
Manually Configuring Memory Use
Although SQL Server can dynamically assign itself memory, it is not always best to let it do so. A good example of this is when you need to run another BackOffice program, such as Exchange, on the same system as SQL Server. If SQL Server is not constrained, it will take so much memory that there will be none left for Exchange. The relevant constraint is the max server memory setting; by adjusting it, you can stop SQL Server from taking too much RAM. If, for example, you set it to 100 (the setting is expressed in megabytes), SQL Server will never use more than 100MB of RAM.
You could also set min server memory, which tells SQL Server to never use less than the set amount; this should be used in conjunction with set working set size. Windows NT uses virtual memory, which means that data that is in memory and has not been accessed for a while can be stored on disk. The set working set size option stops Windows NT from moving SQL Server data from RAM to disk, even if SQL Server is idle. This can improve SQL Server’s performance, because data will never need to be retrieved from disk (which is about 100 times slower than RAM). If you decide to use this option, you should set min server memory and max server memory to the same size, and then change the set working set size option to 1.
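The settings described above could be applied along the following lines. This is a sketch under the assumption that SQL Server should be pinned at 100MB; sp_configure lists the two memory options as min server memory (MB) and max server memory (MB), so the values below are in megabytes.

```sql
-- All three options are advanced options.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO

-- Pin SQL Server at 100MB: min and max are set to the same value
-- (both options are expressed in megabytes).
EXEC sp_configure 'min server memory', 100
EXEC sp_configure 'max server memory', 100
-- Ask Windows NT not to page SQL Server's memory out to disk.
EXEC sp_configure 'set working set size', 1
RECONFIGURE
GO
```

The set working set size option takes effect only after the SQL Server service is restarted, so plan the change for a maintenance window.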
Summary
This chapter has stressed the importance of monitoring and optimization. Monitoring allows you to find potential problems before your users find them; without it, you have no way of knowing how well your system is performing.
Performance Monitor can be used to monitor both Windows NT and SQL Server. Some of the more important counters to watch are Physical Disk: Average Disk Queue (which should be less than two) and SQLServer:Buffer Manager: Buffer Cache Hit Ratio (which should be as high as possible).
Query Analyzer allows you to see how a query will affect your system before you place the query in production. The Profiler is used to monitor queries after they have been placed in general use; it is also useful for monitoring security and user activity. Once you have used Profiler to log information about query use to a trace file, you can run the Index Tuning Wizard to optimize your indexes.
Once you have created all logs and traces, you need to archive them. The various log files can be used later for budget justification and trend tracking. For example, suppose you added 50 users to your system six months ago and are about to add 50 more. If you kept records on what kind of load the last 50 users placed on your system, you will be better prepared for the next 50.
This chapter also presented some tips for repairing a slow-running system. You can change the Max Async I/O setting if your disk is not working hard enough to support the rest of the system, and you may need to upgrade your disk subsystem if the SQL Server: Buffer Manager – Free Buffers Performance Monitor counter hits zero. RAID can also speed up your SQL Server. If you can afford a separate controller, you should get one to take some of the burden off Windows NT. If you can’t afford one, you can use Windows NT RAID level 1 for fault tolerance and speed.
Now that you know how to optimize your server and keep it running at peak performance, it will be much easier to perform all of the tasks on your SQL Server. This is especially true of the next topic that we will discuss: replication.
For one reason or another, many companies have more than one database system. This is especially true of larger companies that have more than one location or whose departments keep their own servers. Regardless of the reason, many of these servers need to have copies of each other’s databases. For example, if you have two servers for your human resources department (one in New York and one in Singapore), you may need to keep a copy of each database on each server so that all of your human resources personnel can see the same data. The best way to copy this data is through replication.
Replication is designed specifically for the task of copying data and other objects (such as views, stored procedures, and triggers) between servers and making certain that those copies stay up-to-date. In this chapter, we will look into the inner workings of replication. First we will discuss some terminology that is used to describe the various parts of replication. After you have an understanding of the terms, we can discuss the roles that SQL Servers can play in the replication process. Next we will move into the types and models of replication, and finally we will replicate. Let’s get started.
Understanding Replication
The sole purpose of replication is to copy data between servers. There are several good reasons for doing so:
• If your company has multiple locations, you may need to move the data closer to the people who are using it.
• If multiple people want to work on the same data at the same time, replication is a good way of giving them that access.
• Replication can separate the functions of reading from writing data. This is especially true in OLTP (online transaction processing) environments where reading data can place quite a load on the system.
• Some sites may have different methods and rules for handling data (perhaps the site is a sister or child company). Replication can be used to give these sites the freedom of setting their own rules for dealing with data.
• Mobile sales users can install SQL Server 2000 on a laptop, where they might keep a copy of an inventory database. These users can keep their local copy of the database current by dialing in to the network and replicating.
You may be able to come up with even more reasons to use replication in your company, but to do so, you need to understand the publisher/subscriber concept.
The Publisher/Subscriber Metaphor
Microsoft uses the publisher/subscriber metaphor to make replication easier to understand and implement. It works a lot like a newspaper or magazine company. The newspaper company has information that people around the city want to read; therefore the newspaper company publishes this data and has newspaper carriers distribute it to the people who have subscribed. As shown in Figure 27.1, SQL Server replication works much the same in that it too has a publisher, a distributor, and a subscriber:
Publisher: In SQL Server terminology, the publisher is the server with the original copy of the data that others need, much like the newspaper company has the original data that needs to be printed and distributed.
Distributor: Much like the newspaper company needs paper carriers to distribute the newspaper to the people who have subscribed, SQL Servers need special servers called distributors to collect data from publishers to distribute to subscribers.
Subscriber: A subscriber is a server that requires a copy of the data that is stored on the publisher. The subscriber is akin to the people who need to read the news, so they subscribe to the newspaper.
FIGURE 27.1
SQL Server can publish, distribute, or subscribe to publications in replication.
[Figure labels: Publisher (contains original copy of data); Distributor (collects changes from publishers); Subscriber (receives a copy of data); Publication made up of articles]
The analogy goes even further: All of the information is not just lumped together in a giant scroll and dropped on the doorstep. It is broken up into various publications and articles so that it is easier to find the information you want to read. SQL Server replication follows suit:
Article: An article is just data from a table that needs to be replicated. Of course, you probably do not need to replicate all of the data from the table, so you don’t have to. Articles can be horizontally partitioned, which means that not all records in the table are published, and they can be vertically partitioned, which means that not all columns need be published.
Publication: A publication is a collection of articles and is the basis for subscriptions. A subscription can consist of a single article or multiple articles, but you must subscribe to a publication as opposed to a single article.
Now that you know the three roles that SQL Servers can play in replication and that data is replicated as articles that are stored in publications, you need to know the types of replication.
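To make the two kinds of article partitioning described above more concrete, the query below returns the data that a partitioned article would carry. It is only an illustration against the Northwind sample database, not a replication setup command, and the choice of columns and filter is hypothetical.

```sql
USE Northwind
GO

-- Vertical partition: only four of the table's columns are named.
-- Horizontal partition: only rows for customers in the USA qualify.
SELECT CustomerID, CompanyName, City, Country
FROM Customers
WHERE Country = N'USA'
GO
```

An article defined this way would send subscribers the same rows and columns that this query returns.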
Replication Types
It is important to control how publications are distributed to subscribers. If the newspaper company does not control distribution, for example, many people may not get the paper when they need it, or other people may get the paper for free. In SQL Server, you need to control distribution of publications for similar reasons, so that the data gets to the subscribers when it is needed. There are a few factors to consider when choosing a replication type:
Autonomy: Autonomy is the amount of independence that your subscribers have over the data they receive. Some servers may need a read-only copy of the data, while others may need to be able to make changes to the data they receive.
Latency: This refers to how long a subscriber can go without getting a fresh copy of data from the server. Some servers may be able to go for weeks without getting new data from the publisher, while other instances may require a very short wait time.
Consistency: Possibly the most popular form of replication is transactional replication, where transactions are read from the transaction log of the publisher, moved through the distributor, and applied to the database on the subscriber. This is where transactional consistency comes in. Some subscribers may need all of the transactions in the same order they were applied to the server, while other subscribers may need only some of the transactions.
Once these factors have been considered, you are ready to choose the replication type that will work best for you.
Distributed Transactions
In some instances, multiple servers may need the same transaction at the exact same time, as in a bank, for example. Suppose that the bank has multiple servers for storing customer account data, each server storing a copy of the same data, and that all servers can modify the data in question. Now suppose that a customer comes to an Automatic Teller Machine and withdraws money from their account. The action of withdrawing money is a simple Transact-SQL transaction that removes money from the customer’s checking account record, but remember that more than one server holds this data. If the transaction makes it to only one of the bank’s servers, the customer could go to ATMs all over town and withdraw enough money to retire on, and the bank would have a very hard time stopping them.
To avoid such a scenario, you need to get the exact same transaction to all of the subscribers at the same time. If the transaction is not applied to all of the servers, it is not applied to any of the servers. This type of replication is called distributed transactions or two-phase commit (2PC). Technically this is not a form of replication; 2PC uses the Microsoft Distributed Transaction Coordinator and is controlled by the way the Transact-SQL is written. A normal, single-server transaction looks like this:
BEGIN TRAN
TSQL CODE
COMMIT TRAN
A distributed transaction looks like this:
BEGIN DISTRIBUTED TRAN
TSQL CODE
COMMIT TRAN
Using distributed transactions will apply the same transaction to all required servers at once or to none of them at all. This means that this type of replication has very low autonomy, low latency, and high consistency.
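A fuller version of the bank withdrawal might look like the sketch below. Everything named in it is an assumption made for illustration: BRANCH2 is a hypothetical linked server, and both servers are assumed to hold a Bank database with a dbo.Accounts table. The Distributed Transaction Coordinator service would need to be running on both servers.

```sql
-- Assumptions: BRANCH2 is a hypothetical linked server, and both servers
-- hold a hypothetical Bank database with a dbo.Accounts table
-- (AccountID int, Balance money).

-- Roll back the whole transaction if any statement fails at run time.
SET XACT_ABORT ON
GO

BEGIN DISTRIBUTED TRAN
    -- Withdraw $100 on the local server
    UPDATE Bank.dbo.Accounts
    SET Balance = Balance - 100
    WHERE AccountID = 1001

    -- Apply the same withdrawal on the remote copy
    UPDATE BRANCH2.Bank.dbo.Accounts
    SET Balance = Balance - 100
    WHERE AccountID = 1001
COMMIT TRAN
GO
```

Either both updates are committed or neither is, which is the all-or-nothing behavior described above.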
Transactional
All data modifications made to a SQL Server database are considered transactions, whether or not they have an explicit BEGIN TRAN command and corresponding COMMIT TRAN (if the BEGIN…COMMIT is not there, SQL Server assumes it). All of these transactions are stored in a transaction log that is associated with the database. With transactional replication, each of the transactions in the transaction log can be replicated. The transactions are marked for replication in the log (because not all transactions may be replicated), then they are copied to the distributor, where they are stored in the distribution database until they are copied to the subscribers.
The only real drawback is that subscribers to a transactional publication must treat the data as read-only, meaning that users cannot make changes to the data they receive. Think of it as being like a subscription to a newspaper: if you see a typo in an ad in the paper, you can’t change it with a pen and expect the change to do any good. No one else can see your change, and you will just get the same typo in the paper the next day. So, transactional replication has high consistency, low autonomy, and middle-of-the-road latency.
Transactional with Updating Subscribers
This type of replication is almost exactly like transactional replication, with one major difference: The subscribers can modify the data they receive. You can think of this type of replication as a mix of 2PC and transactional replication in that it uses the Distributed Transaction Coordinator and distributed transactions to accomplish its task. The publisher still marks its transactions for replication, and those transactions get stored on the distributor until they are sent to the subscriber. On the subscriber, though, there is a trigger that is marked NOT FOR REPLICATION. This special trigger will watch for changes that come from users of the server, but not for changes that come from the distributor as a process of replication. This trigger on the subscriber database will watch for changes and send those changes back to the publisher, where they can be replicated out to any other subscribers of the publication.
Snapshot
While transactional replication copies only data changes to subscribers, snapshot replication copies entire publications to subscribers every time it replicates. In essence, it takes a snapshot of the data and sends it to the subscriber every time it replicates. This is useful for servers that need a read-only copy of the data and do not require updates very often. In fact, they could wait for days or even weeks for updated data.
A good example of where to use this type of replication is in a department store chain that has a catalog database. The headquarters keeps and publishes the master copy of the database where changes are made. The subscribers can wait for updates to this catalog for a few days if necessary.
The data on the subscriber should be treated as read-only here as well, because all of the data is going to be overwritten anyway each time replication occurs. This type of replication is said to have high latency, high autonomy, and high consistency.
Snapshot with Updating Subscribers
The only difference between this type of replication and standard snapshot replication is that this type will allow the users to update the data on their local server. This is accomplished the same way it is accomplished in transactional replication with updating subscribers: a trigger is placed on the subscribing database that watches for local transactions and replicates those changes to the publishing server by means of the Distributed Transaction Coordinator. This type of replication has moderate latency, high consistency, and high autonomy.
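The NOT FOR REPLICATION option that both updating-subscriber types rely on can be sketched as follows. This is not the trigger that SQL Server generates for replication; the table names and the trigger name are hypothetical, and the sketch only shows how the option keeps a trigger from firing for changes applied by a replication agent.

```sql
-- Hypothetical tables: dbo.Inventory (ItemID int, Quantity int) and
-- dbo.InventoryChanges (ItemID int, Quantity int, ChangedAt datetime).
CREATE TRIGGER trg_Inventory_LocalChange
ON dbo.Inventory
FOR UPDATE
NOT FOR REPLICATION   -- skip changes applied by a replication agent
AS
    -- Record only the changes made by local users; in real replication
    -- the generated trigger forwards such changes to the publisher.
    INSERT INTO dbo.InventoryChanges (ItemID, Quantity, ChangedAt)
    SELECT ItemID, Quantity, GETDATE()
    FROM inserted
GO
```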
Merge
By far, this is the most complex type of replication to work with, but also the most flexible. Merge replication allows changes to be made to the data at the publisher as well as at all of the subscribers. These changes are then replicated to all other subscribers until finally your systems reach convergence, the blessed state at which all of your servers have the same data.
The biggest problem with merge replication is known as a conflict. This problem occurs when more than one user modifies the same record on their copy of the database at the same time. For example, if a user in Florida modifies record 25 in a table at the same time that a user in New York modifies record 25 in their own copy of the table, a conflict will occur on record 25 when replication takes place, because the same record has been modified in two different places, and therefore SQL Server has two values from which to choose. The default method of choosing a winner in this conflict is based on site priority (which you will see how to set later in this chapter).
Merge replication works by adding triggers and system tables to the databases on all of the servers involved in the replication process. When a change is made at any of the servers, the trigger fires off and stores the modified data in one of the new system tables, where it will reside until replication occurs. This type of replication has the highest autonomy, highest latency, and lowest transactional consistency.
But how does all of this occur? What is the driving force behind replication? Let’s look at the four agents that make replication run.
Replication Agents
Any of the types of subscriptions listed in the last section can be either push or pull subscriptions. A push subscription is configured and controlled at the publisher. This method of subscription is like the catalogs that you receive in the mail: the publisher decides when you get updates because the publisher knows when changes have been made to the information inside the catalog. The same is true of a push subscription in replication, where the publisher decides when changes will be sent to the subscribers.
Pull subscriptions are more like a magazine subscription. You write to the publisher of the magazine and request a subscription; the magazine is not automatically sent to you. Pull subscriptions work much the same in that the subscriber requests a subscription from the publisher, and the subscription is not sent unless the subscriber asks for it.
With either method of replication, four agents are used to move the data from the publisher to the distributor and finally to the subscriber:
Log reader agent: This agent is used primarily in transactional replication. It reads the transaction log of the published database on the publisher and looks for transactions that have been marked for replication. When it finds such a transaction, the log reader agent copies the transaction to the distribution server, where it is stored in the distribution database until it is moved to the subscribers. This agent runs on the distributor in both push and pull subscriptions.
Distribution agent: This agent moves data from the distributor to the subscribers. This agent runs on the distributor in a push subscription, but in a pull subscription, it runs on the subscriber. Therefore, if you have a large number of subscribers, you may want to consider using a pull subscription method to lighten the load on the distribution server.
Snapshot agent: Just by reading the name of this agent, you would expect it to work with snapshot replication, but it works with all types of replication. This agent makes a copy of the publication on the publisher and either copies it to the distributor, where it is stored in the distribution working folder (\\distribution_server\Program Files\Microsoft SQL Server\MSSQL$(instance)\REPLDATA), or places it on removable disk (such as a CD-ROM or Zip drive) until it can be copied to the subscriber. With snapshot replication, this agent runs every time replication occurs; with the other types of replication, this agent runs on a less frequent basis and is used to make sure that the subscribers have a current copy of the publication, including the most up-to-date structure for the data. This agent runs on the distributor in either a push or a pull subscription.
Once you have selected the type of replication you need, you can pick the physical model to go with it.
Replication Models
There are three roles that a SQL Server can play in replication: publisher, distributor, and subscriber. Before you can successfully implement replication, you need to know where to place these servers in your scheme. Microsoft has a few standard replication models that should make it easier for you to decide where to put your servers.
Single Publisher, Multiple Subscribers
In this scenario, there is a single, central publishing server where the original copy of the data is stored and several subscribers that need copies of the data. This model lends itself well to transactional or snapshot replication.
A good example of when to use this is if you have a catalog database that is maintained at company headquarters and your satellite offices need a copy of the catalog database. The database at headquarters could be published, and your satellite offices would subscribe to the publication. If you have a large number of subscribers, you could create a pull subscription so that the load is removed from the distribution server, making replication faster. Figure 27.2 should help you visualize this concept.
Multiple Publishers, Single Subscriber
This model has a single server that subscribes to publications from multiple servers
As shown in Figure 27.3, this lends itself to the following scenario:
Suppose that you work for a company that sells auto parts and you need to keep track of the inventory at all of the regional offices. The servers at all of the regional offices can publish their inventory databases, and the server at company headquarters can subscribe to those publications. Now the folks at company headquarters will know when a regional office is running low on supplies, because headquarters has a copy of everyone’s inventory database.
FIGURE 27.3
A single server can also subscribe to multiple publishers.
Multiple Publishers, Multiple Subscribers
In this model, each server is a publisher, and each server is a subscriber (see Figure 27.4). You may instantly think that this lends itself to merge replication, but that is not always the case. This model can lend itself to other types of replication as well.
For example, suppose that you work at a company that rents videos. Each video store needs to know what the other video stores have in stock so that when a customer wants a specific video, they can be instantly directed to a video store that has a copy of the desired video. To accomplish this, each video store would need to publish a copy of their video inventory, and each store would need to subscribe to the other stores’ publications. In this way, the proprietors of each video store would know what the other video stores have in stock. If this is accomplished using transactional replication, there will be very little latency, because the publication would be updated every time a transaction takes place.
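To get a feel for how quickly this model grows, the video store example can be sketched in Python. This is only an illustration under the assumption that every store publishes exactly one publication and subscribes to every other store's publication; the function names and store names are hypothetical.

```python
def subscriptions(store_names):
    """Pair every publishing store with every other store as a subscriber."""
    return [
        (publisher, subscriber)
        for publisher in store_names
        for subscriber in store_names
        if publisher != subscriber
    ]


def subscription_count(stores):
    """Number of subscriptions when each of `stores` servers subscribes to all others."""
    if stores < 1:
        raise ValueError("at least one store is required")
    return stores * (stores - 1)


# Hypothetical store names, used only for illustration.
pairs = subscriptions(["Downtown", "Uptown", "Lakeside"])
print(len(pairs), subscription_count(3))
```

With n stores the count is n × (n − 1), so adding a store to a ten-store chain adds twenty new subscriptions, which is worth keeping in mind when you size the distributor.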
Many international companies need data replicated to all of their subsidiaries overseas. A company with headquarters in New York may need to have data replicated to London, Frankfurt, and Rome, for example. If the server in New York is both the publisher and the distributor, the process of replication would involve three very expensive long-distance calls: one to each of the three subscribers. If you place a distributor in London, though, the publisher in New York would need to make only one call, to the distributor in London. The distributor would then make connections to the other European servers and therefore save money on long-distance calls between servers.
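The arithmetic behind the New York example can be written out as a short Python sketch. It assumes, purely for illustration, that a connection counts as long-distance only when it crosses between regions and that connections within a region are comparatively cheap; the function name and region labels are hypothetical.

```python
def long_distance_calls(publisher_region, subscriber_regions, distributor_region=None):
    """Count connections that cross regions.

    With no remote distributor, the publisher also acts as the distributor and
    connects directly to every subscriber. With a remote distributor, the
    publisher connects once to the distributor, which then connects to the
    subscribers.
    """
    if distributor_region is None:
        distributor_region = publisher_region

    calls = 0
    if distributor_region != publisher_region:
        calls += 1  # publisher -> remote distributor
    # distributor -> each subscriber outside the distributor's own region
    calls += sum(1 for region in subscriber_regions if region != distributor_region)
    return calls


# London, Frankfurt, and Rome are all treated as one "Europe" region here.
europe = ["Europe", "Europe", "Europe"]
print(long_distance_calls("North America", europe))            # distributor in New York
print(long_distance_calls("North America", europe, "Europe"))  # distributor in London
```

Under these assumptions the first call returns 3 and the second returns 1, matching the three calls versus one call described above.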
FIGURE 27.5
A server can be dedicated to the task of distribution.
Heterogeneous Replication
Not all replication takes place between SQL Servers. Sometimes you need to have duplicate data on a Sybase, Oracle, Access, or other database server. Heterogeneous replication is the process of replicating data from SQL Server to another type of database system. The only requirement for the subscriber in this case is that it must be Open Database Connectivity (ODBC) compliant. If the target is ODBC compliant, it can be the recipient of a push subscription from SQL Server. If you find that you need to pull a subscription from SQL Server to a third-party database system, you will need to write a custom program that accesses the SQL-DMO (Distributed Management Objects). In this way, you can make a program that will pull a subscription
Then you need a publisher on which to create articles and publications. Finally, you need a subscriber to accept these publications.
The distributor will have a lot of work to do, especially if it is servicing more than one publisher and more than one subscriber, so give it plenty of RAM (about 256MB should do the trick). Also, all of the changes from the publishers are stored in one of two places: For transactional replication, all of the changes are stored in the distribution database; for other types, the changes are stored in the distribution working directory (\\distribution_server\Program Files\Microsoft SQL Server\MSSQL$(instance)\REPLDATA), so make sure you have enough disk space to handle all of the changes that will be flowing through the system.
NOTE The distribution database stores changes and history for transactional replication; for all other types, the distribution database merely stores history, and changes are stored in the distribution working folder.
WARNING Because only administrators have access to the C$ share on any given server, the account used by the SQLServerAgent service needs to be an administrator on the distribution server, or replication will fail.
Once the distributor is ready to go, you can proceed with replication. The first step is to configure the distributor, which we will do now.
NOTE For the exercises in this chapter, you will need to have two instances of SQL Server running. To configure this, please see Appendix B.
1 Open Enterprise Manager by selecting it from the SQL Server 2000 program
group under Programs on the Start menu
2 Select your default instance of SQL Server and then on the Tools menu, point to Replication and click Configure Publishing, Subscribers and Distribution. This starts the Configure Publishing and Distribution Wizard. Click Next.
3 On the second screen, you are asked to select a distributor; this is where the distribution database and distribution working folder reside. You will work with the local server, so check the radio button labeled Make ‘Server’ Its Own Distributor and click Next.
4 The next screen asks whether you would like to customize the distribution server properties or let SQL Server do it for you. If you want to place the distribution database on a different hard disk than the default or you want to place the distribution working folder elsewhere, you should customize the properties. In most cases, customization is recommended, so select the radio button labeled Yes, Let Me Set the Distribution Database Properties and click Next.
5 On the next screen, you need to provide some information about the distribution database: its name, data file location, and transaction log location. It is best to have the data file and transaction log on different physical hard disks for recoverability, but for this exercise, accept all of the defaults and click Next to continue.
6 The next screen allows you to enable publishers. Enabling a publisher means that it is allowed to use this server as a distributor. This keeps unauthorized servers from bogging down your distributor. In this case, select both the primary and SECOND servers (if you do not have a SECOND server, please refer to Appendix B).