CHAPTER 6
Processing, Load Control, and
Internetworking for Continuity
Until recent years, centralized network architectures using mainframe systems were a staple in many IT environments. They provided vast processing power and eventually acquired fault-tolerant capabilities as well. However, as distributed transaction processing requirements have heightened, mainframes were found to lack the versatility to support today's real-time and dynamic software development and processing environment. An Internet-based transaction, for instance, will often require the use of several independent processors situated at different network locations. The need for scalability in implementing high-performance computing has driven consideration of alternatives to centralized mainframe-based network architectures. This chapter reviews technologies and techniques that can be used to optimize survivability and performance within a distributed internetworking environment.
6.1 Clusters

For mission-critical networks, cost-effective ways of ensuring survivability are always an objective. The concept of clusters is designed with this objective in mind. A cluster is a group of interrelated computers that work together to perform various tasks. The underlying principle behind clusters is that several redundant computers working together as a single resource can do more work than a single computer and can provide greater reliability. Physically, a cluster is comprised of several computing devices that are interconnected to behave as a single system. Other computers in the network typically view and interact with a cluster as if it were a single system. The computing elements that comprise a cluster can be grouped in different ways to distribute load and eliminate single points of failure.
Because multiple devices comprise a cluster, if one device fails in a cluster, another device can take over. The loss of any single device, or cluster node, does not cause the loss of data or application availability [1]. To achieve this capability, resources such as data and applications must either be replicated or pooled among the nodes so that any node can perform the functions of another if it fails. Furthermore, the transition from one node to another must be such that data loss and application disruption are minimized.
Beyond reliability, clustering solutions can be used to improve processing or balance workload so that processing bottlenecks are avoided. If high-performance processing is required, a job can be divided into many tasks and spread among the cluster nodes. If a processor or server is in overload, fails, or is taken off line for
maintenance, other nodes in the cluster can provide relief. In these situations, clusters require that nodes have access to each other's data for consistency. Advances in storage technology have made sharing data among different systems easier to achieve (refer to the chapter on storage).
Clustering becomes more attractive for large, distributed applications or systems. Clusters can improve scalability because workload is spread among several machines. Individual nodes can be upgraded, or new nodes can be added to increase central processor unit (CPU) or memory capacity to meet performance growth and response time requirements. This scalability also makes it more cost effective to provide the extra computing capacity needed to guard against the unpredictable nature of today's data traffic.
Cluster connectivity can be achieved in numerous ways. Connecting servers over a network supporting transmission control protocol/Internet protocol (TCP/IP) is a very common approach. Another approach is to connect computer processors over a high-speed backplane. They can be connected in various topologies, including star, ring, or loop. Invariably, in each approach nodes are given primary tasks and assigned secondary nodes to automatically assume processing of those tasks upon failure of the primary node. The secondary node can be given tasks to do so that it is kept useful during normal operation, or kept idle as a standby. A reciprocating arrangement can be made as well between the nodes so that each does the same tasks. Such arrangements can be achieved at several levels, including hardware, operating system (OS), or application levels.
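The primary/secondary arrangement just described can be sketched in a few lines. This is a hypothetical illustration, not any vendor's cluster API: each task has a primary node and a designated secondary that assumes the work when the primary fails.

```python
class MiniCluster:
    def __init__(self, assignments):
        # assignments: task -> (primary node, secondary node)
        self.assignments = dict(assignments)
        self.failed = set()

    def owner(self, task):
        """Return the node currently responsible for a task."""
        primary, secondary = self.assignments[task]
        if primary not in self.failed:
            return primary
        if secondary not in self.failed:
            return secondary  # secondary automatically assumes the task
        raise RuntimeError(f"no surviving node for task {task!r}")

    def fail(self, node):
        self.failed.add(node)

# A reciprocating arrangement: each node is the other's secondary.
cluster = MiniCluster({"billing": ("A", "B"), "mail": ("B", "A")})
assert cluster.owner("billing") == "A"
cluster.fail("A")
assert cluster.owner("billing") == "B"  # B takes over upon A's failure
```

In a real cluster the failure set would be driven by heartbeat monitoring rather than an explicit call, but the ownership logic is the same idea.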
Clusters require special software that can make several different computers behave as one system. Cluster software is typically organized in a hierarchical fashion to provide local or global operational governance over the cluster. Software sophistication has grown to the point where it can manage a cluster's systems, storage, and communication components. An example is IBM's Parallel Sysplex technology, which is intended to provide greater availability [2, 3]. Parallel Sysplex is a technology that connects several processors over a long distance (40 km) using a special coupling facility that enables them to communicate and share data [4].
In situations involving large volumes of users, super clusters can be constructed: a super cluster is a static cluster comprised of dynamic clusters. These types of configurations are illustrated in Figure 6.1.
Each of these cluster types can be constructed in several ways using different technologies. The following list contains some of the most widely used technology approaches, illustrated in Figure 6.2:
[Figure content: a cluster controller distributing Tasks A–M and Tasks N–Z, with server B continuing all tasks (A–Z) upon server A's failure; dynamic clusters, with cluster B continuing all tasks upon cluster A's failure; multiprocessor clusters (SMP and MPP) with CPUs, shared memory, and a backplane; fault-tolerant systems; and server clusters with CPU-assigned tasks and shared storage.]

Figure 6.2 Examples of cluster technologies.
• Multiprocessor clusters are multiple CPUs internal to a single system that can be grouped or "clumped" together for better performance and availability. Standalone systems having this type of feature are referred to as multiprocessor or scale-up systems [6, 7]. Typically, nodes perform parallel processing and can exchange information with each other through shared memory, messaging, or storage input/output (I/O). Nodes are connected through a system area network that is typically a high-speed backplane. They often use special OSs, database management systems (DBMSs), and management software for operation. Consequently, these systems are commonly more expensive to operate and are employed for high-performance purposes.
There are two basic types of multiprocessor clusters. In symmetric multiprocessing (SMP) clusters, each node performs a different task at the same time. SMPs are best used for applications with complex information processing needs [8]. For applications requiring numerous repetitions of the same or similar operations, such as data warehousing, massively parallel processing (MPP) systems may be a better alternative. MPPs typically use off-the-shelf CPUs, each with their own memory and sometimes their own storage. This modularity allows MPPs to be more scalable than SMPs, whose growth can be limited by memory architecture. MPP growth is nearly limitless, though it typically runs into networking capacity limitations. MPPs can also be constructed from clusters of SMP systems.
• Fault-tolerant systems are a somewhat simplified hardware version of multiprocessor clusters. Fault-tolerant systems typically use two or more redundant processors and rely heavily on software to enhance performance or manage any system faults or failures. The software is often complex, and the OS and applications are custom designed to the hardware platform. These systems are often found in telecom and plant operations, where high reliability and availability are necessary. Such systems can self-correct software process failures, or automatically failover to another processor if a hardware or software failure is catastrophic. Usually, alarms are generated to alert personnel for assistance or repair, depending on the failure. In general, these systems are often expensive, requiring significant upfront capital costs, and are less scalable than multiprocessor systems. Fault-tolerant platform technology is discussed in more depth in a later chapter of this book.
• Server clusters are a low-cost and low-risk approach to providing performance and reliability [9]. Unlike a single, expensive multiprocessor or fault-tolerant system, these clusters are comprised of two or more less expensive servers that are joined together using conventional network technology. Nodes (servers) can be added to the network as needed, providing the best scalability. Large server clusters typically operate using a shared-nothing strategy, whereby each node processor has its own exclusive storage, memory, and OS. This avoids memory and I/O bottlenecks that are sometimes encountered using shared strategies. However, shared-nothing strategies must rely on some form of mirroring or networked storage to establish a consistent view of transaction data upon failure.
The following are some broad classes of cluster services that are worth noting. Each can be realized using combinations or variations of the cluster configurations and technologies just discussed. Each successive class builds on the previous with regard to capabilities:
• Administrative clusters are designed to aid in administering and managing nodes running different applications, not necessarily in unison. Some go a step further by integrating different software packages across different nodes.
• High-availability clusters provide failover capabilities. Each node operates as a single server, each with its own OS and applications. Each node has another node that is a replicate image, so that if it fails, the replicate can take over. Depending on the level of workload and desired availability, several failover policies can be used. Hot and cold standby configurations can be used to ensure that a replicate node is always available to adequately assume another node's workload. Cold standby nodes require extra failover time to initialize, while hot standby nodes can assume processing with little, if any, delay. In cases where each node is processing a different application, failover can be directed to the node that is least busy.
• High-performance clusters are designed to provide extra processing power and high availability [10]. They are used quite often in high-volume and high-reliability processing, such as telecommunications or scientific applications. In such clusters, application workload is spread among the multiple nodes, either uniformly or by specific task. They are sometimes referred to as parallel application or load balancing clusters. For this reason, they are often found to be the most reliable and scalable configurations.
A prerequisite for high-availability or high-performance clusters is access to the same data, so that transactions are not lost during failover. This can be achieved through many of the types of storage techniques that are described in the chapter on storage. Use of mirrored disks, redundant array of independent disks (RAID), or networked storage not only enables efficient data sharing, but also eliminates single points of failure. Dynamic load balancing is also used to redistribute workload among the remaining nodes if a node fails or becomes isolated. Load balancing is discussed further in this chapter.
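The high-performance idea of dividing a job into tasks spread among nodes can be sketched on a single machine. This is only an analogue, with a worker pool standing in for separate cluster nodes:

```python
from concurrent.futures import ThreadPoolExecutor

def subtotal(chunk):
    # One "node's" share of the overall job
    return sum(chunk)

def clustered_sum(data, workers=4):
    # Divide the job into one task per worker and run the tasks concurrently,
    # then combine the partial results.
    chunks = [data[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(subtotal, chunks))

assert clustered_sum(list(range(1000))) == sum(range(1000))
```

In an actual high-performance cluster, the dispatch and result collection would travel over the interconnect (TCP/IP or a backplane), and the shared-data prerequisite above is what lets any node pick up a chunk.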
6.1.2 Cluster Resources
Each node in a cluster is viewed as an individual system with a single image. Clusters typically retain a list of member nodes among which resources are allocated. Nodes can take on several possible roles, including the primary, secondary, or replicate roles that were discussed earlier. Several clusters can operate in a given environment if needed, where nodes are pooled into different clusters. In this case, nodes are kept aware of nodes and resources within their own cluster and within other clusters as well [11].
Many cluster frameworks use an object-oriented approach to operate clusters. Objects can be defined that are comprised of physical or logical entities called resources. A resource provides certain functions for client nodes or other resources. They can reside on a single node or multiple nodes. Resources can also be grouped together in classes so that all resources in a given class can respond similarly upon a
failure. Resource groups can be assigned to individual nodes. Recovery configurations, sometimes referred to as recovery domains, can be specified to arrange objects in a certain way in response to certain situations. For example, if a node fails, a domain can specify to which node a resource's or a resource group's work should be transferred.
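A recovery domain of the kind just described can be pictured as a mapping from resource groups to an ordered list of failover targets. The group and node names below are hypothetical:

```python
recovery_domain = {
    # resource group: (home node, ordered failover targets)
    "web":      ("node1", ["node2", "node3"]),
    "database": ("node2", ["node3", "node1"]),
}

def failover_target(group, failed_nodes):
    """Return the node that should host a group given the set of failed nodes."""
    home, targets = recovery_domain[group]
    if home not in failed_nodes:
        return home
    for candidate in targets:
        if candidate not in failed_nodes:
            return candidate
    raise RuntimeError("recovery domain exhausted")

assert failover_target("web", set()) == "node1"
assert failover_target("web", {"node1"}) == "node2"
assert failover_target("web", {"node1", "node2"}) == "node3"
```

Real cluster frameworks express this declaratively per object class, but the lookup a domain performs at failure time amounts to this ordered search.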
6.1.3 Cluster Applications
For a node to operate in a cluster, the OS must have a clustering option. Furthermore, many software applications require modifications to take advantage of clustering. Many software vendors will offer special versions of their software that are cluster aware, meaning that they are specifically designed to be managed by cluster software and operate reliably on more than one node. Cluster applications are usually those that have been modified to failover through the use of scripts. These scripts are preconfigured procedures that identify backup application servers and convey how they should be used for different types of faults. Scripts also specify the transfer of network addresses and ownership of storage resources. Because failover times between 30 seconds and 5 minutes are often quoted, it is not uncommon to restart an application on a node for certain types of faults, versus failing over to another processor and risking transaction loss.
High-volume transaction applications, such as database or data warehousing and Web hosting, are becoming cluster aware. Clusters enable the scaling that is often required to reallocate application resources depending on traffic intensity. They have also found use in mail services, whereby one node synchronizes account access utilization by the other nodes in the cluster.
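A failover script of the kind described is essentially a preconfigured plan plus a fixed sequence of steps. The sketch below is schematic: the plan entries and step descriptions are invented for illustration, and the steps are logged rather than executed against a real system.

```python
# Hypothetical failover plan: which backup takes over, which virtual IP
# and storage volume must move with the application.
FAILOVER_PLAN = {
    "app_db": {"backup": "node2", "vip": "10.0.0.50", "volume": "/vol/db"},
}

def run_failover(app, log):
    """Execute (here: record) the preconfigured failover steps for an app."""
    plan = FAILOVER_PLAN[app]
    log.append(f"move VIP {plan['vip']} to {plan['backup']}")
    log.append(f"take ownership of {plan['volume']} on {plan['backup']}")
    log.append(f"restart {app} on {plan['backup']}")
    return plan["backup"]

log = []
assert run_failover("app_db", log) == "node2"
assert len(log) == 3  # address transfer, storage ownership, restart
```

The ordering matters in practice: the network address and storage ownership must move before the application restarts on the backup, or clients reconnect to a server that cannot see their data.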
6.1.4 Cluster Design Criteria
Cluster solutions vary radically among vendors. When evaluating a clustered solution, the following design criteria should be applied:
• Operating systems. This entails which OSs can be used in conjunction with the cluster and whether different versions of the OS can operate on different nodes. This is critical because an OS upgrade may entail having different versions of an OS running in the cluster at a given moment.
• Applications. The previous discussion highlighted the importance of cluster-aware applications. In the case of custom applications, an understanding of what modifications are required needs to be developed.
• Failover. This entails to what extent failover is automated and how resources are dynamically reallocated. Expected failover duration and user transparency to failovers need to be understood. Furthermore, expected performance and response following a failover should be known.
• Nodes. A number of nodes should be specified that minimizes the impact of a single node outage. An N + 1 approach is often a prudent one, but can result in the higher cost of an extra, underutilized cluster node. A single system image (SSI) approach to clustering allows the cluster nodes to appear and behave as a single system, regardless of the quantity [12].
• Storage. Cluster nodes are required to share data. Numerous storage options and architectures are available, many of which are discussed in the chapter on storage. Networked storage is fast becoming a popular solution for nodes to share data through a common mechanism.
• Networking. Cluster nodes must communicate with each other and with other nodes external to the cluster. Separate dedicated links are often used for the nodes to transmit heartbeat messages to each other [13].
6.1.5 Cluster Failover
Clusters are designed such that multiple nodes can fail without bringing down the entire cluster. Failover is a process that occurs when a logical or physical cluster component fails. Clusters can detect when a failure occurs or is about to occur. Location and isolation mechanisms typically can identify the fault. Failover is not necessarily immediate, because a sequence of events must be executed to transfer workload to other nodes in the cluster. (Manual failover is often done to permit system upgrades, software installation, and hardware maintenance with data and applications still available on another node.) To transfer load, the resources that were hosted on the failed node must transfer to another node in the cluster. Ideally, the transfer should go unnoticed by users.
During failover, an off-line recovery process is undertaken to restore the failed node back into operation. Depending on the type of failure, it can be complex. The process might involve performing additional diagnostics, restarting an application, replacing the entire node, or even manually repairing a failed component within the node. Once the failed node becomes active again, a process called failback moves the resources and workload back to the recovered node.
There are several types of cluster failover, including:
• Cold failover. This is when a cluster node fails, another idle node is notified, and applications and databases are started on that node. This is typically viewed as a slow approach and can result in service interruption or transaction loss. Furthermore, the standby nodes are not fully utilized, making this a more expensive approach.
• Warm failover. This is when a node fails and the other node is already operational, but operations must still be transferred to that node.
• Hot failover. This is when a node fails and the other node is prepared to serve as the production node. The other node is already operational with application processing and access to the same data as the failed node. Often, the secondary node is also a production server and can mirror the failed server.
Several activities occur to implement a complete failover process. The following is a general description of the types of events that take place. This process will vary widely by the type of cluster, cluster vendor, applications, and OS involved:
• Detection. Detection is the ability to recognize a failure. A failure that goes undetected for a period of time could result in a severe outage. A sound detection mechanism should have wide fault coverage so that faults can be detected and isolated, either within a node or among nodes, as early as possible. The ability of a system to detect all possible failures is measured by its fault coverage. Failover management applications use a heartbeat process to recognize a failure. Monitoring is achieved by sending heartbeat messages to a special monitoring application residing on another cluster node or an external system. Failure to detect consecutive heartbeats results in declaration of a failure and initiation of a failover process. Heartbeat monitoring should not only test for node failure but should also test for internode communication. In addition to the network connectivity used to communicate with users, typically Ethernet, some clusters require a separate heartbeat interconnect to communicate with other nodes.
• Networking. A failover process typically requires that most or all activity be moved from the failed node to another node. Transactions entering and leaving the cluster must then be redirected to the secondary node. This may require the secondary node to assume the IP address and other relevant information in order to immediately connect users to the application and data, without reassigning server names and locations in the user hosts. If a clustering solution supports IP failover, it will automatically switch users to the new node; otherwise, the IP address needs to be reallocated to the backup system. IP failover in many systems requires that both the primary and backup nodes be on the same TCP/IP subnet. However, even with IP failover, some active transactions or sessions at the failed node might time out, requiring users to reinitiate requests.
• Data. Cluster failover assumes that the failed node's data is accessible by the backup node. This requires that data between the nodes is shared, reconstructed, or transferred to the backup node. As in the case of heartbeat monitoring, a dedicated shared disk interconnect is used to facilitate this activity. This interconnect can take on many forms, including shared disk or disk array and even networked storage (see Section 6.1.7). Each cluster node will most likely have its own private disk system as well. In either case, nodes should be provided access to the same data, but not necessarily share that data at any single point in time. Preloading certain data in the cache of the backup nodes can help speed the failover process.
• Application. Cluster-aware applications are usually the beneficiary of a failover process. These applications can be restarted on a backup node. These applications are designed so that any cluster node can resume processing upon direction of the cluster-management software. Depending on the application's state at the time of failure, users may need to reconnect or may encounter a delay between operations. Depending on the type of cluster configuration in use, performance degradation in data access or application access might be encountered.
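The detection step and the events it triggers can be compressed into a short sketch: a monitor declares a node failed after a number of consecutive missed heartbeats, then initiates the networking, data, and application steps. The threshold and step names are illustrative, not from any particular product.

```python
MISSED_LIMIT = 3  # consecutive missed heartbeats before declaring failure

def monitor(heartbeats):
    """heartbeats: sequence of True (received) / False (missed) observations.
    Returns the failover event sequence if failure is declared, else []."""
    missed = 0
    for beat in heartbeats:
        missed = 0 if beat else missed + 1
        if missed >= MISSED_LIMIT:
            return ["redirect IP to backup node",
                    "attach shared storage on backup node",
                    "restart application on backup node"]
    return []  # node considered healthy

# Isolated misses reset the counter and do not trigger failover:
assert monitor([True, False, True, False, False]) == []
# Three consecutive misses declare the failure and start the sequence:
assert monitor([True, False, False, False])[0] == "redirect IP to backup node"
```

Requiring several consecutive misses is the usual compromise between fast detection and false declarations caused by transient congestion on the heartbeat link.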
6.1.6 Cluster Management
Although clusters can improve availability, managing and administering a cluster can be more complex than managing a single system. Cluster vendors have addressed this issue by enabling managers to administer the entire cluster as a single system rather than several systems. However, management complexity still persists in several areas:
• Node removal. Clustering often allows deactivating a node or changing a node's components without affecting application processing. In heavy load situations, and depending on the type of cluster configuration, removal of a cluster node could overload those nodes that assume the removed node's application processing. The main reason for this is that there are fewer nodes and resources to sustain the same level of service as before the removal. Furthermore, many users attempt to reconnect at the same time, overwhelming a node. Mechanisms are required to ensure that only the most critical applications and users are served following the removal. Some cluster solutions provide the ability to preconnect users to the backup by creating all of the needed memory structures beforehand.
• Node addition. In most cases, nodes are added to an operational cluster to restore a failed node to service. When the returned node is operational, it must be able to rejoin the cluster without disrupting service or requiring the cluster to be momentarily taken out of operation.
• OS migration. OS and cluster software upgrades will be required over time. If a cluster permits multiple versions of the same OS and cluster software to run on different nodes, then upgrades can be made to the cluster one node at a time. This is often referred to as a rolling upgrade. This capability minimizes service disruption during the upgrade process.
• Application portability. Porting cluster applications from one node to another is often done to protect against failures. Critical applications are often spread among several nodes to remove single points of failure.
• Monitoring. Real-time monitoring usually requires polling, data collection, and measurement features to keep track of conditions and changes across nodes. Each node should maintain status on other nodes in the cluster and should be accessible from any node. By doing so, the cluster can readily reconfigure in response to changes in load. Many cluster-management frameworks enable the administration of nodes, networks, interfaces, and resources as objects. Data collection and measurement are done on an object basis to characterize status. Management is performed by manipulation and modification of the objects.
• Load balancing. In many situations, particularly clustered Web servers, traffic must be distributed among nodes in some fashion to sustain access and performance. Load balancing techniques are quite popular with clusters and are discussed further in this chapter.

6.1.7 Cluster Data
Data access can be a limiting factor in cluster implementation. Limited storage capacity as well as interconnect and I/O bottlenecks are often blamed for performance and operational issues. The most successful cluster solutions are those that combine cluster-aware databases with high-availability platforms and networked storage solutions.
Shared disk cluster approaches offer the greatest flexibility because any node can access any block of data. However, only one node can write to a block of data at any given time. Distributed locking management is required to control disk writes and eliminate contention for cached data blocks across nodes. Distributed locking, however, can negatively impact I/O performance. Partitioned cluster databases require that transactions be balanced across cluster nodes so that one node is not overloaded. Balancing software can be used to direct I/O queries to the appropriate server as well as realign partitions between nodes.
In shared-nothing data approaches, each cluster node has exclusive access to a static, logical segment of data. This eliminates the need for locking and cache-contention mechanisms, which consume performance. This is why shared-nothing is often preferred in large data warehouses and high-volume transaction applications. On the other hand, shared nothing requires reallocation of data and new partitions if nodes are added to or removed from the cluster.
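The repartitioning cost just mentioned is easy to demonstrate with the simplest static placement scheme, hash-modulo partitioning. This is one illustrative scheme, not what any particular cluster database uses:

```python
import hashlib

def owner(key, nodes):
    """Map a key to the node that exclusively owns its partition."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

nodes = ["n1", "n2", "n3"]
keys = [f"customer-{i}" for i in range(1000)]

before = {k: owner(k, nodes) for k in keys}
after = {k: owner(k, nodes + ["n4"]) for k in keys}  # one node is added

moved = sum(1 for k in keys if before[k] != after[k])
assert moved > 0  # a large share of keys change owners and must be moved
```

With modulo placement, adding a node relocates roughly three-quarters of the keys here; schemes such as consistent hashing reduce (but do not eliminate) this movement, which is exactly the shared-nothing trade-off the text describes.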
Cluster database solutions must ensure that all committed database updates made prior to a failure are applied (referred to as a roll forward) and that all uncommitted updates are undone during a recovery (referred to as a roll back). The roll-forward process is less intensive with greater snapshot frequency of the database.
6.1.8 Wide Area Clusters
Wide area clusters are desirable in enterprises for greater diversity and manageability (see Figure 6.3). These clusters are designed with simultaneous application processing on all cluster nodes at all sites. Applications are operated without regard to the physical location of the platform. If an outage occurs at one site, operations continue at the other site. Concurrent access to the same data image from all sites is achieved through a variety of techniques, including mirroring and networked storage. Coordination is required to manage the data access from each site [14]. All sites are interconnected via a wide area network (WAN). As in a collocated cluster, mechanisms are required to detect failures at the remote site and fail over to the surviving site. This requires synchronized use of cluster software management, network routing, load balancing, and storage technologies.
Achieving invisible failovers using an integration of multivendor components can be quite challenging. However, vendor-specific solutions are available to
[Figure content: a wide area cluster spanning Location A and Location B, connected over a WAN with networked storage at each site.]

Figure 6.3 Wide area cluster example.
achieve high availability in wide area clusters. An example is IBM's MultiSite and Geographically Dispersed Parallel Sysplex (GDPS) clustering technology. It involves connecting S/390 systems via channel-attached fiber cabling and a coupling facility. This enables controlled switching from one system to another in the event of an unplanned or planned service interruption [15]. A coupling facility is an external system that maintains nonvolatile shared memory among all processors, enabling them to share data and balance the workload among applications.
A Parallel Sysplex cluster can be separated by up to 40 km, where each site is configured with redundant hardware, software, connections, and mirrored data. Standard or user-defined site switches can be executed. If a system fails, it is automatically removed from the cluster and restarted. If a CPU fails, its workload is initiated on another. Mission-critical production and expendable workloads can be configured among sites depending on organizational need. For networking, there are several options. Previously, a technique called data link switching (DLSw), a means of tunneling SNA traffic over an IP network, was used for recovery from network failures. Recently, enterprise extender functions have been introduced that convert SNA network transport to IP. The systems can use a virtual IP address (VIPA) to represent the cluster to outside users.
6.2 Load Balancing

Load balancing is a class of techniques used to direct queries to different systems for a variety of reasons, but fundamentally to distribute a workload across some available pool of resources. Load balancing is best used in situations involving large volumes of short-lived transactions or in networks with a large number of users accessing a small quantity of relatively static information. This is why it has found popular use in front of Web servers, clusters, and application server farms. It is also used to direct and balance frequent transaction requests among applications involving data or content that is not easily cached.
In the case of Web sites, load balancing can be used to ensure that traffic volume will not overwhelm individual Web servers or even individual server farms. It permits distributing load to another site, creating redundancy while sustaining performance. Load balancing is effective on sites where no transactions are involved and when most of the site hits access a small number of pages without the use of hyperlinks to other servers. Data and content between load-balanced sites can be partitioned, mirrored, or overlap in some manner so that each site processes the same or different portions of the transactions, depending on the nature of the application.
Load balancers are devices that distribute traffic using a number of different methods. They provide numerous benefits: they can alleviate server system bottlenecks by redirecting traffic to other systems; they provide scalability to add capacity incrementally over time and utilize a mix of different systems; they offer an approach to preserve investment in legacy systems and avoid upfront capital expenditures; they provide the ability to leverage redundant systems for greater availability and throughput; they obviate the need for bandwidth and system memory upgrades to resolve performance bottlenecks; and they can be used to improve operations management by keeping processes running during routine maintenance.
There are several ways to classify load balancers. Two classifications are illustrated in Figure 6.4. One class consists of network load balancers, which distribute network traffic, most commonly TCP/IP traffic, across multiple ports or host connections using a set of predefined rules. Network load balancers originated as domain name servers (DNSs) that distributed hypertext transfer protocol (HTTP) sessions across several IP hosts. They used basic pinging to determine whether destination hosts were still active in order to receive queries. Later, this capability was expanded to measure destination server performance prior to forwarding additional requests, to avoid overwhelming that host. With the advent of e-commerce, load balancers were further enhanced with capabilities to monitor both front-end and back-end servers. They direct traffic to back-end servers based on requests from front-end servers and use a process called delayed binding, which maintains the session with that server until data or content is received from it prior to making a decision.
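The early network load balancer described above amounts to round-robin selection with a health check folded in. In this sketch, a health map stands in for the ping-style probes; addresses are illustrative:

```python
from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, hosts, healthy):
        self.ring = cycle(hosts)
        self.healthy = healthy  # host -> bool, updated by health probes
        self.size = len(hosts)

    def pick(self):
        """Return the next destination host, skipping unhealthy ones."""
        for _ in range(self.size):
            host = next(self.ring)
            if self.healthy.get(host, False):
                return host
        raise RuntimeError("no healthy hosts available")

health = {"10.0.0.1": True, "10.0.0.2": False, "10.0.0.3": True}
lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"], health)
picks = [lb.pick() for _ in range(4)]
assert "10.0.0.2" not in picks  # the host failing its health check is skipped
```

The later refinements the text mentions (performance measurement, delayed binding) replace the boolean health map with richer per-host state, but keep this same selection loop at their core.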
Another class consists of component load balancers, which distribute requests to applications running across a cluster or server farm. Component load balancers are standalone systems, typically situated between a router and an internal server farm, that distribute incoming traffic among the servers based on predefined rules. Rules can involve anything from routing based on application, server response time, delay, time of day, number of active sessions, and other metrics. Some rules might require software agents to be installed on the servers to collect the information needed to implement a rule. Load balancers deployed in front of server farms or clusters can use a single VIPA for the entire site, making the site appear as a single system to the outside world. Because these systems can be a single point of failure between a server farm and the external network, failover capabilities are required, including use of fault-tolerant platforms.
Load balancers can be used in this fashion with respect to clusters. Cluster nodes are usually grouped according to application, with the same applications running within a cluster. Load balancers can use predefined rules to send requests to each node for optimal operation; an example is sending requests to the least-used node. Load balancing can also be used in innovative ways. For example, load balancers can be used in conjunction with wide area clusters to direct requests to the cluster closest to the user.
Not only does load balancing help control cluster performance, it also enables the addition or removal of cluster nodes by redirecting traffic away from the affected node. Furthermore, load balancers can also be used to manage cluster storage interconnects to reduce I/O bottlenecks.

6.2.1 Redirection Methods
Load balancer devices inspect packets as they are received and switch the packets based on predefined rules. The rules can range from an administratively defined policy to a computational algorithm. Many of the rules require real-time information regarding the state of a destination server, and several methods are used to obtain such information. Balancers that are internal to an enterprise often make use of existing system-management tools that monitor applications and platform status through application programming interfaces (APIs) and standard protocols. Some balancers require direct server measurements through the use of software agents that are installed on the server. These agents collect information regarding the server’s health and forward this information to the load balancer. The agents are usually designed by the load balancer vendor and are often used on public Web sites. The information that agents collect can be quite detailed, usually more than what would be obtained in a PING request [16]. For this reason, they can consume some of the server’s CPU time.
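The difference between a bare PING and a direct server measurement can be sketched in a few lines. The following is a simplified, hypothetical illustration (the probe target, port, and timeout are illustrative, not from the text) of an active check that reports both reachability and response latency, the sort of detail an agent might supply beyond a PING:

```python
# Hypothetical sketch of an active health check that is more informative
# than a PING: it opens a TCP connection and times a minimal HTTP request.
import socket
import time

def check_server(host: str, port: int = 80, timeout: float = 2.0) -> dict:
    """Probe a server and report reachability plus response latency."""
    result = {"host": host, "reachable": False, "latency_ms": None}
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # A minimal HTTP request; a real agent would also parse the
            # status line and gather application-level metrics.
            sock.sendall(b"HEAD / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
            sock.recv(256)  # wait for the first response bytes
        result["reachable"] = True
        result["latency_ms"] = (time.monotonic() - start) * 1000
    except OSError:
        pass  # connection refused or timed out: count the host as down
    return result
```

A balancer polling such a function learns not only whether the host answers, but how quickly its application layer responds, which a PING alone cannot reveal.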
Many Web sites have a two-tier architecture (Figure 6.5), where a first tier of content-bearing Web servers sits in front of a back-end, or second, tier of servers. These second-tier servers are not directly load balanced, and sometimes a first-tier server can mask a second-tier server that is down or overloaded. In this case, sending requests to the first-tier server would be ineffective, as the second-tier server would be unable to fulfill the request. To handle these situations, some load balancers can logically associate, or bind, the back-end servers to the first-tier server and query them for status information via the first-tier server.
Figure 6.5 Load balancing with a multitier Web site (back-end servers A and B are logically bound to the first tier to obtain status).

One would think that the purpose of using a load balancer is to direct more CPU-intensive traffic to the servers that can handle the load. This is true, but there are occasions where other rules may be of more interest [17]. Rules can be classified as either static or dynamic: static rules are predefined beforehand and do not change over time, while dynamic rules can change over time. Some rule examples include:
• Distributed balancing splits traffic among a set of destinations based on predefined proportions or issues them in a predefined sequence. For example, some balancers perform round-robin balancing, whereby an equal number of requests are issued in sequential order to the destination servers. This routing works fairly well when Web content is fairly static.

• User balancing directs traffic based on attributes of the user who originated the request. Such balancers can examine incoming packets and make decisions based on who the user is. One example is directing requests to a server based on user proximity. Another example is providing preferential treatment to a Web site’s best customers.

• Weight or ratio balancing directs traffic based on predefined weights that are assigned to the destination servers. The weights can be indicative of some transaction-processing attribute of the destination server. For example, the weight can be used to bias more traffic to servers having faster CPUs.

• Availability balancing checks to see if destination servers are still alive, to avoid forwarding requests that would result in error messages.

• Impairment balancing identifies when all or some facilities of a server are down and steers traffic away from that server. Although one can connect to a server, a partial failure can render a server or application useless. Furthermore, a server under overload would also be ineffective.

• Quality of service (QoS) balancing measures roundtrip latency/delay between the destination and a user’s DNS server to characterize network transport conditions.

• Health balancing monitors the workload of a destination server and directs traffic to servers that are least busy [18]. Different measurements and approaches are used to characterize this state. For example, some measurements include the number of active TCP connections and query response time. Standalone measures can be used, or several measures can be combined to calculate an index that is indicative of the server’s health.

• Content-aware balancing directs requests based on the type of application (e.g., streaming audio/video, static page, or cookie) and can maintain the connection state with the destination server. Traffic can be redirected based on back-end content as well, using delayed binding, whereby the load balancer makes the redirection decision only after it receives content from the Web server. For streaming video applications, balancers will direct all requests from a user to the video server for the entire session. Content-aware balancing can provide greater flexibility in handling content across Web servers and enables placing different content on different machines.
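Several of the rules above reduce to simple selection functions over a table of per-server metrics. The following sketch is illustrative only (server names, weights, and connection counts are hypothetical) and shows distributed (round-robin), weight, and health balancing as code:

```python
# Hypothetical sketch of three redirection rules from the list above:
# distributed (round-robin), weight/ratio, and health (least-busy) balancing.
import itertools
import random

servers = {
    # server name -> per-server metrics; values are purely illustrative
    "srv-a": {"weight": 3, "active_conns": 12},
    "srv-b": {"weight": 1, "active_conns": 4},
    "srv-c": {"weight": 2, "active_conns": 9},
}

# Distributed balancing: issue requests in a fixed repeating sequence.
round_robin = itertools.cycle(servers)

def weighted_choice(pool: dict) -> str:
    """Weight balancing: bias traffic toward servers with larger weights."""
    names = list(pool)
    weights = [pool[n]["weight"] for n in names]
    return random.choices(names, weights=weights, k=1)[0]

def least_busy(pool: dict) -> str:
    """Health balancing: direct traffic to the least-busy server."""
    return min(pool, key=lambda n: pool[n]["active_conns"])
```

A production balancer would refresh the metrics continuously (via agents or probes, as discussed earlier) rather than reading a static table, but the selection logic itself is often this simple.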
There are two basic forms of redirection: local and global. Each can use special redirection rules that include some of the aforementioned:

• Local load balancing. Local load balancing, or redirection, involves using a load-balancer device that is local to a set of users, servers, or clusters. Component or network load balancers can be used. Local balancing is used to route requests from a central location across a group of servers or hosts that typically sit on the same local area network (LAN), subnet, or some other type of internal network, and is typically used to route requests across systems residing within a data center. Local load balancing can also be used to distribute requests originating from internal network hosts across firewalls, proxy servers, or other devices. It is often seen as a way to manage traffic across a server complex, such as those that host a Web site.
• Global load balancing. Global load balancing involves directing traffic to a variety of replicated sites across a network, as in the case of the Internet [19]. Redirection decisions are made centrally by devices that reside outside a network; these intercept requests for content prior to reaching a firewall (Figure 6.6) and direct those requests to an appropriate location. It works best when caches are distributed throughout a network but each destination does not have a dedicated cache.

Global redirection has become an integral part of mission-critical networking solutions for data centers and Web-based applications [20]. It enables organizations to distribute load across multiple production sites and redirect traffic appropriately following the outage or overload of a particular site.

Global load balancers come in different forms and can be configured in multiple ways. They can very well be consolidated with local load balancers. They can be configured using DNS or even border gateway protocols (BGPs); some network load balancers might require configuration with contiguous IP addresses. Throughput can vary among products, but it is important that they can scale throughput with the volume of requests.
Figure 6.6 Global load balancing example: (1) the user issues a content request; (2) the request is sent to the load balancer; (3) the load balancer redirects the request to either location A or B.

Global and local load balancing can be used in conjunction with each other to distribute traffic across multiple data centers (see Figure 6.7). Each data center can have a local load balancer identified by a VIPA. The balancers would distribute traffic to other centers using the VIPAs of the remote balancers, as if they were local devices. The DNS requests for the Web site hosted by either data center must be directed to the domain or VIPAs of the load balancers. This requires leveraging DNS and HTTP capabilities to send users to the most efficient and available data center.
information of their locations. Another DNS approach is called triangulation, whereby a user request is directed to multiple proxy sites and the site having the fastest response is used. It is used to provide DNS devices with higher throughput and greater protocol transparency. DNS redirection is mainly used in conjunction with HTTP redirection, whereby HTTP header information is used to redirect traffic. The approach does not work for non-HTTP traffic, such as file transfer protocol (FTP)
Figure 6.7 Combined global and local load balancing example.
traffic. For secure sockets layer (SSL) traffic, a common approach is to have the load balancer proxy the SSL server so that it can maintain a “sticky” connection to an assigned SSL server. This involves SSL requests remaining encrypted until they reach the load balancer. The load balancer retains its own VIPA as an SSL-processing resource. The load balancer then redirects the request to the IP address of the site’s SSL servers. The SSL address inside the user’s cookie information, whose current value is the balancer’s VIPA, is then modified to the address of the SSL server. This forces the balancer to continue redirecting successive transactions to the same SSL server for the duration of the session, even if the user requests different Web pages. The balancer can also implement secure session recovery by trying to reconnect with the SSL server if the session is disrupted, while maintaining the session with the user.
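The sticky behavior described above is essentially a lookup on state carried in the user’s cookie: if the cookie already names an assigned SSL server, reuse it; otherwise balance normally and rewrite the cookie. A simplified sketch, with hypothetical addresses and field names:

```python
# Hypothetical sketch of cookie-based session stickiness: the first request
# is balanced normally, and the chosen back-end address is written into the
# cookie so that later requests in the session return to the same server.
ssl_servers = ["10.0.0.11", "10.0.0.12"]  # illustrative back-end addresses
VIPA = "203.0.113.10"                     # the balancer's own virtual address

def redirect(cookie: dict, pick=lambda: ssl_servers[0]) -> str:
    """Return the back-end address for this request, updating the cookie."""
    if cookie.get("ssl_addr") in ssl_servers:
        return cookie["ssl_addr"]      # sticky: reuse the assigned server
    server = pick()                    # first request: balance normally
    cookie["ssl_addr"] = server        # rewrite the cookie, as described
    return server
```

In the scheme the text describes, the cookie initially carries the balancer’s VIPA and is rewritten to the assigned server’s address, so every later transaction in the session lands on the same SSL server.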
6.2.4 Cookie Redirection
As previously mentioned, user balancing involves making redirection decisions based on user attributes. Because user IP address information can change and IP headers contain minimal user information, higher-layer information is often required. Cookies, data that applications use to gather information from a user, serve this purpose. Cookie-based redirection is designed to make redirection decisions based on technical and/or business objectives. Requests from users that represent good-paying or important customers can be given preferential treatment. Another approach is to redirect users based on their type of access, so that users with slower-speed access can be redirected to sites with faster servers. Some balancers allow cookies to be altered for certain applications, as in the case of SSL. For example, a cookie could be modified with a customer service location to initiate a live audio/video session with an agent. Cookie redirection, however, can negatively impact a load balancer’s performance, depending on how deep into a cookie it must look.

6.2.5 Load Balancer Technologies
Load balancers are implemented using several approaches, which can be categorized into the following basic groups:
• Appliances are devices that are optimized to perform a single function, as opposed to software installed on a general-purpose server. They are often more cost effective to use, have built-in reliability, and are easier to maintain. Load-balancer appliances are usually placed between a router and a switch and are often used for local balancing applications. They provide better price performance than server-based balancing because they rely on distributed processing and application-specific integrated circuits (ASICs). These devices are currently quite popular.

• Software-based balancing is accomplished by software that is resident on a general-purpose server. Because of this, processing can be slower than hardware-based balancing. On the other hand, software-based solutions provide greater flexibility and can be more easily upgraded. This makes it easier to keep up with new software releases, especially those where new agents are introduced. They also offer standard programming interfaces for use by third-party or custom applications. Last, they can simplify network topology, especially if the server is used for other functions. The servers are typically equipped with dual network adapters: one that connects to a router and another that connects to a switch or hub that interconnects with other servers.

• Switch-based balancers are just that: a network switch or router platform that has load-balancing capabilities. Because switches typically sit in locations central to users, servers, and the Internet, they are prime candidates for load balancing. Like appliances, they can use ASICs so that balancing is done at wire speed. For example, ports on a LAN switch that connect to a server farm can be designated for load balancing and treated as one network address. The switch then distributes traffic using rules that are defined through configuration.

• Server switching is a new approach that recreates the application session management and control functions found in mainframe front-end processors but applies these techniques to distributed servers and server farms. The concept delivers three big benefits: it increases individual server efficiency by offloading CPU-intensive chores; it scales application-processing capacity by transparently distributing application traffic; and it ensures high levels of service availability. Server switches achieve application-based redirection by implementing advanced packet-filtering techniques. Filters can be configured based on protocols, IP addresses, or TCP port numbers, and they can be applied dynamically to a switch port to permit, block, or redirect packets. They can also be used to select packets whose headers or content can be replaced with application-specific values. By combining load balancing and filtering within server switches, virtually any IP traffic type can be load balanced. This means administrators can redirect and load balance traffic to multiple firewalls and outbound routers, so standby devices no longer sit idle. Server switches offer this capability by examining incoming packets and determining where to send them based on source IP address, application type, and other parameters. This is why vendors are trying to avoid having the term load balancer applied to their server switch offerings. The issue is not just distributing loads of like traffic across multiple CPUs; it also requires distinguishing and prioritizing various types of traffic and ensuring that each one is supported by resources appropriate to the business value it represents. For example, layer 7 switches look at layers 2 through 7 of the IP packet, recognize cookies, and treat them accordingly. These platforms are application aware, have powerful load-balancing capabilities, and can perform geographic redirection as needed.

• Switch farms are a manual approach to isolating and balancing traffic among workgroups [22]. This involves connecting servers directly to users’ switches to offload backbone traffic. An example is illustrated in Figure 6.8. The network is designed so that traffic is kept local as much as possible. Workgroup application servers are directly connected to the switches that service their users. Core switches support only enterprise services, minimizing backbone traffic and freeing it up for critical traffic. This approach is counter to the concept of server farms, which are usually situated in a central location with the goal of reducing administration. However, server farms can increase backbone traffic and can be a single point of failure. Switch farms do require cable management so that copper limitations are not exceeded.
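The packet-filtering redirection attributed to server switches can be pictured as an ordered rule table matched against header fields, with permit, block, and redirect actions. A hypothetical sketch (the rules and farm names are illustrative, not from the text):

```python
# Hypothetical sketch of server-switch style filtering: ordered rules match
# on header fields (protocol, destination port) and yield a permit, block,
# or redirect action, as described for server switches above.
RULES = [
    # (match predicate, action, optional redirect target) - illustrative
    (lambda p: p["dst_port"] == 443, "redirect", "ssl-farm"),
    (lambda p: p["dst_port"] == 80,  "redirect", "web-farm"),
    (lambda p: p["proto"] == "icmp", "block",    None),
]

def apply_filters(packet: dict):
    """Return (action, target) for the first matching rule; default permit."""
    for match, action, target in RULES:
        if match(packet):
            return action, target
    return "permit", None
```

Because any header field can participate in a predicate, the same mechanism that load balances Web traffic can also spread traffic across multiple firewalls or outbound routers, which is the point made above about standby devices no longer sitting idle.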
6.2.6 Load Balancer Caveats
Despite all of the advantages that load balancing can provide, several caveats have become apparent with their use:

• If not properly managed and configured, load balancers can bring down a site or a system. For example, a mistyped IP address can result in a catastrophic situation. Numerous erroneous requests, say for nonexistent pages or content, can overload a device and bring it to a halt.

• Reliance on single standalone measures of server health can deceive load balancers about a server’s status. For example, although a server’s HTTP daemon may fail, the server can still appear alive and respond to PING requests.

• Load balancing works best when transactions are simple and short and data or content is relatively static and easily replicated across locations. Overly complex transactions can pose management headaches and increase operations costs, offsetting the potential savings gained from the scalability load balancing provides.

• Unless balancers that can handle multitier Web sites are used, load balancing is ineffective against back-end server overload.

• Although load balancers can be used to protect against outages, a load-balancer device itself can be a single point of failure. Thus, they should be implemented either on a high-availability or fault-tolerant platform, or used in conjunction with a mated load balancer for redundancy.

• As of this writing, it is still unclear how load balancing aligns or interworks with QoS mechanisms (discussed further in this book). Although QoS can provide preferential treatment and guarantee service levels for network traffic, it is ineffective if the traffic destination cannot provide the service. The two can be used as complementary techniques: QoS, especially traffic prioritization, can police and shape traffic at ingress points to improve traffic performance and bandwidth utilization, whereas load balancing is generally used to improve transaction rates.

Figure 6.8 Switch farm load balancing example.
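The mated-balancer redundancy recommended above can be sketched as an active/standby pair sharing a VIPA, where a missed heartbeat triggers failover. This is a simplified illustration; the unit names and heartbeat mechanism are hypothetical:

```python
# Hypothetical sketch of a mated load-balancer pair: the standby unit takes
# over the shared VIPA when the active unit stops answering heartbeats,
# removing the balancer itself as a single point of failure.
class BalancerPair:
    def __init__(self, vipa: str):
        self.vipa = vipa        # virtual address shared by both units
        self.active = "lb-1"    # illustrative unit names
        self.standby = "lb-2"

    def heartbeat(self, active_alive: bool) -> str:
        """Fail over to the standby if the active unit misses a heartbeat."""
        if not active_alive:
            self.active, self.standby = self.standby, self.active
        return self.active
```

Because clients address the VIPA rather than either physical unit, a failover of this kind is invisible to them apart from any in-flight connections lost at the moment of the switch.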
Today’s Internet evolved in a deregulated environment in the span of 10 years. Many of us have experienced firsthand the frustration of trying to access a Web site, only to have it take forever to download. The Internet operates using TCP/IP networking, which is connectionless and is designed to slow end systems down as traffic increases. Packet transmission is slowed at the originating end points so that the intermediate nodes and destination hosts can keep up, until buffers are filled and packets are discarded. Yet many large enterprises are now migrating their business-critical processes to this kind of environment. The next sections describe some practices designed to give Web-based applications the ability to withstand the irregularities of the Internet.
6.3.1 Web Site Performance Management
Users typically expect to be able to access a Web site when they want it. They also expect a Web site to be viewed and browsed easily and quickly, regardless of where they are and how they are connecting to the Internet. Unfortunately, these expectations are the basis of the frustrations of using the Web. When such frustrations surface, they are usually directed at a Web site’s owner or Internet service provider (ISP). Web sites are now considered a “window to an enterprise”; thus poor performance, as well as poor content, can tarnish a firm’s image.
Use of the Web for business-to-business 7 × 24 transactions has heightened even further the need for Web sites and their surrounding applications to be available all of the time. Experience has shown that most Web sites under normal conditions can sustain reasonable performance. However, their resiliency to swift, unexpected traffic surges is still lacking. Consumer-oriented sites are typically visited by users who do more browsing than buying.
Business-to-business sites handle more transaction-oriented traffic in addition to buying. The term transaction is often synonymous with higher performance requirements. Transactions often require SSL encryption as well as hypertext markup language (HTML) browser-based traffic. From our earlier discussion, we saw that SSL requires more processing and reliability resources. The back-end network situated behind a Web site is often affected when problems arise. The following are some broad categories of problems that are often experienced:
• Internet service providers. Losing a connection to a site is one of the leading causes of download failures or site abandonment. Access network connectivity typically consumes about half of the time to connect to a Web site