Dissertation
Novel Software-based Fault-Tolerant Schemes
for Large-scale Ethernet Networks
Submitted in partial fulfillment of the requirements for the Ph.D. degree in Information and Communications Engineering
Graduate School, Myongji University Department of Information and Communications Engineering
Hoang-Anh Pham
We hereby recommend that the dissertation by the above candidate for the Ph.D degree in Information and
Communications Engineering be accepted
Chair, Evaluation Committee: Humor Hwang
Member, Evaluation Committee: Yoan Shin
Member, Evaluation Committee: Jin Young Kim
Member, Evaluation Committee: Cheol Woo You
Member, Evaluation Committee: Jong Myung Rhee
December, 2013
Acknowledgement
First and foremost, the author would like to express his gratitude to his advisor, Prof. Jong Myung Rhee, for his great support and valuable advice during the author's stay in Korea. The author was quite fortunate to gain a great deal of experience from his advisor.
Second, the author would like to thank all the professors on the evaluation committee for their valuable comments, which have helped him significantly improve the quality of this dissertation. The author would also like to thank all the professors at the Department of Information and Communications Engineering, Myongji University, who helped him complete this dissertation.
The author greatly appreciates Dr. Se Mog Kim and all colleagues at the Myongji Ubiquitous and Convergence Laboratory (MUCL) for sharing everything. They are always ready to help the author with kindness.
Last but not least, the author would like to dedicate this dissertation to his parents and his sister for their unconditional love and support over a very long time.
Table of Contents
List of Figures
Table of Tables
Abstract
Chapter 1 Introduction
1.1 Fault-tolerance Ethernet (FTE)
1.2 Media Redundancy
1.3 Conventional FTE Approaches
1.4 Dissertation Motivation
1.5 Dissertation Outline
Chapter 2 Redundancy Protocols
2.1 Introduction
2.2 Existing Redundancy Protocols
2.2.1 Spanning Tree Protocol
2.2.2 Rapid/Multiple Spanning Tree Protocol
2.2.3 Media Redundancy Protocol
2.2.4 Parallel Redundancy Protocol
2.2.5 High-Availability Seamless Redundancy Protocol
2.3 Discussions
Chapter 3 SAFE Scheme
3.1 Introduction
3.1.1 Conventional Heartbeat-based Mechanism
3.1.2 Scalability Problem
3.2 Multiple Subnets Architecture
3.2.1 Reduction of Network Traffic Generated by Heartbeat Messages
3.2.2 Network Capability Enhancement to Overcome Multiple Points of Failure
3.2.3 Drawback of Multiple Subnets Architecture
3.2.4 Practical Issue in Subnet Division
3.3 Master Node Election
3.3.1 Election Algorithm
3.3.2 State Transition Diagram
3.3.3 Practical Issue
3.4 Performance Evaluation
3.4.1 Practical Demonstration
3.4.2 Experimental Establishment
3.4.3 Experimental Results
3.4.4 Discussion
Chapter 4 RSAFE Scheme
4.1 Introduction
4.2 Fast Fault-detection Algorithm
4.2.1 Description
4.2.2 Analysis
4.2.3 Discussion
4.3 Performance Evaluation
4.3.1 Experimental Descriptions
4.3.2 Experimental Results
Chapter 5 Summary and Conclusions
References
Abstract (in Korean)
List of Figures
Figure 1.1 A network with network-level redundancy
Figure 1.2 A network with device-level redundancy
Figure 1.3 A network with both device-level and network-level redundancies
Figure 2.1 An example network to illustrate the STP
Figure 2.2 A ring network using the MRP
Figure 2.3 Logical structure of the network using the MRP in non-failure state
Figure 2.4 Logical structure of the network using the MRP in failure state
Figure 2.5 Two independent networks using the PRP in form of a star topology
Figure 2.6 A scenario of two link failures in the network using the PRP
Figure 2.7 A ring network using the HSR
Figure 2.8 A scenario of two link failures in the network using HSR
Figure 3.1 DFTE network model
Figure 3.2 Conventional heartbeat-based algorithm
Figure 3.3 The number of nodes vs the heartbeat interval time
Figure 3.4 Multiple subnets in form of a star topology
Figure 3.5 The reduction ratio vs the number of subnets
Figure 3.6 A scenario of multiple points of failure in a large-scale network
Figure 3.7 An example of link failures between inter-subnet switches
Figure 3.8 Multiple connections between two switches using EtherChannel
Figure 3.9 Two master nodes election algorithm
Figure 3.10 State transition diagram
Figure 3.11 Format of an entry conn[K]
Figure 3.12 Practical architecture of the SAFE scheme at kernel level
Figure 3.13 Practical network configuration for the SAFE demonstration
Figure 3.14 The best case of fault detection in SAFE scheme
Figure 3.15 The worst case of fault detection in SAFE scheme
Figure 3.16 Comparison of the SAFE scheme and the conventional heartbeat-based mechanism in terms of failure switchover delay
Figure 4.1 Multiple subnets in form of a ring topology
Figure 4.2 Heartbeat messages sending (HBM-TX) in RSAFE scheme
Figure 4.3 Heartbeat messages reception (HBM-RX) in RSAFE scheme
Figure 4.4 The best case of fault detection in RSAFE scheme
Figure 4.5 The worst case of fault detection in RSAFE scheme
Figure 4.6 Diagram for inter-subnet fault detection in RSAFE scheme
Figure 4.7 An example of node K disappearance from the network
Figure 4.8 An example of link failure between two switches
Figure 4.9 ... topology
Figure 4.10 A simulation of 64 nodes divided into 8 subnets under a ring topology
Figure 4.11 Experimental results of 64 nodes in multiple subnets of a star topology
Figure 4.12 Experimental results of 128 nodes in multiple subnets of a star topology
Figure 4.13 Experimental results of 256 nodes in multiple subnets of a star topology
Figure 4.14 Experimental results of 512 nodes in multiple subnets of a star topology
Figure 4.15 Experimental results of 1024 nodes in multiple subnets of a star topology
Figure 4.16 Experimental results of 64 nodes in multiple subnets of a ring topology
Figure 4.17 Experimental results of 128 nodes in multiple subnets of a ring topology
Figure 4.18 Experimental results of 256 nodes in multiple subnets of a ring topology
Figure 4.19 Experimental results of 512 nodes in multiple subnets of a ring topology
Figure 4.20 Experimental results of 1024 nodes in multiple subnets of a ring topology
Table of Tables
Table 3.1 Descriptions of state transitions shown in Figure 3.10
Table 3.2 Definition of an entry conn[K] described in Figure 3.11
Table 3.3 Experimental results of failure switchover delay for the SAFE scheme
Table 4.1 Experimental parameters in the RSAFE scheme
Table 4.2 Experimental results for star topologies in the RSAFE scheme
Table 4.3 Experimental results for ring topologies in the RSAFE scheme
Novel Software-based Fault-Tolerant Schemes for Large-scale Ethernet Networks
Hoang-Anh Pham
Department of Information and Communications Engineering
Graduate School, Myongji University
Directed by Professor Jong Myung Rhee
By incorporating fault-tolerant Ethernet (FTE), the reliability and availability of an Ethernet networked system can be greatly enhanced. FTE aims to keep data flowing through the network continuously and to minimize communication interruption even when there are faults in the network. With the rapid growth of network-based applications, the amount of information increases in proportion to the network size. Therefore, fault tolerance is more challenging and more important in large-scale networks such as the naval combat system data network (CSDN). Various research studies have been conducted, and numerous development and standardization efforts have been made, with the aim of enhancing fault tolerance in Ethernet-based networks. However, few studies on FTE for large-scale networks have been reported.
This dissertation presents two novel software-based FTE schemes for large-scale networks: the scalable autonomous fault-tolerant Ethernet (SAFE) scheme and the rapid SAFE (RSAFE) scheme. The key idea of the SAFE scheme is to adopt the multiple subnets architecture to solve the scalability issue in large-scale networks while maintaining fault detection performance, in terms of failure switchover delay, compared to the conventional heartbeat-based mechanism. The analytical derivations show two key advantages of the SAFE scheme: it reduces the number of heartbeat messages generated, and it enhances the network's capability to overcome multiple points of failure. The SAFE scheme is implemented at kernel level to validate the correctness of the fault detection mechanism. The experimental results also show that the SAFE scheme performs better in terms of failure switchover delay.
The RSAFE scheme, an upgraded version of the SAFE scheme, inherits the multiple subnets architecture from the SAFE scheme. However, it adopts a novel fast fault-detection algorithm in order to significantly reduce the failure switchover delay. The improved performance of the RSAFE scheme over the SAFE scheme is validated by analytical derivations as well as by OMNeT++ simulations of various scenarios in large-scale networks with hundreds or thousands of nodes.
Chapter 1 Introduction
1.1 Fault-tolerance Ethernet (FTE)
In addition to the high-performance and low-power characteristics usually required in most computing network systems, reliability is essential in mission-critical applications such as unmanned vehicles, military weapon systems, process control networks, and aviation equipment. So, apart from the demands for high performance and low power, the problem here is how to increase system reliability. The reliability of a system is defined as the inverse of its failure rate. That means decreasing the failure rate improves reliability, but in reality it is quite difficult to build a system that never fails. Therefore, the solution involves keeping the system working even when a failure occurs. In other words, reliability refers to the fault tolerance capability of the system. Hence, fault tolerance means that there may be some faults in some parts of a system, but the system itself continues to run its services.
For fault tolerance in a computer-based system, standby hardware such as an extra power supply or an extra storage device that serves its clients might be added. In case of a fault, the system separates the faulty device from the rest of the system and at the same time configures itself so that the desired operations continue to work [4, 23, 24]. However, fault tolerance is more challenging in a mission-critical network-based system that consists of a huge number of end-nodes such as computers, or a combination of many systems. In such a networked environment, it is more difficult to manage network faults since they
1.2 Media Redundancy
Most data communications applications, for example telephone conversations and credit card transactions, assume the availability of a reliable network. At this level, data are expected to traverse the network and to arrive intact at their destination. On the other hand, the physical systems that compose a network, such as the interconnections between end-nodes and switches, are subject to component failures and link failures. Similarly, the software that supports the high-level semantic interface often contains unknown bugs and other latent reliability problems. Redundancy mechanisms underlie all approaches to fault tolerance.
The general idea of media redundancy and redundant paths is almost as old as the use of Ethernet for networked communications. Fault tolerance, which necessitates the use of redundant structures, is a vital basic requirement of most networked systems.
Media redundancy is primarily used to avoid single points of failure in industrial communications networks. Wherever there is a single point of failure, it is possible for the communications network, for instance in an automated production line, to be completely disabled by a single technical fault. The consequences of such a failure can be very costly. If redundant structures are used, then a single failure merely causes the network to fall back to a degraded state. Communications via the network remain viable, and the redundant system makes it possible for a repair to be carried out to restore the previous fault-free state [4, 10, 23, 24, 62].
Basically, there are two levels of media redundancy: network-level and device-level. In network-level redundancy, each end-node has only a single network interface, and end-nodes are interconnected via network components such as switches. However, multiple network components are interconnected in a topology that provides multiple paths between any two end-nodes in the network, as in the example shown in Figure 1.1. At this level, various fault-tolerant schemes can be applied at the network components, depending on the specific topology (such as tree, ring, or mesh), in order to protect against failures of network components such as switches or links between switches.
In device-level redundancy, each end-node in the network has multiple network interfaces (ports) connecting to the network, which also provides redundant paths between any two nodes in the network, as shown in Figure 1.2. With this level of redundancy, the fault-tolerant schemes can be employed at the end-nodes.
In order to fully utilize media redundancy, device-level redundancy and network-level redundancy are combined into redundant structures. For example, Figure 1.3(a) represents two independent networks, and Figure 1.3(b) represents a single network with redundant paths.
1.3 Conventional FTE Approaches
Depending on the specific level of redundancy, various conventional FTE schemes are available. These conventional schemes can be classified into two approaches: the software-based approach [1-2, 25-41, 53-58] and the hardware-based
application, and network architecture. However, the difference between these two conventional approaches lies mainly in the methodology for solving two essential problems in FTE design: (1) fault detection, which is the mechanism for detecting a network fault, and (2) failure switchover, which is the mechanism for finding an alternative path to replace the failed path.
For the hardware-based approach, two network interfaces are accommodated in a single device called a redundant network interface card (RNIC). Firmware inside the RNIC detects a failure by monitoring the signals and incoming packets on its network interfaces; it is not concerned with the status of the other nodes or the network components. The hardware-based FTE implementation is primarily used with device-level redundancy, as shown in Figure 1.2. The key advantage of the hardware-based approach is that the failure switchover delay is very short: it is at most on the order of hundreds of milliseconds. Currently, there are several commercial products such as RAMiX, Intel PRO/100+, and HP APA. However, this approach requires proprietary NIC development. Thus, it is not appropriate for systems in which the FTE must be modified to fulfill a particular requirement or to be upgraded, since such modification usually incurs a high cost.
For the software-based approach, multiple standard single-port NICs are often used to provide redundant communication paths. The FTE software, called a redundancy protocol, performs fault detection and fault recovery. The software-based approach has the advantage of using commercial off-the-shelf (COTS) NICs without any modification to the hardware or software drivers. It can also apply a wide range of fault-tolerant schemes to various redundant network architectures.
1.4 Dissertation Motivation
Designing any system to tolerate faults first requires the selection of a fault model from a set of possible failure scenarios, along with an understanding of the frequency, duration, and impact of each scenario [11, 67]. Most reliable network designs address the failure of any single component, and only a few designs tolerate multiple failures. Furthermore, there is neither a perfect network topology nor a perfect protocol that precisely covers all practical applications and requirements. The right choice of topology and approach will always depend on additional factors such as the physical installation requirements, the scope of the network topology, the target application, cost, and the required failure switchover delay.
This dissertation contributes two novel software-based FTE schemes, scalable autonomous fault-tolerant Ethernet (SAFE) and rapid SAFE (RSAFE), to the field of network fault tolerance and related topics. The proposed FTE schemes have been developed for the single network structure with redundant paths shown in Figure 1.3(b). The primary goal is to keep data flowing through the network continuously and to minimize communication interruption between any two nodes even when there are multiple points of failure in the network. The conventional heartbeat-based mechanism is improved in order to provide fast fault detection.
The proposed schemes are efficient and applicable to mission-critical large-scale Ethernet-based networks such as the CSDN in a naval vessel.
1.5 Dissertation Outline
The remainder of this dissertation is organized as follows. Chapter 2 presents a brief review of the currently available redundancy protocols and discusses why these existing approaches are not directly applicable to large-scale networks. The SAFE scheme is then presented and discussed thoroughly in Chapter 3, including its methodology, characteristics, and performance evaluation. The upgraded version of the SAFE scheme, the RSAFE scheme, is presented in Chapter 4, which focuses on the detailed description of the novel algorithm for fast fault detection. The OMNeT++ simulations that validate the RSAFE scheme are also presented in that chapter. Finally, general conclusions and possible directions for future work are summarized and discussed in Chapter 5.
(1) Failure switchover delay determinism: In the event of a failure, the time that the protocol needs to switch from the primary logical path to a secondary alternative path and to restore communications must be predictable.
(2) Installation requirements: If using the protocol and/or complying with the required switchover time imposes any constraints on the installation, for example on the physical topology or the maximum number of usable network switches, then these must be clearly specified.
(3) The protocol must be based on a standardized method in order to guarantee transparency and compatibility with existing systems.
Among these requirements, the first is the most constraining in FTE design. A redundancy protocol can be used only where reliable and calculable figures are available to specify the absolute worst-case upper limit for the failure switchover delay. This is the only way of ensuring that the network will fulfill the requirements of the application. If a redundancy protocol can switch over fast enough to enable the data traffic and application to continue operating without impairment, then its redundancy mechanism is transparent to the application functionality and the timing requirements are fulfilled.
There are several well-known redundancy protocols, such as the spanning tree protocol (STP), rapid spanning tree protocol (RSTP), media redundancy protocol (MRP), parallel redundancy protocol (PRP), and high-availability seamless redundancy (HSR). The following section presents a brief review of these protocols based on two characteristics: (1) the fault-tolerant functionality and (2) the failure switchover delay.
2.2 Existing Redundancy Protocols
2.2.1 Spanning Tree Protocol
To facilitate the use of redundant communications structures, the Institute of Electrical and Electronics Engineers (IEEE) specified the spanning tree protocol (STP). For the first time, the STP enabled all Ethernet switches to employ an algorithm to facilitate interconnected network structures. The STP allows a network to include redundant paths between any two nodes. However, the STP activates only one path between any two nodes at a given moment by creating a loop-free topology, called a spanning tree, from the connections between Ethernet switches. When a link in the current spanning tree fails, the STP automatically reruns the algorithm to build an alternative spanning tree and restore communication. The STP was standardized in IEEE 802.1D-1998 [70].
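The idea of keeping only a loop-free subset of links active, and recomputing it after a failure, can be illustrated with a minimal sketch. This is not the STP algorithm itself: real STP elects the root and selects ports by exchanging BPDUs with bridge IDs and path costs, whereas the sketch simply runs a breadth-first search from a chosen root. The four-switch topology and the names `spanning_tree`, `links`, and `failed` are assumptions made for illustration.

```python
from collections import deque


def spanning_tree(links, root):
    """Return the active links of a loop-free tree rooted at `root` (BFS)."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    active = set()
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in visited:
                visited.add(nxt)
                active.add(frozenset((node, nxt)))  # this link stays forwarding
                queue.append(nxt)
    return active


# Hypothetical topology with redundant links (not the network of Figure 2.1).
links = [("S1", "S2"), ("S2", "S3"), ("S3", "S4"), ("S4", "S1"), ("S1", "S3")]
tree = spanning_tree(links, root="S1")

# A link of the current tree fails: rebuild the tree from the remaining links.
failed = frozenset(("S1", "S2"))
remaining = [link for link in links if frozenset(link) != failed]
new_tree = spanning_tree(remaining, root="S1")
```

With four switches, each tree contains three active links; the other links remain as blocked standby paths until a failure forces a recalculation.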
This protocol uses what are called bridge protocol data units (BPDUs) to communicate between the switches. One bridge is defined as the root of the tree, and the optimal network paths are determined from there. If the network changes for any reason, for instance because of the failure of a physical connection, this is reported to the network by means of topology change notification BPDUs. The response to these BPDUs is to recalculate the tree, activate the appropriate
The failure switchover delay of the standard STP is on the order of seconds. Therefore, further protocols based on the underlying STP mechanisms were subsequently developed. These were better tailored to specific requirements, in particular with markedly reduced switchover delays; one example is an optimized version called the rapid spanning tree protocol (RSTP).
2.2.2 Rapid/Multiple Spanning Tree Protocol
The rapid spanning tree protocol (RSTP) was definitively described in the
topologies, and support a higher number of switches. The RSTP achieves a significantly improved failure switchover delay compared to the standard STP. The failure switchover delay can be on the order of hundreds of milliseconds or less, depending on the specific network topology, the location of the network failure, and several configurable parameters. However, RSTP still does not guarantee deterministic failure behavior. For this reason, there have been a number of attempts to optimize RSTP by restricting it to ring topologies and using fixed predefined parameters.
The multiple spanning tree protocol (MSTP) is an extension to RSTP and
virtual local area networks (VLANs), MSTP always operates within VLANs and therefore facilitates more flexible network structures, for instance in order to implement load balancing over a variety of VLANs and network paths.
MSTP and RSTP are mutually compatible and can be used together in a single network structure.
2.2.3 Media Redundancy Protocol
The media redundancy protocol (MRP) is described in the IEC 62439-2
addresses industrial applications to provide high-availability Ethernet networks.
Figure 2.2 shows a ring network using the MRP, in which one node functions as a media redundancy manager (MRM) that monitors and controls the ring in order to react to network failures, and the remaining nodes are media redundancy clients (MRCs).
In the non-failure state, the MRM activates one of its two ring ports and blocks the other. This means that the ring network is logically converted into a linear structure, as shown in Figure 2.3. For fault detection, the MRM periodically sends test frames from one port and waits to receive them at the other port. When an MRC receives a test frame on one port, it simply forwards the test frame to its other port. Therefore, a network failure is detected if the MRM fails to receive its own test frame. The MRM then activates the standby port to rebuild the network as an alternative linear structure. In the example shown in Figure 2.2, it is assumed that the link between MRC 3 and MRC 4 fails. The ring is then reconfigured as shown in Figure 2.4.
The failure switchover delay depends on the interval at which test frames are sent and on the ring size. Initially, the MRP was designed for ring networks with up to 50 devices in order to guarantee fully deterministic switchover behavior. A faster failure switchover delay can then be configured.
Since the MRP is specified for ring networks, it can tolerate only a single point of failure. That means that if two or more links fail in a given period, the network will be interrupted.
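The ring behavior described in this section (one blocked port in the non-failure state, test frames that must travel the whole ring, and the single-failure limit) can be sketched as follows. This is a simplified model under stated assumptions, not the IEC 62439-2 state machine: node 0 plays the MRM, only link failures are modeled, and the names `ring_links`, `reachable_from`, and `mrp_state` are hypothetical.

```python
def ring_links(n):
    """Links of an n-node ring (n >= 3); node 0 plays the MRM, the rest are MRCs."""
    return [frozenset((i, (i + 1) % n)) for i in range(n)]


def reachable_from(start, links):
    """Nodes reachable from `start` over the given undirected links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen


def mrp_state(n, failed_links):
    """Return (test_frame_returns, all_nodes_connected) for a set of failed links."""
    failed = {frozenset(link) for link in failed_links}
    links = ring_links(n)
    # The MRM's test frame must cross every ring link to reach its other port.
    test_frame_returns = not any(link in failed for link in links)
    blocked = frozenset((n - 1, 0))  # link behind the MRM's blocked port
    # While the ring is closed the blocked link carries no data; once the test
    # frame is lost, the MRM activates its standby port and the link is used.
    usable = [
        link for link in links
        if link not in failed and not (test_frame_returns and link == blocked)
    ]
    return test_frame_returns, len(reachable_from(0, usable)) == n


no_fault = mrp_state(6, [])
one_fault = mrp_state(6, [(3, 4)])
two_faults = mrp_state(6, [(3, 4), (0, 1)])
```

In this model a single link failure is detected (the test frame does not return) and every node stays reachable through the formerly blocked port, whereas two simultaneous link failures split the ring into two segments.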
2.2.4 Parallel Redundancy Protocol
The parallel redundancy protocol (PRP) is described in the IEC 62439-3 standard [74]. Existing protocols such as RSTP and MRP can in practice be modified to meet critical requirements in terms of failure switchover delay. However, there are several applications that cannot tolerate any
The PRP is employed at the end devices, while the switches in the networks are standard switches with no knowledge of PRP. An end device with PRP functionality is called a double attached node for PRP (DANP) and has two network interfaces, each connected to one of two independent networks. These two networks may have an identical structure or may differ in their topology and/or performance. Figure 2.5 shows an example of two independent networks using the PRP in a star topology. A DANP implementation controls the redundancy and deals with duplicates. When the upper layers pass a packet down for transmission, the PRP unit sends this frame to the network via both ports simultaneously. When these two frames traverse the two independent networks, they will normally be subject to different delays on their way to the recipient. At the destination, the PRP unit passes the first packet to arrive to the upper layers, i.e., to the application, and discards the second one. The interface to the application is thus identical to
The main advantage of PRP is that it provides interruption-free switchover: it takes no time at all to switch over in failure situations and thus can offer the highest availability. However, the PRP still cannot overcome situations in which multiple points of failure occur in a given period. In the example shown in Figure 2.6, the link between DANP 1 and SWITCH A and the link between DANP 2 and SWITCH B fail at the same time. The communication between DANP 1 and DANP 2 is then interrupted.
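The duplicate handling at a DANP, and the two-failure scenario described for Figure 2.6, can be sketched as below. The sketch makes several assumptions: duplicates are recognized by a (source, sequence number) pair, only the attachment links between a node and its LAN are modeled, and the set of seen frames is never aged out as a real implementation would do. The names `PrpReceiver`, `lans_delivering`, and `transmit` are hypothetical.

```python
LANS = ("A", "B")


def lans_delivering(src, dst, link_up):
    """LANs on which a frame from src can reach dst (attachment links only)."""
    return [lan for lan in LANS if link_up[(src, lan)] and link_up[(dst, lan)]]


class PrpReceiver:
    """Passes the first copy of each frame upward and discards the second."""

    def __init__(self):
        self._seen = set()
        self.delivered = []
        self.discarded = 0

    def receive(self, source, seq, payload):
        key = (source, seq)
        if key in self._seen:
            self.discarded += 1  # duplicate that arrived over the other LAN
            return False
        self._seen.add(key)
        self.delivered.append(payload)
        return True


def transmit(src, dst, seq, payload, link_up, receiver):
    """Send one frame over both LANs; each surviving copy reaches the receiver."""
    for _lan in lans_delivering(src, dst, link_up):
        receiver.receive(src, seq, payload)


all_up = {(node, lan): True for node in ("DANP1", "DANP2") for lan in LANS}
rx = PrpReceiver()
transmit("DANP1", "DANP2", 1, "hello", all_up, rx)  # two copies, one delivered

one_fault = dict(all_up)
one_fault[("DANP1", "A")] = False  # DANP 1 loses its link to LAN A
transmit("DANP1", "DANP2", 2, "still here", one_fault, rx)

two_faults = dict(one_fault)
two_faults[("DANP2", "B")] = False  # DANP 2 also loses its link to LAN B
transmit("DANP1", "DANP2", 3, "lost", two_faults, rx)
```

After a single attachment failure the frame still arrives without any switchover step, because the copy on the other LAN is unaffected. With the two failures on different LANs, no copy can reach the destination.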
2.2.5 High-Availability Seamless Redundancy Protocol
High-availability seamless redundancy (HSR) is a further development of the PRP approach and is also described in the IEC 62439-3 standard. As in the PRP, each end-node in the HSR has two network interfaces and is called a double attached node (DANH). However, these DANHs connect their two network interfaces together in the form of a ring topology, as shown in Figure 2.7.
In the HSR, each packet sent is duplicated and transmitted on two separate physical paths, in the clockwise and counter-clockwise directions of the ring. The corresponding receiver accepts the first packet to arrive and discards the second one. If a DANH is not the intended receiver, packets arriving on one interface are forwarded to the other interface.
Like the PRP, the main advantage of the HSR is that it provides zero failure switchover delay even in the case of a link failure. In order to detect link failures, each node periodically sends test frames to check the ring state. A network failure is detected if a node does not receive its two test frames. However, this check is meaningful only for management and does not affect the fault-tolerant functions. Therefore, the interval for sending test frames is not as important as in other protocols such as RSTP and MRP.
Even though the HSR can be applied to more complicated networks, such as coupled rings or mesh networks, by using RedBox or QuadBox components, the HSR is mostly adopted for ring networks. Therefore, like the MRP, the HSR cannot survive multiple points of failure in a ring network, as in the example shown in Figure 2.8, in which the link between DANH 1 and DANH 8 and the link between DANH 4 and DANH 5 fail.
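The two-failure scenario can be checked with a small sketch that counts how many of the two HSR copies of a frame reach their destination on an eight-node ring. It assumes nodes numbered 1 to 8 as in the example, models link failures only, and uses the hypothetical helper names `path_links` and `hsr_copies_received`.

```python
def path_links(src, dst, n, step):
    """Links crossed from src to dst on a ring of nodes 1..n.

    step=+1 follows the clockwise direction, step=-1 the counter-clockwise one.
    """
    links = []
    node = src
    while node != dst:
        nxt = (node - 1 + step) % n + 1  # wrap around within 1..n
        links.append(frozenset((node, nxt)))
        node = nxt
    return links


def hsr_copies_received(src, dst, n, failed_links):
    """Number of copies (0, 1 or 2) that arrive when both directions are used."""
    failed = {frozenset(link) for link in failed_links}
    return sum(
        all(link not in failed for link in path_links(src, dst, n, step))
        for step in (+1, -1)
    )


no_fault = hsr_copies_received(1, 5, 8, [])
one_fault = hsr_copies_received(1, 5, 8, [(1, 8)])
two_faults = hsr_copies_received(1, 5, 8, [(1, 8), (4, 5)])
same_segment = hsr_copies_received(1, 3, 8, [(1, 8), (4, 5)])
```

With one failed link, one copy still arrives, so the receiver sees no interruption. With the two failures of the example, the ring splits into the segments {1, 2, 3, 4} and {5, 6, 7, 8}: nodes within a segment can still communicate, but no copy crosses between the segments.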
Furthermore, in the HSR, every packet sent is duplicated and circulated inside the ring. The extra traffic of copied packets may therefore degrade network performance through congestion and delay, especially in the case of multicast and broadcast packets. In order to prevent multicast and broadcast frames from circulating forever, the node that initially placed the multicast or broadcast frame on the ring removes it as soon as it has completed one cycle. However, the duplicate transmission of frames in both directions still means that effectively only 50% of the network bandwidth is available for data traffic. Recently, novel algorithms have been proposed in [21, 64-65] for solving the traffic issue in HSR.
An HSR network always has the form of a ring, or a structure of coupled rings, which means that it is less flexible than the PRP at the installation stage. However, unlike the PRP, the HSR does not require two parallel networks while still maintaining zero failure switchover delay.
2.3 Discussions
In terms of failure switchover delay, the above redundancy protocols are divided into two categories: (1) redundancy protocols with “non-zero” failure switchover delay, including STP, RSTP, and MRP; and (2) redundancy protocols with “zero” failure switchover delay, including PRP and HSR.
Although the PRP and HSR provide interruption-free communication even when there is a fault, these two protocols are efficiently designed for automation applications within control information. The efficiency of these two protocols
As mentioned above, there is neither a perfect network topology nor a perfect protocol that precisely covers all practical applications and requirements. The right choice of topology and protocol will always depend on additional factors, such as the physical installation requirements and/or the switchover times demanded by the application.
Hence, there have also been several research works that develop FTE schemes for particular applications [52-58]. The MCube scheme [57] proposed a server-centric network architecture specially designed for data centers. The MCube architecture consists of a tree of routing and switching elements, with more specialized and expensive equipment moving up the network hierarchy. The
order to improve the performance in terms of bandwidth usage and to reduce the switching latency.
2 can be (1) from NODE 1 via PORT A to NODE 2 via PORT A, (2) from NODE 1 via PORT A to NODE 2 via PORT B, (3) from NODE 1 via PORT B to NODE 2 via PORT A, and (4) from NODE 1 via PORT B to NODE 2 via PORT B.
A primary data path is defined as the path selected for primary communication between two nodes in a network. The other paths are on standby in case the primary data path has a fault. In this dissertation, a network fault may be caused by a link failure between a node and a switch, a link failure between two switches, a NIC hardware failure at a node, or a hardware failure at a switch. These network faults are detected using the conventional heartbeat-based mechanism.
3.1.1 Conventional Heartbeat-based Mechanism
Each node periodically broadcasts a heartbeat message (HBM), which is an Ethernet frame, on all possible data paths to the other nodes in the network to advertise that it is alive. At the same time, each node also receives and processes the HBMs sent by other nodes in the network. In the conventional heartbeat-based mechanism, a node detects a network fault on a data path if it has not received any HBMs from that path in two consecutive intervals.
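The detection rule above (no HBM on a path for two consecutive intervals) can be sketched as a per-path timestamp check. This is a minimal illustration, not the algorithm of Figure 3.2: the class name `HeartbeatMonitor`, the path labels built from local and remote port names, and the use of explicit timestamps instead of timers are all assumptions.

```python
from itertools import product


class HeartbeatMonitor:
    """Tracks the last heartbeat seen on each data path to one peer node."""

    def __init__(self, interval, paths, missed_limit=2):
        self.interval = interval          # heartbeat repetition interval (s)
        self.missed_limit = missed_limit  # consecutive silent intervals allowed
        self.last_seen = {path: 0.0 for path in paths}

    def on_heartbeat(self, path, now):
        """Record an HBM received on `path` at time `now`."""
        self.last_seen[path] = now

    def faulty_paths(self, now):
        """Paths that have been silent for more than `missed_limit` intervals."""
        limit = self.missed_limit * self.interval
        return [p for p, seen in self.last_seen.items() if now - seen > limit]


# Four data paths between two dual-port nodes: (local port, remote port).
paths = list(product(("A", "B"), repeat=2))
monitor = HeartbeatMonitor(interval=1.0, paths=paths)

for path in paths:
    monitor.on_heartbeat(path, now=1.0)

# Path ("A", "A") goes silent after t = 1.0; the other paths keep reporting.
for t in (2.0, 3.0):
    for path in paths:
        if path != ("A", "A"):
            monitor.on_heartbeat(path, now=t)

early = monitor.faulty_paths(now=2.5)  # only 1.5 s of silence so far
late = monitor.faulty_paths(now=3.5)   # more than two intervals of silence
```

The detection delay in this sketch is bounded by the product of `missed_limit` and `interval`, which is why shortening the heartbeat interval speeds up detection at the cost of more heartbeat traffic.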
When a network fault is detected on a primary path, communication fails over by transmitting the outgoing data over an alternative standby data path. Figure 3.2 describes the conventional heartbeat-based algorithm, including heartbeat sending (HBM_TX) and heartbeat reception (HBM_RX).
As mentioned above, the failure switchover delay is one of the most constraining parameters to be considered in network fault tolerance design. In practice, the failure switchover delay can be determined in various ways, depending on the specific approach implemented. In the conventional heartbeat-based mechanism, the failure switchover delay depends on the heartbeat repetition interval. In order to detect network faults quickly, the heartbeats must be sent more frequently, which means that the number of generated HBMs increases significantly. Furthermore, in a large-scale network with hundreds or thousands of nodes, the number of heartbeats grows in proportion to the number of nodes in the network. Because heartbeats consume network bandwidth and disturb normal data traffic, the network size cannot be increased arbitrarily. The network size is therefore limited, which results in a network scalability problem. This scalability problem must be solved when adopting the conventional heartbeat-based mechanism for large-scale networks.
3.1.2 Scalability Problem
The following discussion explains the scalability problem, which limits the number of nodes in the network for a given failure switchover delay requirement.
failure switchover delay. Based on the conventional heartbeat-based mechanism, the failure switchover delay takes on average twice the heartbeat interval. Theoretically, the failure switchover delay can be represented as follows:
processing latency. Usually, the latency is much less than the heartbeat interval time.
Again, the conventional heartbeat-based mechanism has an inherent scalability problem because the heartbeats consume network bandwidth and limit
by which the system can handle the heartbeats generated in a heartbeat repetition
as follows:
For example, when the system capability is 500 Hz and each node has two network interfaces, the maximum number of nodes is proportional to the heartbeat interval time, which is derived from (3.2) as follows:
A numerical analysis of the number of nodes versus the heartbeat interval time is depicted in Figure 3.3. Together with (3.3), it implies that if the number of nodes increases, the heartbeat repetition interval must increase; this, however, increases the failure switchover delay determined by (3.1).
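The trade-off described here can be made concrete with a short numerical sketch. The exact forms of (3.1) to (3.3) are not reproduced, so the relations below are assumptions that are only consistent with the surrounding text: each node is assumed to send one heartbeat per interface per interval, the system is assumed to handle at most `capability_hz` heartbeats per second, and the failure switchover delay is taken as twice the heartbeat interval plus a processing latency. All function and parameter names are hypothetical.

```python
def max_nodes(capability_hz, interval_s, interfaces=2):
    """Largest node count whose heartbeat rate stays within the capability.

    Assumed relation: nodes * interfaces / interval_s <= capability_hz.
    """
    return int(capability_hz * interval_s / interfaces)


def min_interval(nodes, capability_hz, interfaces=2):
    """Shortest heartbeat interval (s) that `nodes` nodes can use."""
    return nodes * interfaces / capability_hz


def switchover_delay(interval_s, latency_s=0.0):
    """Assumed failure switchover delay: two heartbeat intervals plus latency."""
    return 2 * interval_s + latency_s


# Node limit for a few heartbeat intervals at 500 Hz with two interfaces.
limits = {t: max_nodes(500, t) for t in (0.5, 1.0, 2.0, 4.0)}

# Interval and resulting delay needed to support 1024 nodes.
interval_1024 = min_interval(1024, 500)
delay_1024 = switchover_delay(interval_1024)
```

Under these assumptions, the node limit grows linearly with the heartbeat interval (250 nodes per second of interval), and supporting 1024 nodes requires an interval of about 4.1 s, which pushes the switchover delay to roughly 8.2 s. This is the scalability problem in numerical form.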