DATA-IN-NETWORK SCHEMES
GUO HUAQUN
NATIONAL UNIVERSITY OF SINGAPORE
2005
DATA-IN-NETWORK SCHEMES
GUO HUAQUN
(B.Eng., M.Eng., NUS)
A THESIS SUBMITTED
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF ELECTRICAL & COMPUTER ENGINEERING
NATIONAL UNIVERSITY OF SINGAPORE
2005
To the memory of my mother, Guo Lanying
The author would also like to thank the Institute for Infocomm Research (I2R) for offering the research scholarship. Thanks are also given to the management of I2R, SingAREN (Singapore Advanced Research & Education Network) and the National University of Singapore for providing the opportunity for this research.
Special thanks to Ms Yee Poh Cheng, Dr Chai Teck Yoong, Dr Cheng Heng Seng, Ms Tan Joo Geok, Ms Vasudha Ramnath and Dr Dong Ligang of the Data-In-Network research group for their help and for creating a supportive research environment. Thanks also to Ms Yang Ping and Ms Nazreen Beevi d/o Saifuddin for their help in collecting data.
Last but not least, the heartiest gratitude goes to the author's family for their love and encouragement.
TABLE OF CONTENTS

1.1 Survey of Outstanding Challenges in Multicast Communications
1.1.1 Lack of Multicast across Time
1.1.2 Limited Address Space & Large Delay in Application-level Multicast
1.1.3 Scalability of IP Inter-domain Multicast
CHAPTER 2: VIN: A SCALABLE VIDEO DISTRIBUTION SERVICE
2.2 Related Works of Networked Video Services
2.2.2 Parallel-Server Model
2.2.3 Multicast-based Video-On-Demand Services
2.2.4 Content Distribution Network
CHAPTER 3: COMPARISON OF VIN WITH STAGGERED MULTICAST
3.1 Effect of Processor Number on VIN
3.1.1 Effect of Core Network Bandwidth on Number of Concurrent Streams
3.1.2 Effect of DIN Node Outgoing Link Bandwidth on Number of Concurrent Streams
3.1.3 Effect of Cycle Time & Core Network Access Time on Number of Concurrent Streams
3.1.4 Effect of Packet Set Size on Number of Concurrent Streams
3.3 VIN versus Staggered Multicast
3.3.2 Scalability Analysis
3.3.3 Startup Latency Analysis
4.2.2 Selection of DIN Nodes
4.2.3 Automatic Formation of Loop
4.3.4 Performance Improvement of Retrieving Messages
4.4.1 Effect of Bandwidth
4.4.2 Delay Analysis in Shared-tree Multicast
CHAPTER 5: DINPEER: OPTIMIZING APPLICATION-LEVEL MULTICAST
5.3.3 Performance of Divisible Load Computing using DINPeer
5.3.4 Effects of DINPeer in Synchronous and Asynchronous
CHAPTER 6: (G)MPLS-BASED DINLOOP_NET: IMPROVING
6.6.1 Message Load Evaluation
6.6.2 Inter-domain Router Forwarding State Size Evaluation
6.6.3 Inter-domain Multicast Delay Evaluation
6.8 Implementing DINloop_Net with GMPLS
6.8.1 GMPLS Label
6.8.2 Creating DINloop_Net with GMPLS
SUMMARY
Multicast communication is still an active research field, as a number of challenges remain. First, a fundamental problem is that current multicast communications work only across space, not across time. Second, a major problem is that the IP multicast address space is too small. Although application-level multicast has been proposed as an alternative form of group communication, it suffers other disadvantages in terms of high multicast delay, overloading at the Rendezvous Point (RP) and a single point of failure. Finally, a nagging problem with IP multicast is that IP inter-domain multicast routing tables are large.
In this thesis, we propose a new Data-In-Network (DIN) technology to overcome the above outstanding technical challenges. The general idea of DIN is to configure multiple DIN Nodes to form a special logical loop and let network data circulate continuously in this loop. In order to improve multicast communications at both the network level and the application level, we further derive two basic mechanisms: DINloop_Net, which is DIN applied at the network layer, and DINloop_App, which is DIN applied at the application layer. With these mechanisms, we develop specific solutions, namely Video-In-Network (VIN), DINCast, DINPeer, and (G)MPLS-based DINloop_Net, to address the above problems with multicast communications respectively.
VIN is a distribution model proposed to enable video distribution across space and time. In the VIN system, video data is continuously circulated in a DINloop_Net. VIN supports late-joining users with highly dynamic multicast membership and still achieves data consistency. In this way, DIN technology enhances basic IP multicast to allow it to support distribution of video data across both time and space. To investigate the effectiveness of our solution, we develop a mathematical model based on queuing theory. We further demonstrate that the VIN distribution model has the capability to serve more concurrent video streams with less startup latency compared with a Staggered Multicast video-on-demand system.
DINCast and DINPeer are presented to overcome the shortcomings of current application-level multicast solutions and avoid the problem of the limited IP multicast address space. DINCast uses the DINloop_App instead of the RP as the multicast source to optimize application-level shared-tree multicast. DINCast is able to reduce the multicast delay, and the use of multiple DIN Nodes also avoids overloading and a single point of failure.
DINPeer extends DINCast by optimizing application-level multicast based on a Peer-to-Peer (P2P) overlay network instead of a multicast tree, and exploits a spiral-ring method to discover an inner ring with the relatively largest bandwidth to form a DINloop_App. DINPeer integrates the DINloop_App with a P2P overlay network to construct topologically-aware overlay multicast.
Finally, (G)MPLS-based DINloop_Net uses MPLS/GMPLS techniques to implement DINloop_Net and optimize inter-domain multicast. We implement the DINloop_Net in the core network by establishing Label Switched Paths through the DIN Nodes. The DINloop_Net is used to exchange multicast group membership information and facilitate forwarding of the multicast traffic. The results show that the multicast routing table size in core routers does not increase as the number of multicast groups increases; therefore, (G)MPLS-based DINloop_Net solves the problem of large IP inter-domain multicast routing tables.
NOMENCLATURE
List of Abbreviations
BGMP Border Gateway Multicast Protocol
CBT Core Based Tree
CDN Content Distribution Network
CMP Caching Multicast Protocol
CR-LDP Constraint-based Label Distribution Protocol
DIN Data-In-Network
DTV Digital TV
DVMRP Distance Vector Multicast Routing Protocol
DWDM Dense Wavelength Division Multiplexing
E-O Electrical domain to Optical domain
ERM Edge Router Multicasting
FEC Forwarding Equivalence Class
GMPLS Generalized MultiProtocol Label Switching
GT-ITM Georgia Tech Internetwork Topology Models
IETF Internet Engineering Task Force
ILM Incoming Label Map
IP Internet protocol
IS-IS Intermediate System-to-Intermediate System
ISPs Internet Service Providers
LAN Local Area Network
LMP Link Management Protocol
LSC Lambda-Switch Capable
LSP Label Switched Path
LSR Label Switching Router
MAN Metropolitan Area Network
MASC Multicast Address-Set Claim
MMT MPLS Multicast Tree
MOSPF Multicast Open Shortest Path First
MPLS MultiProtocol Label Switching
MSDP Multicast Source Discovery Protocol
NHLFE Next Hop Label Forwarding Entry
NIMS Network Information Manager System
O-E Optical domain to Electrical domain
O-E-O Optical domain to Electrical domain to Optical domain
OSPF Open Shortest Path First
P2P Peer-to-Peer
P2MP point-to-multipoint
PIM-DM Protocol Independent Multicast – Dense Mode
PIM-SM Protocol Independent Multicast – Sparse Mode
QoS Quality of Service
RAD Ratio of Average Delay
RLU Recent Least Use
RMD Ratio of Maximum Delay
RP Rendezvous Point
RSVP-TE Resource Reservation Protocol - Traffic Extension
RTT Round Trip Time
SA Source Active
SGB Stanford Graph Base
SingAREN Singapore Advanced Research & Education Network
TCO Total Cost of Ownership
TE Traffic Extension
TTL Time-To-Live at the application level
VIN Video-In-Network
VOD Video-On-Demand
WAN Wide Area Network
WDD Wavelength Disk Drives
Notations
B Transmission speed of each host
C v Storage capacity of the core network in VIN
D Cycle time, i.e., the same video content appears repeatedly in the core optical network after each cycle time
floor(m) Gives the largest integer that is less than or equal to m
fmod(m,n) Gives the remainder of m/n
f mp A factor of the propagation delay over the delay at a node with Function A
F Value of the DINloop_Net overhead (similar to disk overhead) over the
fetching data size
g Number of adjacent nodes in the outer ring to be considered for an inner
link
h Number of multicast messages. If each node sends a multicast message, it is also equal to the number of nodes
H Number of idle hosts
integer(m) Obtains the integer portion of m
j Number of child-nodes that parent/grandparent of DIN Node needs to
forward the multicast message in DINCast
k Number of other nodes linked to DIN Nodes or parents/grandparents of
DIN Nodes at the same level in DINCast
K Number of iterations in spiral-ring method
l loop Value of hop depth in DINCast, where the DINloop_App is formed
L The value of multicast tree levels
L SM Maximum startup latency in Staggered Multicast
L VIN The worst case startup latency of VIN
M Number of load fractions or Number of messages
n Number of concurrent video streams. If each client requests a stream, it is also equal to the number of clients
n 1 Number of branches for non-leaf nodes in a shared tree
N Number of DIN Nodes
N c Number of channels dedicated to a video in Staggered Multicast
N h Number of hosts
N l Number of parents/grandparents of DIN Node at Level l
N v Number of videos that the core optical network of VIN can store
N wl Number of wavelengths in core network
S Size of a load fraction or size of a message
S p Packet set size that a DIN Node picks up for a certain video stream each round
S pm Multicast message size
t c Demux/mux delay at all DIN Nodes, i.e., the sum of the time needed from the In-interface of demultiplexing to the Out-interface of multiplexing in every DIN Node when the traffic in the DINloop_Net passes through every DIN Node
t fd Filtering delay, i.e., when filtering data from a wavelength cannot be processed at line speed, there is some additional filtering delay. It is the difference between the filtering data time and the data processing time at line speed
t mp Propagation delay per hop in application-level multicast
t p Propagation delay in core optical network determined by the length of the
core optical network
t s Switching time of wavelength needed by the tuneable optical filter when a DIN Node switches/tunes to a particular wavelength that contains the required data. It is measured from the start of switching/tuning to the end of switching/tuning
t sl Signalling time from a DIN Node receiving a video request to the DIN
Node deciding to switch/tune to a particular wavelength
T Total delay time in VIN system
T Dmd_i Multicast delay for Node i in DINCast, i.e., the delay from the source to Node i via the DINloop_App
T Dmd_i1 Delay from multicast source to a DIN Node in DINCast
T Dmd_i2 Delay in the DINloop_App in DINCast
T Dmd_i3 Delay from the DIN Node to Node i in DINCast
T Dmd_total Total multicast delay for all nodes to receive the multicast message in DINCast
T Link Link latency
T mA Delay time for a multicast message at a node with Function A
T mB Delay time for a multicast message at a node with Function B
T mB_RP Delay time for a multicast message at the RP with Function B
T mC Delay time for a multicast message at the rest of the DIN Nodes with Function C
T s Core network access time from a DIN Node receiving a video request to the DIN Node filtering the required video data from the core network, including the signalling time t sl, the switching time t s and the filtering delay t fd
T sm Time units by which Channel i begins later than Channel i-1 in Staggered Multicast
T t Average client time in the VIN system
T tm Average multicast message time in the system
T Tmd_i1 The delay from multicast source to the RP in shared-tree multicast
T Tmd_i2 The delay from the RP to Node i in shared-tree multicast
T Tmd_total Total multicast delay for all nodes to receive the multicast message in
shared-tree multicast
T v Length of a video
V BW DIN Node common outgoing link bandwidth
V I Core network bandwidth
V mA_i Incoming bandwidth/outgoing bandwidth at a node (i.e., Node i) with Function A
V mC_d Incoming bandwidth/outgoing bandwidth at DIN Node d with Function C
V mD_f Incoming bandwidth/outgoing bandwidth at a parent/grandparent of DIN
Node (i.e., Node f) with Function D
V nd Incoming bandwidth/outgoing bandwidth at a node in multicast scheme
V o Video playback rate, i.e., the bit rate to play back video; for example, MPEG-1 digital video is at about 1.5 Mbps
V RP Incoming bandwidth/outgoing bandwidth at the RP
W Expected packet set waiting time in queue in VIN system
W i Waiting time in queue of i th packet set in VIN system
W m Expected multicast message waiting time in queue at the RP or DIN Node
LIST OF FIGURES

Figure 1.1 DINloop_Net in the network layer
Figure 1.2 An example of DINloop_App at the application layer
Figure 2.1 Client/Server video service model
Figure 2.2 Video-In-Network distribution service
Figure 2.3 A Preliminary Components Design of the DIN Node
Figure 2.4 Calculation of the waiting time
Figure 2.5 Number of MPEG-1 video streams against the core network bandwidth
Figure 2.6 Number of MPEG-1 video streams against V BW
Figure 2.7 Number of MPEG-1 video streams against (D+T s)
Figure 2.8 Number of MPEG-1 video streams against the packet set size
Figure 2.9 Concurrent stream number n against V I & V BW for F1
Figure 2.10 Concurrent stream number n against V I & V BW for F2
Figure 2.11 Concurrent stream number n against V I & V BW for F3
Figure 2.12 Concurrent stream number n against V I & V BW for F4
Figure 2.13 Concurrent stream number n against V I & V BW for F5
Figure 2.14 Concurrent stream number n against V I & V BW for F6
Figure 2.15 Concurrent stream number n against (D+T s) & S p
Figure 3.1 Parallel relationship between retrieval and transmission
Figure 3.2 Comparison of retrieval time and transmission time
Figure 3.3 Performance comparison when core network bandwidth increases
Figure 3.4 Performance comparison when outgoing link bandwidth increases
Figure 3.5 Performance comparison when (D+T s) increases
Figure 3.6 Performance comparison when the packet set size increases
Figure 3.8 Staggered Multicast video server
Figure 3.10 Concurrent stream number against (D+T s) in the VIN system & comparison with 880 streams in a Staggered Multicast server
Figure 3.11 Startup latency comparison
Figure 4.1 Shared-tree multicast
Figure 4.3 Comparison of DINCast vs application-level shared-tree multicast
Figure 4.5 Automatic formation of loop
Figure 4.6 Optimization of loop
Figure 4.9 Effect of number of DIN Nodes on maximum & average delays
Figure 4.10 Effect of direct link probability & number of DIN Nodes on maximum & average delays
Figure 4.11 Effect of total node number on maximum & average delays
Figure 4.12 Effect of total node number & number of DIN Nodes on maximum & average delays
Figure 4.13 Effect of probability for direct link on maximum & average delays
Figure 4.14 Delay ratio of getting messages directly from DINloop_App
Figure 4.15 Performance improvement in retrieving messages directly
Figure 4.16 Multicast delay in DINCast & shared-tree multicast
Figure 4.18 Function A at the RP
Figure 4.20 Function C at the first DIN Node
Figure 4.21 Function C at the rest of DIN Nodes
Figure 4.22 Function D at a parent/grandparent node of DIN Node
Figure 4.24 Multicast traffic in shared-tree multicast
Figure 4.25 Multicast traffic in Scheme A
Figure 4.26 Effect of ρ on delay ratios
Figure 4.27 Effect of f mp on delay ratios
Figure 4.28 Effect of h on delay ratios
Figure 4.29 Multicast traffic in Scheme B
Figure 4.30 Multicast traffic in Scheme C
Figure 4.31 Delay ratio values at each optimal place for three schemes
Figure 5.1 Illustration of DINloop_App in the application level
Figure 5.2 Nielsen's Law of Internet access bandwidth
Figure 5.3 Spiral-ring method
Figure 5.5 Outer and inner ring in Multi-ring method
Figure 5.6 DIN sub-nodes with associated DIN Nodes
Figure 5.7 Effects of parameter g & percentage of links with high bandwidth
Figure 5.8 An example of node state
Figure 5.9 Delay ratio of message retrieval using DINPeer over
Figure 5.10 Delay ratio of message retrieval from DINloop_App over
Figure 5.11 Ratio of the time to receive first bit in DINPeer over Unicast
Figure 5.12 Ratio of downloading time in DINPeer over Unicast
Figure 5.13 Ratio of completion time in DINPeer over Unicast against processing host number
Figure 5.14 Ratio of completion time in DINPeer over Unicast against host processing speed
Figure 5.15 Comparison between DINPeer and non-DINPeer against host transmission speed & link latency
Figure 5.16 Comparison between DINPeer and non-DINPeer against message size & message number
Figure 5.17 Comparison between DINPeer and non-DINPeer against host number & DIN Node number
Figure 6.1 Control modules in DIN Node
Figure 6.2 One example of DINloop_Net in core network
Figure 6.4 Retrieve multicast message
Figure 6.5 Steiner tree formation method
Figure 6.7 Message load comparison against node number
Figure 6.8 Message load comparison against source number
Figure 6.9 Routing table size in non-DIN scheme
Figure 6.10 Routing table size in Solution 1
Figure 6.11 Routing table size in Solution 2
Figure 6.12 Delay ratio comparisons against domain number
Figure 6.13 Control modules in DIN Node with GMPLS
Figure 6.14 Generalized Label Request
Figure 6.15 LSP creation with GMPLS for DINloop_Net
LIST OF TABLES

Table 1.1 Comparison of DIN, WDD & OWCache
Table 2.1 F values used in analysis
Table 2.2 Increasing rate with core network bandwidth
Table 2.3 Increasing rate with the outgoing link bandwidth
Table 2.4 Decreasing rate with (D+T s)
Table 3.1 VIN versus Staggered Multicast
Table 3.2 Increasing rate of VIN versus Staggered Multicast
Table 4.3 Pseudo code for loop formation algorithm
Table 4.6 Decision table for the RP
Table 4.7 Optimal place in Scheme A for binary multicast tree
Table 4.8 Optimal place in Scheme A for multicast tree with 3 branches
Table 4.9 Optimal place in Scheme B for binary multicast tree
Table 4.10 Optimal place in Scheme B for multicast tree with 3 branches
Table 4.11 Optimal place in Scheme C for binary multicast tree
Table 4.12 Optimal place in Scheme C for multicast tree with 3 branches
CHAPTER 1
INTRODUCTION
1.1 Survey of Outstanding Challenges in Multicast Communications
Data communication in the Internet can be performed by any of the following mechanisms: unicast, broadcast, and multicast. Unicast is point-to-point communication that takes place over a network between a single sender and a single receiver. Broadcast is when data is forwarded to all the hosts in the network simultaneously. Multicast, on the other hand, is when data is transferred to only a selected group of hosts simultaneously, using the most efficient strategy to deliver the data over each link of the network.
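The efficiency gain of multicast over unicast can be illustrated by counting link transmissions over a small delivery tree. The following sketch is illustrative only (the topology and node names are invented for the example, not taken from the thesis): with unicast, a separate copy of the packet crosses every link on each receiver's path, whereas with multicast a single copy crosses each link of the tree at most once.

```python
# Illustrative comparison of link usage for unicast vs. multicast
# delivery over a simple tree. The topology is a made-up example.

def path_links(parent, node):
    """Return the list of links from the root down to `node`."""
    links = []
    while parent.get(node) is not None:
        links.append((parent[node], node))
        node = parent[node]
    return links

def unicast_cost(parent, receivers):
    # One separate copy traverses every link on each receiver's path.
    return sum(len(path_links(parent, r)) for r in receivers)

def multicast_cost(parent, receivers):
    # A single copy traverses each link of the delivery tree at most once.
    links = set()
    for r in receivers:
        links.update(path_links(parent, r))
    return len(links)

# Source S feeds router R, which feeds receivers A, B, C.
parent = {"S": None, "R": "S", "A": "R", "B": "R", "C": "R"}
receivers = ["A", "B", "C"]
print(unicast_cost(parent, receivers))    # 6 link transmissions
print(multicast_cost(parent, receivers))  # 4 link transmissions
```

The shared link S-R is crossed three times under unicast but only once under multicast, which is exactly the per-link saving the text describes.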
In the age of multimedia and high-speed networks, there are many applications that send information to a selective, usually large, number of clients. Common examples of such applications include audio/video conferencing, distance learning, video-on-demand, distributed interactive games, data distribution (files, software, multimedia documents, stock prices), service location/discovery, collaborative computing, collaborative visualization, distributed simulation, communicating to a dynamic group, and so on [1, 2]. To support such applications, multicast is considered a very efficient mechanism [3], since it uses delivery structures to forward data from senders to receivers with the aim of minimizing the overall utilization of resources in the underlying network [4]. For example, multicast is heavily used for mass-media TV distribution, as can be seen from a survey conducted by NAB Research and Planning [5]. NAB Research and Planning conducted this survey in July 2005 of all U.S. commercial television stations on their plans for DTV (Digital TV) multicast services. Among the 450 responding stations, 50% are currently multicasting, and 79% of the non-multicasting stations are considering multicasting at some point in the future.
The IP multicast model was first proposed by Steve Deering in 1988 [6] and described as the standard multicast model for IP networks [7]. With IP multicast, a single packet transmitted at the source is delivered to an arbitrary number of receivers by replicating the packet within the network routers along a multicast tree rooted at the traffic's source or at a router assigned as a Rendezvous Point (RP). The first experiment with multicast took place during an "audiocast" at the 1992 Internet Engineering Task Force (IETF) meeting in San Diego [8, 9]. From this first experiment in 1992 to the middle of 1997, standardization and deployment of multicast focused on a single flat topology. This topology is in contrast to the Internet topology, which is based on a hierarchical routing structure. The initial multicast protocol research and standardization efforts were aimed at developing routing protocols for this flat topology. Beginning in 1997, when the multicast community realized the need for a hierarchical multicast infrastructure and inter-domain routing, the existing protocols were categorized as intra-domain protocols and work began on standardizing an inter-domain solution [9]. Some important intra-domain multicast protocols are DVMRP (Distance Vector Multicast Routing Protocol) [10], PIM-DM (Protocol Independent Multicast – Dense Mode) [11], MOSPF (Multicast Open Shortest Path First) [12], PIM-SM (Protocol Independent Multicast – Sparse Mode) [13], and CBT (Core Based Tree) [14], while Multicast Source Discovery Protocol (MSDP) [15] and Border Gateway Multicast Protocol (BGMP) [16, 17] were developed to support inter-domain multicast. In the last decade, although multicast protocol development and implementation have come a long way, its usage has not been as widespread as originally envisioned [18, 19]. One major reason is that IP multicast still suffers from a number of technical challenges [20, 21, 22]. The main thrust of this thesis is to offer solutions to overcome these problems and thereby enhance multicast communications.
1.1.1 Lack of Multicast across Time
The fundamental problem is that existing IP multicast works only across space, not across time, whereas there are increasingly applications on the Internet that operate across both space and time [20]. One reason is that the recipients of most types of distributed content (real-time audio, video, etc.) may want to receive it at different times. For example, when universities offer tele-teaching lectures to their students, the students may want the opportunity to join their lectures at the most convenient time. Another reason is that even if content should ideally reach all recipients immediately (i.e., "push" content like email), this may not be the case in practice, because not all recipients are ready to receive the multicast content all the time (mainly because they are not connected to the Internet all the time). The basic IP multicast model, on the other hand, assumes that all recipients receive the content at the same time.
To address this problem, in Yoid across-time distribution [20], a host would, for instance, join a multicast tree and start receiving data. At some time, the host is expected to forward the data to one or two other hosts to garner the benefits of multicast distribution. Those other hosts may connect to the sending host long after it has received the data. The Chaining approach [23] manages the disk buffers in the client machines as a huge network cache. Each client is capable of caching the requested video and pipelining it to other clients downstream at a later time. A performance limitation of Chaining is due to the fact that forwarded data must travel from one edge of the network to another. To overcome this limitation of the Chaining approach, the Caching Multicast Protocol (CMP) [24] caches video in the routers to provide future services to local requests. Range Multicast adapts the CMP communication paradigm for the Internet and caches video data in overlay nodes so that clients can join a multicast at their specified time and still see the entire video [25]. In Range Multicast, as a video stream passes through a sequence of nodes on the delivery path, each node caches the video data into a fixed-size FIFO buffer. As long as the buffer is not full, it can be used to provide the entire video stream to subsequent clients requesting the same video. However, techniques of caching or buffering in hosts/routers/overlay nodes have some drawbacks. First, the cache or buffer size is limited, so it cannot buffer all data and therefore cannot meet all the requests for different data. Second, the replication-and-forwarding approach to across-time multicast is not suitable for dynamic data: the data buffered in hosts/routers/overlay nodes may be outdated. Third, it is hard to maintain consistency among multiple copies of data buffered in different hosts/routers/overlay nodes.
There is another approach, namely multicasting the content multiple times until all recipients have got it. One example of multicast across time is server-based Staggered Multicast video-on-demand [26, 27, 28, 29, 30, 31]. In Staggered Multicast, one or several channels send the whole video or a part of it in cycles; several channels broadcast a video periodically with staggered start times. However, the capacity constraints of the server hardware often limit the system to no more than a few hundred concurrent streams/channels [28, 32, 33, 34]. Thus, it suffers from a scalability limitation in terms of the number of concurrent video streams/channels that the server can support simultaneously. In addition, since each request must wait for the next multicast, such systems cannot offer true on-demand service. The maximum startup latency equals the length of the video divided by the number of channels allocated to that video. Therefore, the startup latency can be rather large [26].
1.1.2 Limited Address Space & Large Delay in Application-level Multicast
The major problem referred to here is that the IP multicast address space is too small [35, 36]. Currently, multicast group addresses are Class D IP addresses, and the address space for a multicast group is limited to 28 bits with IPv4. Class D IP addresses are the addresses 224.0.0.0 through 239.255.255.255, which are reserved for multicast. A concern that has spanned decades since the 1980s is the exhaustion of available IP addresses. The most visible solution is to migrate to IPv6, since IPv6 extends the multicast address space to 112 bits. However, migration has proved to be a challenge in itself, and total Internet adoption of IPv6 is unlikely to occur for many years [37]. Thus, IPv6 will not likely become ubiquitous any time soon.
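The Class D range and its 28-bit size can be verified directly with Python's standard ipaddress module; this is a quick illustrative check, not part of the thesis.

```python
import ipaddress

# The IPv4 multicast (Class D) block 224.0.0.0/4 spans
# 224.0.0.0-239.255.255.255: 4 fixed prefix bits leave
# 32 - 4 = 28 bits of group address space, i.e., 2**28 groups.

multicast_block = ipaddress.ip_network("224.0.0.0/4")
print(multicast_block.num_addresses == 2**28)                      # True
print(ipaddress.ip_address("239.255.255.255") in multicast_block)  # True
print(ipaddress.ip_address("192.0.2.1") in multicast_block)        # False
```

By contrast, the IPv6 multicast format reserves a 112-bit group ID field, the extension mentioned above.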
Application-level multicast has been proposed as an alternative form of group communication [38, 39, 40, 41]. Application-level multicast moves the multicast functionality to the application level, by establishing a multicast network at the application level rather than handling it at the IP level. Application-level multicast does not have IP multicast's address assignment problems, because any host can create any number of globally unique group names.
Recent works on structured P2P (Peer-to-Peer) overlay networks offer scalability and robustness for the advertisement and discovery of services. Pastry [42], Chord [43], CAN [44] and Tapestry [45] represent typical P2P routing and location schemes. Furthermore, there have been a number of works reported on overlay multicast, e.g., Scribe [46], CAN-Multicast [47], Bayeux [41], YOID [19], TBCP [48], HMTP [49], NICE [50], Overcast [51] and ZIGZAG [52]. Each uses a different overlay network to
implement application-level multicast, using either flooding (CAN-Multicast) or tree-building (Scribe, Bayeux, YOID, TBCP, HMTP, NICE, Overcast and ZIGZAG). The flooding approach creates a separate overlay network per multicast group and broadcasts messages within the overlay. The tree approach uses a single overlay and builds a tree topology first [53, 54]. In [54], the results showed that the tree-based approach of Scribe consistently outperforms the flooding approach of CAN-Multicast. Compared to IP multicast, application-level multicast has a number of advantages. First, a major advantage is that most proposals do not require any special support from network routers, avoid the major problem of the limited IP multicast address space, and can therefore be deployed universally. Second, the deployment of application-level multicast is easier than that of IP multicast. Third, the P2P overlay network is fully decentralized.
However, application-level multicast also suffers three disadvantages. First, because application-level multicast is implemented at the host level and the underlying physical topology is hidden, even with topology-awareness [54], application-level multicast still increases the delay to deliver messages compared with IP multicast. A node's neighbours on the overlay network need not be topologically nearby on the underlying IP network. This can lead to inefficient routing, because every application-level hop could potentially be between two geographically distant nodes. Second, in shared-tree multicast, the multicast tree is built at the application level and the RP is the root of the multicast tree. Subsequently, every member that joins the multicast group becomes a node of the multicast tree. A sender sends data to the RP, and the RP forwards the data along the multicast tree to all members. The RP can potentially be subjected to overloading and a single point of failure. For example, in the Real-Time Conferencing Protocol, providing a multicast back-channel for a group of receivers
will result in a serious source implosion problem [53]. Third, routing in the P2P overlay network does not consider the load on the network. It treats every peer as having the same power and the same bandwidth, and thus may cause overloading or congestion on links with low bandwidth.
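The delay penalty of overlay routing (the first disadvantage above) is often quantified as "stretch": the ratio of the delay along application-level hops to the direct IP-level delay. The sketch below is illustrative only; the per-link latencies and node names are invented for the example.

```python
# Illustrative computation of overlay delay stretch: a message from S
# to D relayed via a geographically distant overlay peer X, compared
# with the direct IP path. Latencies are made-up values in ms.

def path_delay(latency, path):
    """Sum per-link latencies along an ordered node path."""
    return sum(latency[(a, b)] for a, b in zip(path, path[1:]))

latency = {
    ("S", "X"): 40, ("X", "D"): 45,  # overlay hops via distant peer X
    ("S", "D"): 20,                  # direct IP-level path
}
overlay = path_delay(latency, ["S", "X", "D"])  # 85 ms
direct = path_delay(latency, ["S", "D"])        # 20 ms
print(overlay / direct)  # 4.25x stretch
```

A stretch well above 1 like this is exactly what topology-aware overlay construction (and, in this thesis, DINPeer's integration of the DINloop_App with the overlay) tries to reduce.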
1.1.3 Scalability of IP Inter-domain Multicast
A nagging problem with IP multicast is large IP multicast routing tables. This problem also exists in inter-domain multicast.
Multicast Source Discovery Protocol (MSDP) and Border Gateway Multicast Protocol (BGMP) were developed for inter-domain multicast. However, MSDP requires each multicast router to maintain forwarding state for every multicast tree passing through it, and the number of forwarding states grows with the number of groups [55, 56, 57], which severely limits routing scalability. In addition, MSDP floods source information periodically to all other RPs on the Internet using TCP links between RPs [20]. If there were thousands of multicast sources, the number of Source Active (SA) messages being flooded around the network would increase linearly. Thus, the MSDP multicast protocol suffers from routing scalability and control overhead problems.
BGMP scales better to large numbers of groups by allowing (*, G-prefix) and (S-prefix, G-prefix) states to be stored at routers where the list of targets is the same. To achieve this, the Multicast Address-Set Claim (MASC) protocol must form the basis of a hierarchical address-allocation architecture [16]. MASC uses a listen-and-claim approach with collision detection. This approach has two drawbacks: first, it is not supported by the present non-hierarchical address-allocation architecture, as MASC/BGMP is not ready yet [58]; and second, the
claimers have to wait for a suitably long period to detect any collision, i.e., 48 hours [16], and hence the approach is not suitable for dynamic setup.
In brief, the current infrastructure for global, Internet-wide multicast routing faces routing-scalability problems, in terms of large IP multicast routing tables, as well as control-overhead problems. This probably explains why there is still no wide-scale multicast on the Internet [58].
Significant research efforts have focused on overcoming large IP multicast routing tables. Some schemes attempt to reduce forwarding state by tunnelling [59] or by forwarding-state aggregation [60]; both attempt to aggregate routing state after it has been allocated to groups. Chu et al. [38] use network-transparent multicast to completely eliminate multicast state at routers, and thus push complexity
to the end points. Other architectures aim to eliminate forwarding state at routers, either completely, by explicitly encoding the list of destinations in the data packets instead of using a multicast address [61], or partially, by keeping state only at branching-node routers in the multicast tree [62, 63]. In aggregated multicast [55], multiple multicast groups share one aggregated tree to reduce forwarding state, and a centralized tree manager is introduced to handle aggregated-tree management and the matching between multicast groups and aggregated trees. Aggregated multicast is targeted at intra-domain multicast, and its centralized tree manager is a weakness since it can potentially be subject to overloading and single-point failure.
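The idea of encoding the destination list in the packet itself, as in [61] (in the style of Xcast), can be sketched as follows. The router names and the unicast routing table here are hypothetical:

```python
from collections import defaultdict

# Xcast-style forwarding sketch: instead of keeping multicast state at
# routers, the packet carries its own destination list. Each router
# partitions that list by unicast next hop and sends one copy of the
# packet per next hop, each carrying only the destinations it serves.
def forward(dest_list, unicast_next_hop):
    copies = defaultdict(list)
    for dest in dest_list:
        copies[unicast_next_hop[dest]].append(dest)
    return dict(copies)  # next_hop -> destinations carried in that copy

# Hypothetical unicast routing table at one router.
table = {"H1": "R2", "H2": "R2", "H3": "R3"}
print(forward(["H1", "H2", "H3"], table))
# One copy toward R2 (for H1, H2) and one toward R3 (for H3)
```

The trade-off is visible in the sketch: routers stay stateless, but the packet header grows with the group size, which is why such schemes suit small groups.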
1.1.4 Summary of Survey
Multicast is still an active research area [53], as a number of challenges remain. We have identified three in this survey. First, a fundamental problem is that IP multicast works only across space, not across time. The replication-and-forwarding
approach proposed for multicast across time has a number of problems, such as limited cache or buffer size, outdated data, and inconsistency among the multiple copies of data buffered in different places. Another example of multicast across time is server-based Staggered Multicast video-on-demand. However, Staggered Multicast suffers from a scalability limitation in the number of concurrent video streams/channels and from rather large startup latency. Second, a major problem is that the IP multicast address space
is too small. Application-level multicast has been proposed as an alternative form of group communication because it does not have IP multicast's address-assignment constraints. However, application-level multicast increases the delay to deliver messages because the underlying physical topology is hidden, and in shared-tree multicast the RP can potentially be subject to overloading and single-point failure. Finally, current inter-domain multicast faces one nagging problem
in managing large IP inter-domain multicast routing tables.
1.2.1 Background
As networks face increasing bandwidth demand, network providers are moving towards a crucial milestone in network evolution: the optical network. Optical networks are high-capacity telecommunications networks based on optical technologies and components, providing routing, grooming, and restoration at the wavelength level as well as wavelength-based services. With recent advances in optical technologies such
as Dense Wavelength Division Multiplexing (DWDM) [64], it is now possible to greatly increase the bandwidth of an optical fibre by transmitting over a number of wavelengths in a single fibre. In fact, over the last decade, transmission capacity has been doubling every nine months, twice as fast as the Moore's-Law rate for semiconductor processing speed [65].
Current commercial optical systems can transmit up to 2 Tbps, and laboratory experiments have demonstrated transmission rates of 10 Tbps [66]. Scientists from Bell Labs have calculated the theoretical limit on the carrying capacity of a single optical fibre: 100 Tbps [67]. Therefore, it is likely that there will be enough bandwidth for normal communication in the near future. Furthermore, so much bandwidth will not always be fully utilized; in fact, a bandwidth glut already exists in the long-haul market [68]. Thus, DIN technology is feasible, given the availability of the large bandwidth of optical fibre and advances in the processing power of the CPU.
The idea of storing data in networks has been adopted in several research prototype systems. The fibre-loop memory [69] was proposed to resolve packet-contention problems: the contending packet circulates in the fibre loop until the optical network resources (e.g., receiver, switches, or wavelength) become available [69, 70].
As presented by Carrera and Bianchini [71], the optical ring was used as a cache (called OWCache, optical write cache) of disks in a multiprocessor, so that the time
of writing and reading data can be reduced. All these systems store data within the system, where it appears as a local cache, whereas DIN technology uses the DINloop as a networked wide-area cache.
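The amount of data such a circulating loop can hold is its bandwidth-delay product: the line rate multiplied by the loop's propagation delay. A sketch with assumed figures (a 10 Gb/s wavelength, a 2,000 km loop, and light travelling at roughly 2×10^8 m/s in fibre):

```python
# Storage capacity of a data-circulating fibre loop is its
# bandwidth-delay product: line_rate * (loop_length / v_fibre).
V_FIBRE = 2e8  # approx. speed of light in fibre, m/s

def loop_capacity_bits(line_rate_bps, loop_length_m):
    propagation_delay_s = loop_length_m / V_FIBRE
    return line_rate_bps * propagation_delay_s

# Assumed figures: a 10 Gb/s wavelength over a 2,000 km loop.
bits = loop_capacity_bits(10e9, 2_000e3)
print(f"{bits / 8 / 1e6:.1f} MB in flight")  # prints "12.5 MB in flight"
```

Multiplying across many DWDM wavelengths is what brings the loop's aggregate capacity toward the terabit figures cited above.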
More recently, CANARIE, Canada's advanced Internet development organization, announced a research project, Wavelength Disk Drives (WDD), to use a multi-wavelength optical network as a disk drive [72]. WDD attempts to store data in DWDM networks across Canada to facilitate a new content-based-messaging distributed-computing paradigm, so that the bottleneck of accessing servers can be avoided.
All the above research shows the applicability of storing data by circulating it in networks. Our proposed DIN adopts a network structure similar to WDD's. However, DIN has some key differences from WDD and extends its scope by forming the DINloop at the network layer and the application layer respectively. A detailed comparison
of DIN, WDD, and OWCache is shown in Table 1.1.
1.2.2 DIN-based Solutions
From the general concept of DIN technology described above, we derive two basic mechanisms, DINloop_Net and DINloop_App, and further realize them in specific solutions, Video-In-Network (VIN), DINCast, DINPeer, and (G)MPLS-based DINloop_Net, to address the three problems with multicast communications.
- DINloop_Net: DIN in the Network Layer
DINloop_Net applies the DIN concept at the network layer: multiple DIN Nodes, which are core routers in a high-speed core network, are configured to form a special logical loop (thick arrow line in Figure 1.1). The direction of DINloop_Net can
Trang 36be clockwise or clockwise In Figure 1.1, the direction of DINloop_Net is clockwise as shown by the arrows
anti-Table 1.1 Comparison of DIN, WDD & OWCache
Data circulated:
- WDD: circulates data in DWDM networks across Canada.
- DIN: circulates data on the core optical network as a networked wide-area cache; circulates highly dynamic/important data in the DIN at the application layer.

Capacity:
- DIN: can be in the order of terabits at the network layer, depending on the length and capacity of the optical fibre; in the order of megabytes to gigabytes at the application layer.

Topology:
- WDD/OWCache: restricted to the physical topology.
- DIN: can also set up the logical loop (DINloop_App) at the application layer.

Technology for data circulation:
- OWCache: the fixed receiver and transmitter in the OWCache interface are used to re-circulate the data on the writable-cache channel.
- WDD: uses UDP only at the application layer and transmits in WDM networks.
- DIN: circulates the optical signal in the DINloop_Net at the network layer; the DINloop_App at the application layer uses UDP.

Data processing at nodes:
- OWCache: the OWCache interface processes in the electrical domain; traffic is converted from the optical domain to the electrical domain and back (O-E-O).
- WDD: every node processes in the electrical domain; traffic is converted O-E-O.
- DIN: in the DINloop_Net at the network layer, the circulating function is always processed in the optical domain without O-E-O conversion; only the read/write function requires O-E-O conversion. In the DINloop_App at the application layer, data processing at nodes is in the electrical domain.
Figure 1.1 DINloop_Net in the network layer
- DINloop_App: DIN in the Application Layer
DINloop_App applies the DIN concept at the application layer: multiple DIN Nodes, which are hosts, form a special logical loop at the application layer. An example of DINloop_App is shown by the solid arrow line in Figure 1.2.
Figure 1.2 An example of DINloop_App at the application layer
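The hand-off of data around a DINloop_App can be illustrated with an in-memory sketch. The real loop forwards datagrams over UDP between hosts; here the hand-off is a plain function call, and the class and function names are invented for illustration:

```python
# In-memory sketch of a DINloop_App-style logical loop: each host
# forwards the payload to its successor, so the data keeps circulating
# and any host can read it as it passes by. (The real DINloop_App
# forwards UDP datagrams; this stand-in only models the hand-off.)
class LoopNode:
    def __init__(self, name):
        self.name = name
        self.last_seen = None  # the most recent payload this host read

def circulate(nodes, payload, hops):
    """Hand the payload around the logical loop for `hops` hops and
    return the node that holds it next."""
    n = len(nodes)
    for i in range(hops):
        nodes[i % n].last_seen = payload  # each host reads the passing data
    return nodes[hops % n]

hosts = [LoopNode(f"H{i}") for i in range(4)]
nxt = circulate(hosts, "update-v1", hops=4)  # one full revolution
print(all(h.last_seen == "update-v1" for h in hosts))  # prints True
```

After one full revolution every host has seen the payload, which is the property the DIN-based multicast solutions below rely on.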
In the following subsections, we briefly introduce each specific solution. The details of our solutions are presented in the subsequent chapters of this thesis.
1.2.2.1 VIN: Video-In-Network
The Video-In-Network (VIN) distribution model is a new DIN-based video distribution model that uses DINloop_Net to demonstrate IP multicast across both space and time. In the VIN system, only one copy of the video data is continuously circulated in the DINloop_Net configured in the core optical network. In other words, VIN uses the DINloop_Net as a networked wide-area cache and exploits the propagation delay to "buffer" data in the DINloop_Net. Thus, VIN can provide real-time updates and offline access at any time, since the data in the DINloop_Net is persistent. VIN supports late-joining users [73] under highly dynamic multicast membership and still achieves data consistency. In this way, DIN technology solves the fundamental problem of IP multicast across both time and space.
As the video data circulates continuously in the network at the speed of light, it is available almost instantaneously to any interface that requires it. The VIN distribution model can therefore serve more concurrent video streams with less startup latency than Staggered Multicast video-on-demand, the typical example of multicast across time. Details of VIN are provided in Chapters 2 and 3.
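The startup-latency baseline being compared against follows from the standard staggered-broadcast arithmetic: with C channels carrying a video of length L at starting offsets of L/C, a viewer arriving at a random time waits at most L/C, and on average L/(2C), for the next start. A sketch with assumed figures:

```python
# Staggered multicast sketch: a video of length L minutes is started
# on C channels at offsets of L/C minutes, so a viewer arriving at a
# random time waits at most L/C (on average L/(2C)) for the next start.
def staggered_wait_minutes(video_len_min, channels):
    phase = video_len_min / channels
    return phase, phase / 2  # (worst-case wait, average wait)

# Assumed figures: a 2-hour movie on 10 staggered channels.
worst, avg = staggered_wait_minutes(120, 10)
print(worst, avg)  # prints "12.0 6.0"
```

Halving the wait requires doubling the channel count, which is the scalability limitation in concurrent streams/channels noted above.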
1.2.2.2 DINCast
DINCast is a scheme that uses DINloop_App to optimize application-level multicast based on a shared tree [74, 75]. In DINCast, the DINloop_App is implemented at the application level over an existing application-level shared multicast tree. DINCast is shown to reduce multicast delay, and the use of multiple DIN Nodes also avoids overloading and single points of failure. Details of DINCast are presented in Chapter 4.
1.2.2.4 (G)MPLS-based DINloop_Net
(G)MPLS-based DINloop_Net implements the DINloop_Net using MPLS/GMPLS techniques in the core network and solves the nagging problem of large IP inter-domain multicast routing tables. Details of (G)MPLS-based DINloop_Net are presented in Chapter 6.
1.3 Summary & Research Contributions
Our proposed DIN technology and its specific solutions to the existing technical challenges facing multicast are illustrated in Figure 1.3, which also shows the relationship among the various solutions.
In this thesis, we have contributed to multicast research in the following aspects:
(1) DIN technology as realized in VIN solves the fundamental problem of IP multicast across both time and space. VIN supports highly dynamic data and highly dynamic multicast membership with late-joining users. This solution is
able to accommodate a large number of distribution applications that existing
IP multicast cannot handle and was never meant to handle. We use multicast video-on-demand as an example and develop a mathematical model based on queuing theory to demonstrate the effectiveness of DIN technology, which results
in the new VIN service model [76, 77]. The number of concurrent streams in VIN can be twice that of Staggered Multicast, and the startup latency of VIN can be one third of that of Staggered Multicast.
Figure 1.3 DIN technology
(2) DINCast and DINPeer are proposed to overcome the shortcomings of current application-level multicast solutions. DINCast is able to reduce communication delay compared to the original shared-tree multicast and also