DISTRIBUTED NETWORK SYSTEMS
Managing Editors:
Ding-Zhu Du
University of Minnesota, U.S.A.
Cauligi Raghavendra
University of Southern California, U.S.A.
Network Theory and Applications
Volume 15
DISTRIBUTED NETWORK SYSTEMS
From Concepts to Implementations
eBook ISBN: 0-387-23840-9
Print ISBN: 0-387-23839-5
Print © 2005 Springer Science + Business Media, Inc.
All rights reserved
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher.
Created in the United States of America
Boston
©2005 Springer Science + Business Media, Inc.
Visit Springer's eBookstore at: http://ebooks.springerlink.com
and the Springer Global Website Online at: http://www.springeronline.com
1.2.3 Network Fault Tolerance
1.3 Protocols and QoS
1.4 Software for Distributed Computing
1.4.1 Traditional Client-Server Model
1.4.2 Web-Based Distributed Computing Models
1.4.3 Web-based Client-Server Computing
1.5 The Agent-Based Computing Models
1.6 Summary
Exercises
Chapter 2 Modelling for Distributed Network Systems: The Client-Server Model
2.1 Issues Leading to the Client-Server Model
2.2 The Client-Server Model in a Distributed Computing System
2.2.1 Basic Concepts
2.2.2 Features and Problems of the Client-Server Model
2.3 Cooperation between Clients and Servers
2.3.1 Cooperation Type and Chained Server
2.3.2 Multiple Servers
2.4 Extensions to the Client-Server Model
2.4.1 Agents and Indirect Client-Server Cooperation
2.4.2 The Three-Tier Client-Server Architecture
2.5 Service Discovery
2.5.1 Hardwiring Computer Address
2.5.2 Broadcast Approach
2.5.3 Name Server Approach
2.5.4 Broker-Based Location Lookup
2.6 Client-Server Interoperability
2.7 The Relationship
2.8 Summary
3.2.2.1 Basic Message-Passing Primitives
3.2.2.2 Direct and Indirect Communication Ports
3.2.2.3 Blocking versus Non-blocking Primitives
3.2.2.4 Buffered versus Unbuffered Message Passing Primitives
3.2.2.5 Unreliable versus Reliable Primitives
3.2.3 Structured Forms of Message-Passing Based Communication
3.3 Remote Procedure Calls
3.3.1 Executing Remote Procedure Calls
3.3.2 Basic Features and Properties
3.3.3 Parameters and Results in RPCs
3.3.3.1 Representation of Parameters and Results
3.3.3.2 Marshalling Parameters and Results
3.3.4 Client Server Binding
3.4 Message Passing versus Remote Procedure Calls
3.5 Group Communication
3.5.1 Basic Concepts
3.5.1.1 Group Structures
3.5.1.2 Behaviour Classification of Process Groups
3.5.1.3 Closed and Open Groups
3.5.2 Group Membership Discovery and Operations
3.6 Distributed Shared Memory
3.6.1 What is a Distributed Shared Memory (DSM) System?
3.6.2 Design and Implementation Issues
3.6.3.1 Sequential Consistency Model
3.6.3.2 Weak Consistency Model
3.6.3.3 Release Consistency Model
3.6.3.4 Discussion
3.7 Summary
Exercises
Chapter 4 Internetworking
4.1 Communication Protocol Architectures
4.1.1 The OSI Protocol Architecture
4.1.2 Internet Architecture
4.2 TCP/IP Protocol Suite
4.2.1 Communication Protocols
4.2.2 Network Layer Protocol: IP
4.2.2.1 IP Address
4.2.2.2 Domain Name System
4.2.3 Transport Layer Protocol: TCP and UDP
4.3 The Next Generation Internet Protocol: IPv6
5.1 Developing Distributed Applications Using Message Passing
5.1.1 Communication Services in Message Passing
5.1.1.1 Connection-Oriented and Connectionless Communications
5.3 Basic Socket System Calls
5.3.1 Some Special Functions
5.4.1 Using Stream Sockets: A Simple Example
5.4.2 Using Datagram Sockets: A Simple Example
5.5 Summary
Exercises
Chapter 6 TCP/UDP Communication in Java
6.1 Java Sockets
6.1.1 Java Net Package
6.1.2 The Socket Class
6.1.3 The ServerSocket Class
6.2 Building TCP Clients and Servers
6.2.1 Essential Components of Communication
6.2.2 Implementing a TCP Client Program
6.2.3 Implementing a TCP Server Program
6.3 Examples in Java
6.3.1 Exchange of Multiple Messages
6.3.2 Executing the Programs on Internet Hosts
6.3.3 Supporting Multiple Clients
6.4 A More Complex Example - A Java Messaging Program using TCP
6.4.1 The Design
6.4.2 The Implementation
6.4.3 The Programs
6.5 Datagram Communications in Java
6.5.1 Why Datagram Communication?
6.5.2 Java Datagram-based Classes
6.6 Building UDP Servers and Clients
6.6.1 Sending and Receiving UDP Datagrams
6.6.2 Datagram Server
6.6.3 Datagram Client
6.7 Summary
Exercises
Chapter 7 Interprocess Communication using RPC
7.1 Distributed Computing Environment (DCE)
7.1.1 The Architecture of DCE
7.4.3 The SRPC System Architecture
7.4.3.1 The System Library
7.4.3.2 The Location Server
7.4.4 The Stub and Driver Generator
7.4.4.1 Syntax
7.4.4.2 Semantics
7.4.5 Implementation
7.4.6 An Application Example
7.5 Remote Method Invocation (RMI)
7.5.1 RMI Architecture
7.5.2 RMI Implementation
7.5.3 Interfaces and Classes
7.6 An Interesting RMI Application
7.7 Summary
Exercises
Chapter 8 Group Communications
8.1 Introduction
8.2 Features of Group Communication
8.2.1 Message Delivery Semantics
8.2.2 Message Response Semantics
8.2.3 Message Ordering in Group Communication
8.3 Reliable Multicast Protocol
8.3.1 Reliable Multicast System
8.6 Total Ordered Multicast Protocol based on a Logical Ring
8.6.1 Achieving Total Ordering
8.6.2 Atomic Message Delivery
8.7.1 System Structure and Communication Assumptions
8.7.2 State Machine Approach for Implementing RMP
8.7.3 Message Packet and Control Information
9.2.1 Redundancy
9.2.2 Fault Avoidance Techniques
9.2.3 Fault Detection Techniques
9.2.4 Fault Tolerance Techniques
9.3 Software Fault Tolerance
9.3.1 Techniques for Software Fault-tolerance
9.5 Fault Tolerant Distributed Algorithms
9.5.1 Distributed Mutual Exclusion
9.5.2 Election Algorithms
9.5.3 Deadlock Detection and Prevention
9.5.3.1 Distributed Deadlock Detection
9.5.3.2 Distributed Deadlock Prevention
9.6 Replication and Reliability
9.7 Replication Schemes
9.7.1 Case Study 1: The Primary-Backup Scheme
9.7.2 Case Study 2: The Active Replication Scheme
9.7.3 Case Study 3: Two Particular Replication Schemes
9.8 The Primary-Peer Replication Scheme
9.8.1 Description of the Scheme
10.1.1 What is a Secure Network?
10.1.2 Integrity Mechanisms and Access Control
10.5 Distributed Denial of Service Attacks
10.5.1 Launching a DDoS Attack
10.5.2 Evolution of DDoS Attacks
10.5.3 Classification of DDoS Attacks
10.5.4 Some Key Technical Methods of DDoS Tools
10.6 Passive Defense against DDoS Attacks
10.6.1 Passive Defense Cycle
10.6.2 Current Passive Defense Mechanisms
10.6.3 Detecting Mechanisms
10.6.4 Reacting Mechanisms
10.6.5 SYN Attacks and Its Countermeasures
10.6.6 Limitation of Passive Defense
10.7 Active Defense against DDoS Attacks
10.7.1 Active Defense Cycle
10.7.2 Objectives of Active Defense
10.7.3 Current Techniques Applicable in Active Defense
10.7.4 Comparison between Passive and Active Defense
10.7.5 Major Challenges of Active Defense
11.2 The Reactive System Model
11.2.1 The Generic Reactive System Architecture
11.2.3 Simple and Composite Entities
11.3 Group Communication Services
11.3.1 Ordering Constraints
11.3.2 Fault Tolerance in the Reactive System
11.3.3 Atomic Multicast Service
11.3.4 Membership Management
11.4 Implementation Issues
11.4.1 Multicast Datagram Communication
11.4.2 Stream-based Communication
11.4.3 Total Ordering Protocol
11.4.4 Multicasting Atomicity Protocol
12.3.2 Generation 1 (Traditional Web): HTML, HTTP, CGI
12.3.3 Generation 2 (Faster and More Interactive Web): JavaScript, Server-side API
12.3.4 Generation 3 (Java-based Web): Java, JDBC
12.3.4.1 JAVA and JDBC
12.3.4.2 Servlet
12.3.5 A New Generation: XML, Client/Mobile Agents/Server
12.3.5.1 XML-based WBDB
12.3.5.2 Mobile Agent Involved Architecture
12.3.6 Other Useful Techniques
12.6 Developing Web-Based Databases
12.6.1 The Java Database Connectivity (JDBC) Package
12.6.2 Steps for Developing Web-based Databases
12.6.2.1 Preparing the Database
12.6.2.2 Creating the Database Tables
12.6.2.3 Populating the Tables
12.6.2.4 Printing the Columns of Tables
12.6.2.5 Select Statements (one table)
13.3 Agent Advertisement and Solicitation
13.3.1 Foreign Agent and Home Agent
13.3.2 Mobile Node Considerations
13.3.3 Move Detection
13.3.4 Returning Home
13.4 Registration
13.4.1 Registration Overview
13.4.2 Responses to Registration Request and Authentication
13.4.3 Registration Related Message Format
13.4.3.1 Registration Request
13.4.3.2 Registration Reply
13.5 Mobile Routing (Tunnelling)
13.5.1 Packet Routing when Mobile Node is at Home
13.5.2 Packet Routing when Mobile Node is on a Foreign Link
13.5.2.1 Unicast Datagram Routing
13.5.2.2 Multicast Datagram Packets Routing
13.5.3 Mobile Routers and Networks
13.6 Case Study: Mobile Multicast using Anycasting
13.6.1 Problems with Mobile IP
13.6.2 Mobile Multicast Protocol (MMP)
Chapter 14 Distributed Network Systems: Case Studies
14.1 Distributed File Systems
14.1.1 What is a Distributed File System
14.1.2 A Distributed File System Example NFS
14.1.3 Processing User Calls
14.1.4 Exporting Files
14.1.5 The Role of RPC
14.1.6 Remarks
14.2 Network Operating Systems: Unix/Linux
14.2.1 UNIX System Concepts
14.2.1.1 The File System
14.2.1.2 Process Management
14.2.1.3 The Shell
14.2.2 The UNIX Processes
14.2.2.1 Process Address Spaces
14.2.2.2 Process Management System Calls
14.2.2.3 Process Context and Context-Switching
14.2.3 Linux as a UNIX Platform
14.3.2 The CORBA Architecture
14.3.3 Interface Definition Language (IDL)
14.3.4 An Example of CORBA for Java
14.4 DCOM
14.4.1 COM and DCOM
14.4.2 DCOM Facilities and Services
15.1.1 Cluster Operating Systems
15.1.2 Reliable Server Clusters
15.2 Grid Computing
15.2.1 What is Grid Computing?
15.2.2 Background to the Grid
15.2.3 Grid Architectures and Infrastructures
15.2.3.1 Grid Architectures
15.2.3.2 Grid Components
15.2.4 Layered Grid Architecture: The Globus Architecture
15.2.5 Virtual Machine Environment: The Legion Architecture
15.2.6 Cycle Scavenging Schemes: The Condor System
15.2.7 Data Grids
15.2.7.1 Kangaroo
15.2.7.2 Legion
15.2.7.3 Storage Resource Broker
15.2.7.4 Globus Data Grid Tools
15.2.8 Research Issues and Challenges for Grids
15.2.8.1 Software Engineering Problems
15.2.8.2 Load Balancing and Scheduling
15.2.8.3 Autonomic Computing
15.2.8.4 Replication
15.3 Peer-to-Peer (P2P) Computing
15.3.1 What is Peer-to-Peer Computing?
15.3.2 Possible Application Areas for P2P Systems
15.3.3 Some Existing P2P Projects
15.3.4 P2P File Sharing and its Legal implications
15.3.4.1 P2P File Sharing Systems
15.3.4.2 Legal implications for P2P File Sharing
15.3.5 Some Challenges for P2P Computing
15.4 Pervasive Computing
15.4.1 Pervasive Computing Characteristics
15.4.2 Elite Care: An Application Using Pervasive Computing
15.4.3 The Challenges for Pervasive Computing
Both authors have taught the course “Distributed Systems” for many years in their respective schools. During this teaching, we have come to feel strongly that distributed systems have evolved from traditional LAN-based systems towards Internet-based systems. Although many excellent textbooks exist on this topic, the fast development of distributed systems and of network programming and protocols has made it difficult to find an appropriate textbook for a distributed systems course oriented to the requirements of undergraduate-level study of today’s distributed technology: specifically, one that runs from up-to-date concepts, algorithms, and models to implementations, for both distributed system design and application programming.
Thus the philosophy behind this book is to integrate the concepts, algorithm designs and implementations of distributed systems based on network programming. After using material from several other textbooks and research books, we found that many texts treat concepts, algorithm design and network programming separately, which makes it very difficult for students to map the concepts of distributed systems to algorithm design, prototyping and implementation.
This book intends to enable readers, especially postgraduates and senior undergraduates, to study up-to-date concepts, algorithms and network programming skills for building modern distributed systems. It enables students not only to master the concepts of distributed network systems but also to readily apply the material introduced in implementation practice.
The book takes an integrated approach, viewing a distributed system as a set of programming blocks cooperating on distributed sites. The primary objective of concept, design and implementation is to meet the requirements of distributed applications based on the networking environment. In this book, networking and distribution design for applications are represented in the form of several dimensions. Therefore, the book describes distributed systems along a line from the general distributed system requirements of applications to system transparency that reflects system structure, algorithm designs and implementation techniques.
The striking features of the book, which distinguish it from others, can be illustrated from two basic aspects:
(1) The viewpoint of applications, i.e., what kinds of concepts and programming skills are suited to the design of distributed systems and applications.
(2) The viewpoint of system designers and implementers, i.e., the system layers and their mapping to the design of distributed algorithms and their implementations.
The book not only covers the basic distributed system and network protocols (such as RPC, group communication and Mobile IP), but also discusses recent technology developments for the Internet, such as IP for the next generation (IPv6, and multicast and anycast communication). As Web/Java technology is becoming increasingly important and popular, this book illustrates how distributed systems and network protocols can be designed and implemented with distributed system concepts and network programming in today’s Internet environment.
The book is composed of 15 chapters. Most chapters contain substantial material about concepts, algorithm designs and implementation techniques. The outline of the book is given below.
Chapter 1 Overview of Distributed Systems: This chapter outlines the basic concepts of distributed systems and computer networks, such as their purposes, characteristics, advantages, and limitations, as well as their basic architectures, networking and applications.
Chapter 2 introduces the client-server model and its role in the development of distributed network systems. The chapter discusses the cooperation between clients and servers/group servers in distributed network systems, and addresses extensions to the client-server model. Service discovery, which is of crucial importance for achieving transparency in distributed network systems, is also elaborated in this chapter.
Chapter 3: Communication is an important issue in distributed computing systems. This chapter addresses the communication paradigm of distributed network systems, i.e., issues about how to build the communication model for these systems.
Chapter 4 Internetworking: Network software is arranged in a hierarchy of layers. Each layer presents an interface to the layers above it that extends the properties of the underlying communication system. Network functions are achieved through the layered protocols. This chapter discusses the communication protocols in a network, especially the TCP/IP protocols used on the current Internet. The next generation of the Internet protocol, IPv6, is also addressed in the chapter.
Chapter 5 Interprocess Communication using Message-Passing: Processes in a distributed network system normally do not share common memory. Therefore, message-passing is one of the effective communication mechanisms between these processes. In this chapter we discuss the most commonly used message-passing based interprocess communication mechanism, i.e., the socket API.
Chapter 6 TCP/UDP Communication in Java: In this chapter we address TCP/UDP programming in Java, since the Java language is currently the most commonly used language for implementing distributed computing systems. Java provides reliable stream-based communication for TCP as well as unreliable datagram communication for UDP.
Chapter 7 Interprocess Communication using RPC: When using message-passing for interprocess communication, a programmer is aware of the passing of messages between the two processes. In a remote procedure call, however, the passing of messages is invisible to the programmer. Instead, a language-level concept, the procedure call, is used to mask the actual communication between two processes. In this chapter we discuss two commonly used RPC tools, DCE/RPC and SUN/RPC. We have also developed an RPC tool, called the Simple RPC tool, which is described in the chapter. The idea of RPC has been extended to develop interprocess communication mechanisms for the object-oriented paradigm, notably Remote Method Invocation (RMI) in Java. We also introduce this mechanism in the chapter.
Chapter 8 Group Communications: Group communication is highly desirable for maintaining a consistent state in distributed systems. Many existing protocols are quite expensive and of limited benefit for distributed systems in terms of efficiency. This chapter describes the concepts and design techniques of a group communication protocol, including message ordering, dynamic assessment of membership and fault tolerance. The protocol ensures total ordering of messages and atomicity of delivery in the presence of communication failures and site failures, and guarantees that all operational members belonging to the same group observe a consistent view of ordered events. The dynamic membership and failure recovery algorithms can handle site failures and recovery, group partitions and merges, and the dynamic joining and leaving of members.
Chapter 9 Reliability and Replication Techniques: A computer system, or a distributed system, consists of many hardware/software components that are likely to fail eventually. In many cases, such failures may have disastrous results. With the ever-increasing dependency being placed on distributed systems, the number of users requiring fault tolerance is likely to increase. The design and understanding of fault-tolerant distributed systems is a very difficult task. We have to deal not only with all the complex problems of distributed systems when all the components are working well, but also with the more complex problems that arise when some of the components fail. This chapter introduces the basic concepts and techniques that relate to fault-tolerant computing.
Chapter 10 Security: There is a pervasive need for measures to guarantee the privacy, integrity and availability of resources in distributed network systems. Designers of secure distributed systems must cope with exposed service interfaces and insecure networks in an environment where attackers are likely to have knowledge of the algorithms used and to deploy computing resources. In this chapter we discuss the security issues of distributed network systems, such as integrity mechanisms and encryption techniques, and in particular the techniques for defense against Distributed Denial-of-Service attacks.
Chapter 11 A Reactive System Architecture for Fault-Tolerant Computing: Most fault-tolerant application programs cannot cope with constant changes in their environments and user requirements because they embed fault-tolerant computing policies and mechanisms together, so that if policies or mechanisms are changed the whole program has to be changed. This chapter presents a reactive system approach to overcoming this limitation. The reactive system concept is an attractive paradigm for system design, development and maintenance because it separates policies from mechanisms. In the chapter we propose a generic reactive system architecture and use group communication primitives to model it. We then implement it as a generic package, which can be applied in any distributed application. The system performance shows that it can be used in a distributed environment effectively.
Chapter 12 Web-Based Databases: The World Wide Web has changed the way we do business and research. It also brings a lot of challenges, such as infinite contents, resource diversity, and the maintenance and updating of contents. The Web-based database (WBDB) is one of the answers to these challenges. In this chapter, we classify WBDB architectures into three types according to WBDB access methods: two-tier architecture, three-tier architecture, and hybrid architecture. Then the existing technologies used in WBDB are introduced as various generations, i.e., the traditional Web (generation 1), the faster and more interactive Web (generation 2), the Java-based Web (generation 3), and a new generation combining the techniques of XML and mobile agents. Based on this introduction, we present the challenges and some solutions for current WBDB. Finally we outline a future framework of WBDB.
Chapter 13 Mobile Computing: Mobile computing requires wireless communication, mobility and portability. In the past few years, we have seen an explosion of mobile devices around the world, such as notebooks, multimedia PDAs and mobile phones. The rapidly expanding markets for cellular voice and limited data services have created a great demand for mobile communication and computing. Mobile communications applications include mobile computing and wireless communications. Many of the advances in communications involve the use of Internet Protocol (IP), Asynchronous Transfer Mode (ATM), and ad hoc network protocols. Recently much focus has been directed at advancing communication technology in the area of mobile wireless networks, especially IP-based wireless networks. This chapter focuses on two major issues: Mobile IP and mobile multicast/anycast applications.
Chapter 14 Distributed Network Systems: Case Studies: In the previous chapters we have discussed various aspects of distributed network systems. Distributed network systems are now used everywhere, especially on the Internet. In this chapter we study several well-known distributed network systems as examples for our discussion.
Chapter 15 Distributed Network Systems: Current Development: This last chapter outlines the most recent developments in distributed network systems. In particular, we present four “hot” topics that have attracted a lot of attention from both academia and industry: cluster computing, grid computing, peer-to-peer computing, and pervasive computing. For each topic, we try to outline its current development, its potential applications and benefits, and its challenges. The purpose of this chapter is to broaden the reader’s knowledge of distributed network systems.
The book is suitable for anyone who needs an informative introduction to distributed systems and applications, together with basic design and programming strategies. It serves as an ideal textbook for a one-semester course for senior undergraduates and postgraduates. Chapters 1-6 serve as the basis for distributed system design and network programming. The book can be used with diverse objectives: (1) For learning distributed operating system design and implementation, Chapters 7, 8, 9, 10, and 14 can serve the purpose. (2) For readers who are interested in the design and implementation of web-based databases and Internet computing, Chapters 7, 8, 12 and 15 can be used. (3) To learn the concepts of fault-tolerant distributed system design, Chapters 8, 9 and 11 will serve the purpose. (4) For understanding group and RPC communication protocols and Mobile IP, Chapters 8, 10 and 13 will help.
We are grateful to the many classes of students at City University of Hong Kong and Deakin University who have given a lot of feedback on our teaching materials; their comments inspired us to write this book. Inspiration also came from Dingzhu Du, Wei Zhao, Qing Li and Andrzej Goscinski.
The following people gave their time to help us formulate the book: in particular, Changgui Chen, who helped to edit the book and contributed partially to Chapter 13. Yang Xiang contributed partially to Chapter 10; Mingjun Lan contributed partially to Chapter 12; and John Casey contributed partially to Section 15.2. Pui-On Au and Yujia Wang helped to format the final version of the book.
We would like to acknowledge some support from research grants we have received, in particular CityU Strategic grant nos. 7001587/7001446 and UGC grant nos. CityU 1055/01E and CityU 1076/00E, the Australian Research Council Small Grant no. 0504-32409-0132-3501 and the Deakin University Research Grant 0504-23434-3101. Although the research grants were not directly used to support the writing of the book, some interesting research results presented in the book are taken from our research papers, which were indeed (partially) supported through these grants. We would also like to express our appreciation to the editors at Kluwer Academic Publishers, especially John Martindale and Angela Quilici, for their excellent professional support.
Finally, we are grateful to our families for their consistent and persistent support. Weijia would like to present the book to XieMei and Sally. Wanlei would like to present the book to Ling, Lingdi and Andi. Without their support, the book might have remained just some unpublished discussions.
Weijia
Wanlei
1-May-04
Biography of Authors
Dr Weijia Jia is an Associate Professor in the Department of Computer Science and the Department of Computer Engineering and Information Tech., City University of Hong Kong. He received his BSc and MSc in Computer Science from Center South University (CSU), Changsha, China in 1982 and 1984, respectively. He joined CSUT as an Assistant Lecturer in 1984. From 1987 to 1988, he worked as a guest researcher at the Department of Computer Science, University of Ottawa, Canada. From 1988 to 1991, he was a Lecturer in the Department of Computer Science, CSU. In 1993, he received his PhD in Computer Science from Faculty Polytechnic of Mons, Belgium and joined the German National Research Center for Information Technology (GMD) in St Augustin as a research fellow. In 1995 he joined the Department of Computer Science, City University of Hong Kong as an assistant professor. His research interests include computer networks and systems, with emphasis on parallel/distributed object group systems, communication protocols, and real-time and Internet communications. He has published extensively in these fields, especially the field of Anycast routing and applications. He is a member of IEEE, IEEE Communication Society and IEEE Computer Society.
Dr Wanlei Zhou is a Chair Professor and Head of the School of Information Technology, Deakin University, Melbourne, Australia. Dr Zhou received the B.Eng and M.Eng degrees from Harbin Institute of Technology, Harbin, China in 1982 and 1984, respectively, and the PhD degree from The Australian National University, Canberra, Australia, in 1991. Before joining Deakin University, Dr Zhou was a Lecturer at Chengdu Institute of Radio Engineering (University of Electronic Science and Technology of China), China, a programmer at Apollo/HP in Massachusetts, U.S.A., a Lecturer at National University of Singapore, Singapore, and a Lecturer at Monash University, Melbourne, Australia. His research interests include distributed computing, computer networks, IT security, performance evaluation, and fault-tolerant computing, and he has published extensively in these research areas. Dr Zhou is a member of the IEEE, the IEEE Computer Society, and the ACM.
Table of Figures
Figure 2.1 The basic client-server model
Figure 2.2 Printing service (a service example)
Figure 2.3 Indirect client-server cooperation
Figure 2.4 Examples of three-tier configurations
Figure 2.5 An example implementation of the three-tier architecture
Figure 2.6 Service discovery broadcast approach
Figure 2.7 Service discovery name server and server location lookup
Figure 2.8 A distributed computing system architecture
Figure 3.1 CORBA CDR message
Figure 3.2 Time diagram of the execution of message-passing primitives
Figure 3.3 Send primitives: (a) blocking; (b) non-blocking
Figure 3.4 Blocked send primitive
Figure 3.5 Unbuffered and buffered message passing
Figure 3.6 Message passing; (a) unreliable; (b) reliable
Figure 3.7 Message-passing semantics (a) at-least-once; (b) exactly-once
Figure 3.8 An RPC example: a read call
Figure 3.9 Group structures
Figure 4.1 The layered protocol model
Figure 4.2 The OSI reference model
Figure 4.3 Comparison of Internet and OSI architectures
Figure 4.4 The Layered TCP/IP protocol suite
Figure 5.1 The distributed application model
Figure 5.2 BSD interprocess sockets
Figure 5.3 File and socket descriptors
Figure 5.4 Socket model
Figure 7.1 DCE architecture
Figure 7.2 Build a DCE Application
Figure 7.3 Using threads in a client-server application
Figure 7.4 The CDS directory hierarchy
Figure 7.5 Components of the DCE directory service
Figure 7.6 Time synchronisation using intervals
Figure 7.7 Time synchronisation within a multi-LAN cell
Figure 7.8 Interactions between DFS components
Figure 7.9 The RMI architecture
Figure 8.1 Causal ordering rule (Group G={S1, S2, S3})
Figure 8.2 Causal ordering and total ordering
Figure 8.3 Reliable multicast system architecture
Figure 8.4 A group comprises of n+1 sites
Figure 8.5 Groups
Figure 8.6 A group of n members form a logical token ring
Figure 8.7 Logical token ring structure and normal operations
Figure 8.8 Message transmission example
Figure 8.9 Dynamic membership
Figure 8.10 System structure
Figure 8.11 RMP hierarchy structure
Figure 8.12 Packet for logical ring
Figure 8.13 Ordered multicast
Figure 8.14 Steps taken for RMP to multicast an ordered message
Figure 9.1 A system
Figure 9.2 Fault, error, and failure
Figure 9.3 The bathtub curve
Figure 9.4 Relationships between MTBF, MTTF, and MTTR
Figure 9.5 Reconfigurable duplication architecture
Figure 9.6 Reliability block diagram of a series system
Figure 9.7 The example reliability block diagram
Figure 9.8 Reliability block diagram of the parallel system
Figure 9.9 Example reliability block diagram
Figure 9.10 Reduced reliability block diagram
Figure 9.11 State diagram of a TMR system
Figure 9.12 Reduced state diagram of a TMR system
Figure 9.13 Reliability block diagram of a simple parallel system
Figure 9.14 Impact of the fault coverage
Figure 9.15 Reliability comparison of TMR and a single module
Figure 9.16 Deadlock
Figure 9.17 False deadlock
Figure 9.18 Distributed deadlock detection
Figure 9.19 The Primary-Backup Replication Scheme
Figure 9.20 Hot replication implementations
Figure 9.21 The active replication scheme
Figure 9.22 The scenario of requests arriving in different orders
Figure 9.23 The Primary-Peer Replication Scheme
Figure 10.1 Single (private) key encryption
Figure 10.2 Key distribution server
Figure 10.3 Public key encryption
Figure 10.4 Packet filter in a router
Figure 10.5 Internet firewall
Figure 10.6 A hierarchical model of a DDoS attack
Figure 10.7 Key methods used before making an effective DDoS attack
Figure 10.8 Active defense cycle
Figure 11.1 The generic reactive system architecture
Figure 11.2 A DMM agent
Figure 11.3 Sensors and actuators
Figure 11.4 Tunnelling multicast packets between subnets
Figure 11.5 The generic sensor architecture
Figure 11.6 A distributed replication system
Figure 11.7 Replication manager and database server
Figure 11.8 Using polling sensors
Figure 11.9 Using event sensors
Figure 11.10 Using embedded DMMs
Figure 11.11 Using polling sensors for network partitioning
Figure 11.12 Using event sensors for partition-tolerant applications
Figure 12.1 Two-tier architecture of WBDB
Figure 12.2 Three-tier architecture of WBDB
Figure 12.3 Hybrid architecture of WBDB (agent-based)
Figure 12.4 Generation 1 framework (CGI-based) of WBDB
Figure 12.5 Generation 2 framework (Client and Server-side JavaScript) of WBDB
Figure 12.6 Generation 3 (JDBC-based) framework of WBDB
Figure 12.7 Servlet-based framework of WBDB
Figure 12.8 XML-based two-tier framework of WBDB
Figure 12.9 Mobile agent involved framework of WBDB
Figure 12.10 Intelligent interactive framework of WBDB
Figure 13.1 Example of Mobile applications
Figure 13.2 IP Tunneling
Figure 13.3 Operation of Mobile Node under Mobile IP
Figure 13.4 Operation of Mobile IP on care-of address
Figure 13.5 Operation of Mobile IP on collocated care-of address
Figure 13.6 ICMP Router Advertisement and Mobility Agent Advertisement Extension Message
Figure 13.7 ICMP Router Solicitation Message
Figure 13.8 The mobility agents (either home or foreign) multicast Agent Advertisement
Figure 13.9 The message format of Registration Request and Mobile-Foreign Authentication Extension
Figure 13.10 The message format of Registration Reply
Figure 13.11 Bi-directional Tunneled Multicast Method
Figure 13.12 MMP topology and Mobile Connections
Figure 13.13 Network Topology of the Simulation
Figure 13.14 Message delivery delays
Figure 13.15 Number of delivered messages
Figure 14.1 A distributed file system structure
Figure 14.2 NFS structure
Figure 14.3 OMA Reference Architecture
Figure 14.4 CORBA client and server
Figure 14.5 CORBA architecture
Figure 14.6 Interface inheritance and implementation inheritance
Figure 14.7 Distributed COM is built on top of DCE RPC
Figure 14.8 DCOM security
Figure 14.9 DCOM binary specification
CHAPTER 1 OVERVIEW OF DISTRIBUTED NETWORK SYSTEMS
In this chapter we outline the basic concepts of distributed systems and computer networks, such as their purposes, characteristics, advantages, and limitations, as well as their basic architectures and their applications.
1.1 Distributed Systems
A distributed system is a system consisting of a collection of autonomous machines connected by communication networks and equipped with software systems designed to produce an integrated and consistent computing environment. Distributed systems enable people to cooperate and coordinate their activities more effectively and efficiently. The key purposes of distributed systems can be represented by: resource sharing, openness, concurrency, scalability, fault-tolerance and transparency [Coulouris et al 1994].
Resource sharing: In a distributed system, the resources (hardware, software and data) can be easily shared among users. For example, a printer can be shared among a group of users.
Openness: The openness of distributed systems is achieved by specifying the key software interface of the system and making it available to software developers so that the system can be extended in many ways.
Concurrency: Processing concurrency can be achieved by sending requests to multiple machines connected by networks at the same time.
Scalability: A distributed system running on a small number of machines can be easily extended to a large number of machines to increase the processing power.
Fault-tolerance: Machines connected by networks can be seen as redundant resources. A software system can be installed on multiple machines so that, in the face of hardware faults or software failures, the faults or failures can be detected and tolerated by other machines.
Transparency: Distributed systems can provide many forms of transparency, such as:
1) Location transparency, which allows local and remote information to be accessed in a unified way;
2) Failure transparency, which enables the masking of failures automatically; and
3) Replication transparency, which allows duplicating software/data on multiple machines invisibly.
Computing in the late 1990s reached the state of Web-based distributed computing. A basis of this form of computing is distributed computing, which is carried out on distributed computing systems. These systems comprise the following three fundamental components:
personal computers and powerful server computers,
local and fast wide area networks, internet, and
systems, in particular distributed operating systems, and application software
In this book we are interested in the last two components of distributed computing systems: networks, and system and application software.
With the flourishing of the Internet and the current quick development of e-commerce, it is very important in designing distributed systems to consider not only traditional applications but also the requirements of distributed computing based on the Internet.
1.2 Computer Networks
1.2.1 Network History
The following table ([Stallings 1998]) shows a brief networking history
1.2.2 Network Architecture
The early success of the ARPANET (sponsored by the Advanced Research Projects Agency (ARPA) and developed during the late 1960s and early 1970s) and other networks, and the immediate commercial potential of packet switching, satellite, and local network technology made it apparent that computer networking was quickly becoming an important area of innovation and commerce. It was also apparent that to utilize the full potential of such computer networks, international standards would be required to ensure that any system could communicate with any other system anywhere in the world.
In 1978, a new subcommittee (SC16) was created by the International Organization for Standardization (ISO) Technical Committee 97 on Information Processing to develop standards for “open system interconnection (OSI)”. The term “open” was chosen to emphasize that by conforming to OSI standards, a system would be open to communication with any other system anywhere in the world obeying the same standards.
The OSI reference model is a seven-layer model for inter-process communication. Its architecture is comprised of application, presentation, session, transport, network, data link and physical layers, and the corresponding protocols, as depicted in Table 1.2. The detailed descriptions of these layers are given in Chapter 4.
The early-developed ARPANET adopted another type of network architecture, a four-layer architecture: application, transport, Internet, and network interface, as depicted in Table 1.3. The current Internet, based on ARPANET, uses this architecture, which is also known as the TCP/IP reference model. In this model, the network interface (or access) layer relies on the data link and physical layers of the network, and the application layer corresponds to the application and presentation layers of the OSI model, since there is no session layer in the TCP/IP model. The details of these layers are also given in Chapter 4.
1.2.3 Network Fault Tolerance
Network reliability refers to the ability of the overall network to provide communication in the event of failure of a component or components in the network. The term network fault tolerance refers to how resilient the network is against the failure of a component.
Why fault tolerance in a networked world? A key indicator of today’s global business systems is their reliability and uptime [Grimshaw et al 1999]. This concern is crucial for e-commerce sites and mission-critical business applications. Expensive and powerful servers and system components that are designed as stand-alone systems can be very reliable, but even an hour of downtime per month can be deadly to online-only businesses.
For example, server clusters are increasingly used in business and academia to combat the problems of reliability since they are relatively inexpensive and easy to build [Buyya 1999] [TBR 1998]. By having multiple network servers working together in a cluster and using redundant components such as more than one power supply and RAID hard drive subsystems, the overall system uptime in theory can approach 100 percent. However, server clusters are only a part of a chain that links business applications together. For example, to access an HTML page of a business web site, a user issues a request that travels from the user’s client machine, through a number of routers and firewalls and other network devices, to reach the web site. The web site then processes the request and returns the requested HTML page via the same or another chain of routers, firewalls and network devices. The strength of this chain, in terms of reliability and performance, will determine the success or failure of the business, but a chain is only as strong as its weakest link, and the longer the chain, the weaker it is in general. Intuitively, the following two ways can be used to make such a chain stronger: one is the use of redundancy (replication) and concurrency (parallelism) techniques, and the other is to increase the reliability of the weakest link of the chain.
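The weakest-link argument can be made concrete with the standard reliability arithmetic for independent components: a chain (series) works only if every link works, while a replicated (parallel) link fails only if all of its replicas fail. The following is a minimal sketch under that independence assumption. The function names and the reliability figures are illustrative assumptions, not values taken from the text.

```python
from math import prod

def series_reliability(links):
    """Reliability of a chain: every link must work (independence assumed)."""
    return prod(links)

def parallel_reliability(replicas):
    """Reliability of a replicated link: it fails only if all replicas fail."""
    return 1.0 - prod(1.0 - r for r in replicas)

# Hypothetical chain: client link, router, firewall, web server.
chain = [0.99, 0.95, 0.99, 0.999]
baseline = series_reliability(chain)

# Strengthen the weakest link (0.95) by duplicating it.
duplicated = parallel_reliability([0.95, 0.95])
improved = series_reliability([0.99, duplicated, 0.99, 0.999])

print(f"baseline chain:  {baseline:.4f}")
print(f"with redundancy: {improved:.4f}")
```

Because a series product of values below 1 can only shrink, the baseline chain is less reliable than its weakest link, which is the sense in which a longer chain is generally weaker.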
A networked world faces a number of challenges in fault tolerance. In particular, Internet-connected resources have the following characteristics:
Unreliable communications;
Unreliable resources (computers, storage, software, etc.);
Highly heterogeneous environment;
Potentially very large amount of resources: scalability;
Potentially highly variable number of resources
Communication network reliability depends on the sustainability of both hardware and software. Depending on the failure scenario, network failures can last from a few seconds to days. Traditionally, such failures were primarily from hardware malfunctions that result in downtime (or “outage period”) of a network element (a node or a link). Thus, the emphasis was on element-level network availability and, in turn, the determination of overall network availability. However, other types of major outages have received much attention in recent years. Such incidents include accidental fiber cable cuts, natural disasters, and malicious attacks (both hardware and software). These major failures need more than what is traditionally addressed through network availability.
These types of failures cannot be addressed by congestion control schemes alone because of their drastic impact on the network. Such failures can, for example, drop a significant number of existing network connections; thus, the network is required to have the ability to detect a fault and isolate it, and then either the network must reconnect the affected connections or the user may try to reconnect (if the network does not have reconnect capability). At the same time, the network may not have enough capacity and capability to handle such a major simultaneous “reconnect” phase. Likewise, because of a software and/or protocol error, the network may appear very congested to the user. Thus, network reliability nowadays encompasses more than what was traditionally addressed through network availability.
Basic techniques used in dealing with network failures include: retry (retransmission), complemented retry with correction, replication (e.g., dual bus), coding, special protocols (single handshake, double handshake, etc.), timing checks, rerouting, and retransmission with shift (intelligent retry).
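The simplest of these techniques, retry (retransmission), can be sketched as a loop that resends a message until it is acknowledged or a limit is reached. This is an illustration only: the `send` callable, the simulated flaky sender, and the doubling delay between attempts are assumptions made for the example, not a mechanism described in the text.

```python
import time

def send_with_retry(send, message, max_attempts=4, base_delay=0.01):
    """Call send(message) until it succeeds or the attempts run out.

    `send` is a hypothetical callable that raises ConnectionError on failure.
    The delay doubles after each failed attempt (an assumed backoff policy).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send(message)
        except ConnectionError:
            if attempt == max_attempts:
                raise                      # give up: report the failure
            time.sleep(base_delay * 2 ** (attempt - 1))

def make_flaky_sender(failures):
    """Build a simulated sender that fails `failures` times, then succeeds."""
    state = {"remaining": failures}

    def send(message):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionError("simulated link failure")
        return f"ack:{message}"

    return send

result = send_with_retry(make_flaky_sender(2), "hello")
print(result)
```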
1.3 Protocols and QoS
Network software is arranged in a hierarchy of layers. Each layer presents an interface to the layers above it that extends the properties of the underlying communication system. One layer on one machine carries on a conversation with the same layer on another machine. The rules and conventions used in this conversation are collectively known as the protocol of this layer. Generally speaking, a protocol is an agreement between the communicating parties on how communication is to proceed. The definition of a protocol has two important parts:
A specification of the sequence of messages that must be exchanged;
A specification of the format of the data in the messages.
A protocol is implemented by a pair of software modules located in the sending and receiving computers. Each network layer has one or more protocols corresponding to it so that it can provide a service to the layer above it and extend the service provided by the layer below it. Hence these protocols are arranged in a hierarchy of layers as well. For example, in the OSI model, there are seven protocol layers, one for each network layer. A complete set of protocol layers is referred to as a protocol suite or a protocol stack, reflecting the layered structure.
Protocol layering brings substantial benefits in simplifying and generalizing the software interfaces for access to the communication services of networks, but it also carries significant performance costs. The transmission of an application-level message via a protocol stack with N layers typically involves N transfers of control to the relevant layer of software in the protocol suite, at least one of which is an operating system entry, and taking N copies of the data as a part of the encapsulation mechanisms. All of these overheads result in data transfer rates between application processes that are much lower than the available network bandwidth.
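The encapsulation cost described above can be illustrated with a toy stack in which each layer prepends its own header on the way down and strips it on the way up. This is a sketch only: the four layer names and the bracketed text headers are assumptions chosen for readability and do not correspond to any real protocol format.

```python
# Hypothetical four-layer stack, listed from top to bottom.
LAYERS = ["application", "transport", "network", "link"]

def send_down(payload: bytes) -> bytes:
    """Encapsulate: each layer prepends its header, top layer first."""
    data = payload
    for layer in LAYERS:
        header = f"[{layer}]".encode()
        data = header + data          # builds a new bytes object: one copy per layer
    return data

def receive_up(frame: bytes) -> bytes:
    """Decapsulate: each layer strips the header added by its peer layer."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not data.startswith(header):
            raise ValueError(f"missing {layer} header")
        data = data[len(header):]
    return data

frame = send_down(b"hello")
overhead = len(frame) - len(b"hello")
print(frame)
print(receive_up(frame), "header bytes:", overhead)
```

With N layers the message is passed through N functions and copied N times, which mirrors the N transfers of control and N copies mentioned in the text.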
Quality of service facilities in some technologies, such as Asynchronous Transfer Mode (ATM), can be quite detailed, providing users with explicit guarantees of average delay, delay variation and data loss. In ATM terminology, QoS is the performance observed by an end user. The principal QoS parameters are delay, delay variation, and loss. But QoS does not necessarily guarantee particular performance. Performance guarantees can be quite difficult and expensive to provide in packet-switched networks, and most applications and users can be satisfied with less stringent promises, such as prioritization only, without delay guarantees.
QoS also describes how traffic is to be classified. Some QoS implementations provide per-flow classification, in which each individual flow is categorized and handled separately. This can be expensive if there are a lot of flows to be managed concurrently.
Quality of Service (QoS) is a somewhat vague term referring to the technologies that classify network traffic and then ensure that some of that traffic receives special handling. The special handling may include attempts to provide improved error rates, lower network transit time (latency), and decreased latency variation (jitter). It may also include promises of high availability, which is a combination of mean (average) time between failures (MTBF) and mean time to repair (MTTR).
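One common way to combine MTBF and MTTR into an availability figure is the ratio MTBF / (MTBF + MTTR). The sketch below uses that formula under the assumption that MTBF denotes the mean operating time between a repair and the next failure; other texts define MTBF to include the repair time, in which case the formula changes. The function names and the numbers are illustrative.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability, assuming MTBF excludes the repair time."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def expected_downtime(avail: float, period_hours: float = 30 * 24) -> float:
    """Expected downtime in hours over a period (default: a 30-day month)."""
    return (1.0 - avail) * period_hours

# Hypothetical server: runs 719 hours on average, then takes 1 hour to repair.
a = availability(mtbf_hours=719, mttr_hours=1)
print(f"availability: {a:.5f}")
print(f"downtime per 30-day month: {expected_downtime(a):.2f} hours")
```

Under these assumed figures the result corresponds to roughly one hour of downtime per month, the level that Section 1.2.3 notes can already be damaging for online-only businesses.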
1.4 Software for Distributed Computing
1.4.1 Traditional Client-Server Model
The client-server model has been a dominant model for distributed computing since the 1980s. The development of this model was sparked by research and the development of operating systems that could support users working at their personal computers connected by a local area network. The issue was how to access, use and share a resource located on another computer, e.g., a file or printer, in a transparent manner. In the 1980s computers were controlled by monolithic kernel based operating systems, where all services, including file and naming services, were part of that huge piece of software. In order to access a remote service, the whole operating system had to be located and accessed within the computer providing the service. This implied a need to separate from a kernel based operating system that part of the software which only provides a desired service and embody it in a new software entity. This entity is called a server. Thus, each user process, a client, can access and use that server, subject to possessing access rights and a compatible interface. Therefore, the idea of the client-server model is to build at least that part of an operating system which is responsible for providing services to users as a set of cooperating server processes.
In the 1990s the client-server model was used extensively to develop a huge number and variety of applications. The reasons are simple. It is a very clean model that adheres well to the software modularity and usability requirements. This model allows the programmer to develop, test and install applications very quickly. It simplifies maintenance and lowers its costs. It also allows the correctness of code to be proved.
The client-server model has also influenced the building of new operating systems, in particular distributed operating systems [Goscinski and Zhou 1999]. A distributed operating system supports transparency. Thus, when users access their personal computers they have the feeling of being supported by a very powerful computer, which provides a variety of services. This means that all computers and their connection to a communication network are hidden.
However, the simple client-server model is not good enough to describe the various components of a Web-based client-server computing system and their activities. The Internet, and in particular Web browsers and further developments in Java programming, have expanded client-server computing and systems. This is manifested by different forms of cooperation between remote computers and software components.
1.4.2 Web-Based Distributed Computing Models
The Internet and WWW have influenced distributed computing by the global coverage of the network, Web servers distribution and availability, and architecture of executing programs. To meet the requirements of quick development of the Internet, distributed computing may need to shift its environment from LAN to the Internet.
At the execution level, distributed computing/applications may rely on the following parts:
Processes: A typical computer operating system on a computer host can run several processes at once. A process is created by describing a sequence of steps in a programming language, compiling the program into an executable form, and running the executable program in the operating system. While it is running, a process has access to the resources of the computer (CPU time, I/O devices and communication ports) through the operating system. A process can be completely devoted to an application, or several applications can use a single process to perform tasks.
Threads: Every process has at least one thread of control. Some operating systems support the creation of multiple threads of control within a single process. Each thread in a process can run independently from the other threads. The threads may share some memory, such as the heap, so they usually need synchronization. For example, one thread may monitor input from a socket connection, while another processes users’ requests or accesses the database.
Distributed Objects: Distributed Object technologies are best typified by the Object Management Group’s Common Object Request Broker Architecture (CORBA) [OMG 1998], and Microsoft’s Distributed Component Object Model (DCOM) [Microsoft 1998]. In these approaches, interaction control among components lies solely with the requesting object, which makes an explicit method call using a predefined interface specification to initiate service access. The service is provided by a remote object through a registry that finds the object, and then mediates the request and its response. Although Distributed Object models offer a powerful paradigm for creating networked applications composed of objects potentially written in different programming languages, hard-coded communication interactions make it difficult to reuse an object in a new application without bringing along all services on which it is dependent, and reworking the system to incorporate new services that were not initially foreseen is a complex task.
Agents: It is difficult to define this overused term, i.e., to differentiate it from a process or an (active) object and to say how it differs from a program. An agent has been loosely defined as a program that assists users and acts on their behalf. This is called the end-user perspective of software agents. In contrast to the software objects of object-oriented programming, from the perspective of end users, agents are active entities that obey the following mandatory behavior rules:
R1: Work to meet designer’s specifications;
R2: Autonomous: has control over its own actions provided this does not violate R1.
R3: Reactive: senses changes in requirements and environment, being able to act according to those changes provided this does not violate R1.
An agent may possess any of the following orthogonal properties from the perspective of systems:
Communication: able to communicate with other agents.
Mobility: can travel from one host to another.
Reliability: able to tolerate a fault when one occurs.
Security: appears trustworthy to the end user.
Our definition differs from others in that agents must execute according to design specifications.
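Returning to the Threads item above, the example of one thread monitoring input while another processes requests can be sketched with two threads that hand work over through a synchronized queue. This is a minimal illustration: a list of strings stands in for the socket connection, and the names `monitor_input`, `process_requests` and `STOP` are assumptions made for the example.

```python
import queue
import threading

requests = queue.Queue()      # thread-safe hand-off between the two threads
results = []                  # written only by the worker thread
STOP = object()               # sentinel telling the worker to finish

def monitor_input(lines):
    """Stands in for a thread reading requests from a socket connection."""
    for line in lines:
        requests.put(line)
    requests.put(STOP)

def process_requests():
    """Processes requests in arrival order until the sentinel is seen."""
    while True:
        item = requests.get()
        if item is STOP:
            break
        results.append(item.upper())

reader = threading.Thread(target=monitor_input, args=(["get a", "get b"],))
worker = threading.Thread(target=process_requests)
reader.start()
worker.start()
reader.join()
worker.join()
print(results)
```

The queue provides the synchronization the text calls for: the two threads never touch shared data without going through it.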
According to the client-server model there are two processes, a client, which requests a service from another process, and a server, which is the service provider. The server performs the requested service and sends back a response. This response could be a processing result, a confirmation of completion of the requested operation or even a notice about a failure of an operation. Following this, the current image of Web-based distributed computing can be called Web-based client-server computing.
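The request and response exchange just described can be sketched with a client and a server talking over a local TCP socket. This is an illustrative sketch only: the service (summing the integers in the request), the text message format and the helper names are assumptions, and the server handles a single request before exiting.

```python
import socket
import threading

def serve_once(listener):
    """Server: accept one client, perform the service, send back a response."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024).decode()
        try:
            response = f"OK {sum(int(x) for x in request.split())}"
        except ValueError:
            response = "ERROR bad request"   # notice about a failed operation
        conn.sendall(response.encode())

def request_sum(port, text):
    """Client: send a request and wait for the server's response."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(text.encode())
        return sock.recv(1024).decode()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

server = threading.Thread(target=serve_once, args=(listener,))
server.start()
reply = request_sum(port, "1 2 3")
server.join()
listener.close()
print(reply)
```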
1.4.3 Web-based Client-Server Computing
We have categorized the Web-based client-server computing systems into four types: the proxy computing, code shipping, remote computing and agent-based computing models. The proxy computing (PC) model is typically used in Web-based scientific computing. According to this model a client sends data and programs to a server over the Web and requests the server to perform certain computing. The server receives the request, performs the computing using the programs and data supplied by the client and returns the result back to the client. Typically, the server is a powerful high-performance computer or it has some special system programs (such as special mathematical and engineering libraries) that are necessary for the computation. The client is mainly used for interfacing with users.
The code shipping (CS) model is a popular Web-based client-server computing model. A typical example is the downloading and execution of Java applets on Web browsers, such as Netscape Communicator and Internet Explorer. According to this model, a client makes a request to a server, the server then ships the program (e.g., the Java applets) over the Web to the client and the client executes the program (possibly) using some local data. The server acts as the repository of programs and the client performs the computation and interfaces with users.
The remote computing (RC) model is typically used in Web-based scientific computing and database applications [Sandewall 1996]. According to this model, the client sends data over the Web to the server and the server performs the computing using programs residing in the server. After the completion of the computation, the server sends the result back to the client. Typically the server is a high-performance computing server equipped with the necessary computing programs and/or databases. The client is responsible for interfacing with users. The NetSolve system [Casanova and Dongarra 1997] uses this model.
The agent-based computing (AC) model is a three-tier model. According to this model, the client sends either data or data and programs over the Web to the agent. The agent then processes the data using its own programs or using the received programs. After the completion of the processing, the agent will either send the result back to the client if the result is complete, or send the data, program or intermediate result to the server for further processing. In the latter case, the server will perform the job and return the result back to the client directly (or via the agent). Nowadays, more and more Web-based applications have shifted to the AC model [Chang and Scott 1996] [Ciancarini et al. 1996].
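The four models differ mainly in where the program lives and where it runs. The sketch below simulates this within a single process, with an ordinary Python function standing in for a "program" sent over the Web. All class and method names are hypothetical, and the agent is deliberately simplified: it runs a received program itself, and otherwise forwards the data to the server.

```python
def square_sum(data):
    """A 'program' that can be shipped between client and server."""
    return sum(x * x for x in data)

class Server:
    programs = {"square_sum": square_sum}   # programs residing on the server

    def proxy_compute(self, program, data):
        # PC: the client supplies both program and data; the server runs them.
        return program(data)

    def ship_code(self, name):
        # CS: the server only hands out the program; the client runs it locally.
        return self.programs[name]

    def remote_compute(self, name, data):
        # RC: the client supplies data only; the server uses its own program.
        return self.programs[name](data)

class Agent:
    """AC: a middle tier that finishes the job or forwards it to the server."""

    def __init__(self, server):
        self.server = server

    def handle(self, data, program=None):
        if program is not None:
            return program(data)            # result is complete at the agent
        return self.server.remote_compute("square_sum", data)

server = Server()
data = [1, 2, 3]
pc = server.proxy_compute(square_sum, data)
cs = server.ship_code("square_sum")(data)   # executed on the client side
rc = server.remote_compute("square_sum", data)
ac = Agent(server).handle(data)
print(pc, cs, rc, ac)
```

All four calls compute the same value; what changes is which party supplies the program and which party executes it.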
1.5 The Agent-Based Computing Models
The basic agent-based computing model has many extensions and variations. However, there are two areas of distinction among these models, which highlight their adaptability and extensibility: one is whether the interactions among components are preconfigured (hard-wired) and the other is where the control for using components or services lies (e.g., requester/client, provider/server, mediator etc.).
Conversational Agent Model
Conversational agent technologies model communication and cooperation among autonomous entities through message exchange based on speech act theory. The best-known foundation technology for developing such systems is the Knowledge Query and Manipulation Language (KQML) [http://www.cs.umbc.edu/kqml/], which is often used in conjunction with the Knowledge Interchange Format (KIF) [http://www.cs.umbc.edu/kqml/]. In these systems, service access control also lies with a client, which requests a service from a service broker or name server, and then initiates peer-to-peer communication with the provider at an address provided by the broker. Although language-enriched interchanges occur, conversational agents suffer from the same restriction as distributed objects in that the interactions among components are hard-coded in the requester, thus making services inflexible and difficult to reuse and extend.
Sun Jini
Sun Microsystems’ Jini [Sun 1999] extends the Java runtime environment from a single virtual machine to a network of virtual machines. In Jini, control for resource access lies with the client, which requests a service based on type and attributes from a lookup service that holds a collection of service objects (Java objects and methods) and attributes posted by providers. Clients filter responses from the lookup service, download the service object for the selected service, and invoke remote methods within the provider to obtain the service. Although the capability of downloading the interface between service requester and provider permits a dynamic and extensible assembly of resources, Jini’s model still places the burden and responsibility for selecting, acquiring, and managing access with the client.
Blackboard, Publish and Subscribe Approaches
Blackboard approaches such as FliPSiDE [Schwartz 1995] or LINDA [Schoenfeldinger 1995] allow multiple processes to communicate by reading and writing requests and information to a global data store. Requesters post requests on the Blackboard and poll for available results; providers poll to obtain service requests, and use the Blackboard to post results. The Blackboard enables team problem-solving approaches as it can be used for posting problem subcomponents and partial results.
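The post-and-poll cycle described above can be sketched with a small in-memory store. This is an illustration of the general idea only and does not reproduce the interface of FliPSiDE or LINDA: the `Blackboard` class, its method names and the string-reversal "service" are all assumptions.

```python
class Blackboard:
    """Global data store shared by requesters and providers."""

    def __init__(self):
        self.requests = {}   # request id -> payload, waiting for a provider
        self.results = {}    # request id -> result, waiting for the requester
        self._next_id = 0

    def post_request(self, payload):
        self._next_id += 1
        self.requests[self._next_id] = payload
        return self._next_id

    def take_request(self):
        """Provider poll: returns (id, payload), or None if nothing is posted."""
        if not self.requests:
            return None
        req_id = next(iter(self.requests))   # oldest posted request
        return req_id, self.requests.pop(req_id)

    def post_result(self, req_id, result):
        self.results[req_id] = result

    def poll_result(self, req_id):
        """Requester poll: returns the result, or None if it is not ready."""
        return self.results.pop(req_id, None)

board = Blackboard()
rid = board.post_request("reverse:abc")
first_poll = board.poll_result(rid)        # too early: no provider has run yet

job = board.take_request()                 # provider polls and picks up work
if job is not None:
    job_id, payload = job
    board.post_result(job_id, payload.split(":", 1)[1][::-1])

answer = board.poll_result(rid)
print(first_poll, answer)
```

Neither side calls the other directly; requester and provider interact only through the shared store, which is what makes the interaction loosely coupled.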
Publish and subscribe approaches such as Talarian [http://www.messageq.com/communications_middleware/talarian_2.html] and Active Web [http://www.pinnaclepublishing.com/AW/AWmag.nsf/home!openform] use a centralized broker as a clearinghouse for requests and information. Clients issue a request to the broker that broadcasts it to available providers; their responses are reflected through the broker to the client. This approach is well-suited to time-critical problems, as its broadcast model facilitates quick responses.
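The broker's role as a clearinghouse can be sketched as follows. This is a generic in-process illustration, not the API of Talarian or Active Web: the `Broker` class and the three provider functions are hypothetical, and the broadcast is a simple loop rather than a network operation.

```python
class Broker:
    """Centralized clearinghouse: broadcasts requests and relays responses."""

    def __init__(self):
        self.providers = []

    def subscribe(self, provider):
        self.providers.append(provider)

    def request(self, message):
        responses = []
        for provider in self.providers:      # broadcast to every provider
            reply = provider(message)
            if reply is not None:            # a provider may decline to answer
                responses.append(reply)
        return responses                     # reflected back to the client

def upper_provider(message):
    return message.upper()

def length_provider(message):
    return len(message)

def picky_provider(message):
    # Only answers very short messages; otherwise declines.
    return "short" if len(message) < 3 else None

broker = Broker()
for provider in (upper_provider, length_provider, picky_provider):
    broker.subscribe(provider)

replies = broker.request("hello")
print(replies)
```

The client never selects a provider: it sees only whatever responses the broker relays, which reflects the lack of programmatic control over who does the work.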
Common to these approaches is their ability to enable dynamic and flexible composition of distributed components because the interaction among components is not predefined at code time or tightly bound at runtime. But with this flexibility comes a potential inherent disadvantage, because neither approach provides programmatic control for guiding the operation, and at times this control is needed or desired (e.g., to task a provider that best meets known requirements).
OAA’s Delegated Computing Model
The Open Agent Architecture (OAA) [http://www.ai.sri.com/~oaa/], a framework for building flexible, dynamic communities of distributed software agents, enables a truly cooperative computing style wherein members of an agent community work together to perform computation, retrieve information, and serve user interaction tasks. OAA’s approach to distributed computing shares common characteristics with current distributed computing models, but is distinct in very important ways.
OAA is similar to the above distributed computing models in that it encourages creation of networked applications like Distributed Objects, permits rich and complex interactions like Conversational Agents, and enables building dynamic, flexible, and extensible communities of components like Jini, Blackboard, and Publish and Subscribe.
A key distinguishing feature of OAA is its delegated computing model that enables both human users and software agents to express their requests in terms of what is to be done without requiring specification of who is to do the work or how it should be performed, for example, “When a message for me arrives about security, notify me immediately.” A requester delegates control for meeting a goal to the Facilitator, a specialized server agent within OAA that coordinates the activities of agents for the purpose of achieving higher-level, often complex problem-solving objectives. The Facilitator meets these objectives by making use of knowledge distributed in four locations in OAA:
The requester, which specifies a goal to the Facilitator and provides advice on how it should be met,
Providers, who register their capabilities with the Facilitator, know what services they can provide, and understand limits on their ability to do so,
The Facilitator, which maintains a list of available provider agents and a set of general strategies for meeting goals