Lecture Operating systems: A concept-based approach (2/e): Chapter 15 - Dhananjay M. Dhamdhere

Chapter 15 - Distributed operating systems. This chapter discusses important features of these components and the manner in which these features influence the computation speedup, reliability, and performance that can be achieved in a distributed system.

– Thus, each node performs some OS functions

* Data used by an OS function may be spread across many computers

* Non-local data is accessed through the network

Distributed Systems

• Distributed systems consist of four components

– Individual computer systems

– Network connecting the computer systems

– Distributed computations

– Distributed operating system

We discuss the basics of these four components in this chapter

Benefits of distributed systems

• Distributed systems provide five key benefits

* Cost of enhancing a capability is proportional to the additional capability desired

▪ Made possible by open system standards

Nodes in a distributed system

• Nodes can be of different types

– Workstation

* Has a single CPU and single user

– Minicomputer

* Has a single CPU but many users

▪ It is also called a process pool node

– Cluster

* A group of nodes that work together in an integrated manner

Operating systems for a distributed system

• The OS must provide

– Resource sharing across boundaries of systems

– Computation speed-up of applications

– Reliability

– Good performance of the distributed system

• Two kinds of operating systems

– Network operating systems

* Only provide resource sharing

– Distributed operating systems

* Integrate functioning of individual computers

A network operating system

• The network OS layer exists between a process and the kernel of the OS

• It recognizes requests for access to remote resources and implements them

• It passes other requests to the kernel

Distributed Operating Systems

• Differences with a conventional OS

– A distributed OS integrates functioning of individual computers and scatters processes of an application to various nodes

* Achieves computation speed-up and resource efficiency

* Helps in providing reliability

– Examples

* Windows cluster

▪ Node manager detects faults, failover manager provides reliability

* Sun Cluster software

▪ Global process management and a distributed file system enable process migration when a failure occurs

* Amoeba distributed OS

Reliable interprocess communication

• Communication between processes takes place through the network. It raises the following issues

– Naming of processes

* Processes should be able to find each other’s network addresses

▪ Address of a process is a pair (<host_name>, <process_id>)

▪ The domain name service (DNS) is a distributed service for obtaining the IP address of each computer

* A name server is a generic name for this arrangement

– Reliability of communication

* Interprocess messages may be lost due to congestion in the network
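
The naming arrangement above can be illustrated with a toy name server. This is only a sketch under stated assumptions: the class `NameServer`, its `register` and `lookup` methods, and the sample host name and process id are hypothetical, and a real name service such as DNS is distributed rather than a single in-memory table.

```python
class NameServer:
    """Toy name server: maps a process name to (host_name, process_id)."""

    def __init__(self):
        self._table = {}

    def register(self, name, host_name, process_id):
        # Record where the named process can be reached.
        self._table[name] = (host_name, process_id)

    def lookup(self, name):
        # Returns None when the name is unknown.
        return self._table.get(name)


ns = NameServer()
ns.register("print_service", "hostA.example", 4211)   # made-up values
address = ns.lookup("print_service")
print(address)
```

A sender would perform the lookup once and then address its messages to the returned pair.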

Interprocess Communication (IPC) Protocols

• Processes in a distributed application communicate through messages sent using an interprocess communication (IPC) protocol

– IPC protocol is made reliable through three features

IPC Protocols

• An IPC protocol specifies actions to be performed in the sender and destination processes of a message

– A reliable protocol guarantees delivery of a message

* It has at-least-once or exactly-once properties

– A blocking protocol blocks the sender of a message until the message is delivered

* This action simplifies the protocol and reduces its memory requirements

Blocking version of Request-reply-acknowledgment (RRA) protocol

• Sender site copies the message in a buffer, sends it and blocks the sender

• Receiver saves the reply in a buffer and sends it; resends it if a duplicate request is received

• Sender sends an acknowledgment (ack) of the reply

• Timeouts and retransmissions occur in both sender and receiver
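
The destination-site behaviour listed above can be sketched in a few lines. This is a minimal in-memory illustration, not the textbook's algorithm: the class name `RRADestination`, the use of a sequence number to recognize duplicates, and the sample handler are all assumptions made for the example, and no real network or timer is involved.

```python
class RRADestination:
    """Destination site of a request-reply-acknowledgment exchange (illustrative)."""

    def __init__(self, handler):
        self.handler = handler      # computes a reply for a request payload
        self.reply_buffer = {}      # sender -> (seq, saved reply): one buffer per sender
        self.executions = 0         # how many times the handler actually ran

    def on_request(self, sender, seq, payload):
        saved = self.reply_buffer.get(sender)
        if saved is not None and saved[0] == seq:
            # Duplicate request (the sender timed out and retransmitted):
            # resend the saved reply instead of recomputing it.
            return saved[1]
        # A new request overwrites any older saved reply for this sender.
        self.executions += 1
        reply = self.handler(payload)
        self.reply_buffer[sender] = (seq, reply)
        return reply

    def on_ack(self, sender, seq):
        saved = self.reply_buffer.get(sender)
        if saved is not None and saved[0] == seq:
            del self.reply_buffer[sender]   # ack received: release the buffer


dest = RRADestination(handler=lambda text: text.upper())
first = dest.on_request("P1", 1, "hello")
retry = dest.on_request("P1", 1, "hello")   # retransmission after a timeout
dest.on_ack("P1", 1)
print(first, retry, dest.executions, dest.reply_buffer)
```

The retransmitted request receives the saved reply, so the handler runs only once, and the buffer is freed as soon as the acknowledgment arrives.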

Non-blocking version of Request-reply (RR) protocol

• Sender site buffers the request and sends it; the sender is not blocked

• Receiver computes a reply if the request is not a duplicate, buffers the reply, and sends it to the sender

• Timeout and retransmission can occur in the sender

• Sender is interrupted when a reply is received to its message
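
The timeout-and-retransmission logic at the sender can be sketched as follows. The function names `send_with_retransmission` and `lossy_channel` are hypothetical, a lost message is modelled by the channel returning `None`, and the loop runs synchronously for simplicity; a real non-blocking sender would drive the same logic from a timer and a reply interrupt instead of waiting in a loop.

```python
def send_with_retransmission(request, channel, max_attempts=5):
    """Resend `request` until the channel yields a reply or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        reply = channel(request)      # None models a timeout (lost message)
        if reply is not None:
            return reply, attempt
    raise TimeoutError("no reply after retransmissions")


def lossy_channel(drop_first_n, server):
    """Channel that loses the first `drop_first_n` transmissions."""
    state = {"sent": 0}

    def channel(request):
        state["sent"] += 1
        if state["sent"] <= drop_first_n:
            return None               # message lost in the network
        return server(request)

    return channel


channel = lossy_channel(2, lambda r: r + "-reply")
reply, attempts = send_with_retransmission("ping", channel)
print(reply, attempts)
```

Because the request stays buffered at the sender until a reply arrives, it can be resent unchanged after each timeout.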

Simplification due to idempotency

• Idempotency simplifies IPC protocols

– Definition: An idempotent computation is one that yields the same result if recomputed

* i := 5 is idempotent

* i := i + 1 is not idempotent

– A duplicate message can be reprocessed if its processing involves idempotent computations

* Duplicate messages need not be discarded!

▪ It simplifies an IPC protocol

▪ However, it may make the protocol slower

* Read / write operations on files are idempotent

▪ Distributed file systems can omit discarding of duplicates
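
The two assignments used in the definition can be checked directly by processing a message twice. This is a small sketch; the message format `(op, value)` and the function `apply_messages` are assumptions made for illustration.

```python
def apply_messages(messages, state):
    """Process every message, duplicates included, and return the final state."""
    for op, value in messages:
        if op == "set":          # idempotent: i := value
            state["i"] = value
        elif op == "add":        # not idempotent: i := i + value
            state["i"] += value
    return state


# The same message delivered once, then delivered twice.
set_once = apply_messages([("set", 5)], {"i": 0})["i"]
set_twice = apply_messages([("set", 5), ("set", 5)], {"i": 0})["i"]

add_once = apply_messages([("add", 1)], {"i": 0})["i"]
add_twice = apply_messages([("add", 1), ("add", 1)], {"i": 0})["i"]

print(set_once, set_twice)   # same value either way
print(add_once, add_twice)   # differs when the duplicate is reprocessed
```

Reprocessing the duplicate `set` message is harmless, whereas reprocessing the duplicate `add` message changes the result, which is why only idempotent requests can skip duplicate detection.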

IPC protocols

• Q: Analyze the RRA and RR protocols and determine

– their buffering requirements

* RRA

▪ Destination site needs one buffer for each sender process

▪ Releases buffer on receiving ack or next request

* RR

▪ Sender may send more requests before receiving ack

▪ Destination site needs many buffers for same sender process

▪ When can a buffer be released?

– their semantics

* Both RRA and RR: Basically at-least-once semantics

* Provide exactly once semantics if duplicate requests are discarded

Distributed computation paradigms

• A distributed application may organize data in several ways

* Parts of data are kept in different nodes

A distributed computation paradigm provides effective support for distributed computations

Modes of accessing remote data

• Data in a distant node can be accessed in three ways

– Remote data access

* Data is accessed over the network

* Slows down operation of the application

– Data migration

* Data is moved to the site of the application

* May complicate management of replicated data

– Process migration

* An application process is moved to the site of the data

Distributed Computation Paradigms

• Features of three paradigms

– Client–server computing

* A client invokes the server, the server provides a service

* Used for remote data access; not suitable for distributed computing

– Remote procedure call (RPC)

* The remote procedure is installed by a system administrator and registered with a name server

* Provides exactly-once semantics; used for distributed computing

– Remote evaluation

* Program uses the statement: At <node> eval <code_segment>

* Compiler makes provision to transfer <code_segment> to <node>

* <code_segment> is executed at remote node and results returned

RPC implementation

• Client stub marshals the parameters, converts them to a machine-independent form, and consults the name server to find the location of the remote procedure

• Server stub extracts parameters, invokes procedure, packs the results
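
The division of work between the two stubs can be sketched as below. This is an illustration under assumptions, not the code an interface compiler would generate: the procedure `remote_add`, the dictionary `NAME_SERVER`, the node name `node-7`, and the in-process `send` function that stands in for the network are all hypothetical. Network byte order from Python's `struct` module is used here as the machine-independent form.

```python
import struct

NAME_SERVER = {"add": "node-7"}          # hypothetical registry: procedure -> node


def remote_add(a, b):
    """The remote procedure itself."""
    return a + b


def server_stub(request: bytes) -> bytes:
    a, b = struct.unpack("!ii", request)         # extract the parameters
    result = remote_add(a, b)                    # invoke the procedure
    return struct.pack("!i", result)             # pack the result


def send(node: str, request: bytes) -> bytes:
    # In-process stand-in for transmitting the request to `node`.
    return server_stub(request)


def client_stub(a: int, b: int) -> int:
    node = NAME_SERVER["add"]                    # locate the remote procedure
    request = struct.pack("!ii", a, b)           # marshal in network byte order
    reply = send(node, request)
    (result,) = struct.unpack("!i", reply)       # unmarshal the result
    return result


print(client_stub(2, 3))
```

The caller of `client_stub` sees an ordinary function call; the byte-level packing and the name lookup stay hidden inside the stubs.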

• Case studies

– Sun RPC

* Designed for client–server computing; has at-least-once semantics

* Interface language (XDR) and interface compiler (rpcgen)

▪ Rpcgen produces stubs, remote procedure and a header file

▪ Remote procedure accepts a single parameter

– Java RMI

* The server creates a remote object whose methods offer services

* Services are registered with the rmiregistry name server

* Client consults rmiregistry for a service, obtains a handle for it

* A serializable object can be passed as a parameter, and the service can invoke its methods (resembles remote evaluation)

Types of networks

• The LAN is confined to a laboratory, a building or a cluster of buildings

• The WAN connects geographically distant nodes

Network topologies

• The star topology has a single point of failure

• Bidirectional ring can tolerate link failures, but not failure of intermediate nodes

• Fully and partially connected topologies offer tolerance of link and node failures

* A token circulates over the ring, has a free / busy flag

* Station transmits when it sees token with free flag

– Asynchronous transfer mode (ATM) technology

* Cell (i.e., packet) size is 53 bytes: a compromise between data and audio applications

* Uses virtual paths: specific bandwidth is reserved on physical links

* Virtual channel is given specific bandwidth in a virtual path

An ATM switch

• Virtual path id is ‘translated’ by the switch: VPI should be unique only in a link
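
The translation step can be pictured as a table lookup keyed by the incoming link and VPI. This is a rough sketch: the table contents, port numbers, and the dictionary representation of a cell are made up for illustration, and a real switch does this in hardware on the cell header.

```python
# (input port, incoming VPI) -> (output port, outgoing VPI); values are made up.
translation_table = {
    (0, 5): (2, 9),
    (1, 5): (3, 5),   # the same VPI on another link denotes a different path
}


def switch_cell(in_port, cell):
    """Forward a cell: look up its VPI for this link and rewrite it."""
    out_port, out_vpi = translation_table[(in_port, cell["vpi"])]
    forwarded = dict(cell, vpi=out_vpi)   # rewrite only the VPI field
    return out_port, forwarded


out_port, cell = switch_cell(0, {"vpi": 5, "payload": b"x" * 48})
print(out_port, cell["vpi"])
```

Because the key includes the input port, VPI 5 can be reused on two different links without ambiguity.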

Connection strategies

• A connection strategy

– decides when, and for how long, to set up a connection

* A connection is a data path between processes

– It is also called a switching technique

– It influences communication efficiency and throughput of links

– Three connection strategies

Connection strategies

(a) All messages between the processes use the same connection

(b) A connection is set up for each message

(c) A connection is set up for each packet in a message

Routing strategies

• The routing function decides which network path would be used by a connection

– It enables the system to adapt to changing traffic patterns

– Three routing strategies

Routing strategies

(a) Same path is used for communication between all processes in a pair of nodes

(b) A path is chosen for communication between a pair of processes

(c) A path is chosen for each message or each packet

Network Protocols

• A network protocol is a set of rules and conventions used to implement communication

– It addresses four concerns

* Naming of sites

* Efficient name resolution

* Communication efficiency

* Handling of faults

– It consists of a hierarchy of protocols that address specific concerns. Hence it is called a protocol stack

* The ISO protocol consists of 7 protocols

* The TCP / IP protocol consists of 4 protocols

The ISO protocol

• The ISO protocol consists of 7 protocol layers

– Physical layer

* Provides electrical mechanisms for bit transmission

– Data link layer

* Collects bits into frames, performs error detection and flow control

The ISO protocol

• The ISO protocol (contd)

– Session layer

* Initiates and terminates sessions between processes

* Provides for restart and recovery

Operation of the ISO protocol stack
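
As a rough illustration of how a message moves through a layered stack, the sketch below lets each layer on the sending side prepend a header and each layer on the receiving side strip its own header in the opposite order. It is a simplification under assumptions: the bracketed text headers and the function names `send_down` and `pass_up` are invented for the example, real headers are binary, and the physical layer is modelled as adding nothing.

```python
# Layers above the physical layer, listed top to bottom.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link"]


def send_down(message: str) -> str:
    """Each layer, top to bottom, prepends its own header."""
    for layer in LAYERS:
        message = f"[{layer}]{message}"
    return message          # what the physical layer would transmit as bits


def pass_up(frame: str) -> str:
    """The receiving stack strips headers in the opposite order."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        assert frame.startswith(header), f"unexpected header at {layer}"
        frame = frame[len(header):]
    return frame


frame = send_down("hello")
print(frame)
print(pass_up(frame))
```

The outermost header belongs to the lowest layer, so each layer on the receiving side sees only the header written by its peer.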

The transmission control protocol / internet protocol (TCP / IP) stack

• IP is a connectionless, unreliable protocol for communication between hosts

• TCP is a connection-oriented, reliable protocol

• UDP is a connectionless, unreliable protocol

The TCP / IP protocol stack

• Features of the protocols

– TCP

* Connection-oriented, reliable protocol

* Uses a virtual circuit between processes, retransmits on time-out

* Employs flow control so that the sender does not send messages faster than the receiver can accept them

▪ It controls retransmission overhead

– UDP

* Connection-less, unreliable protocol

* Used in multi-media applications and video conferencing because loss of packets is tolerable

– Higher layer protocols

* File transfer, SMTP (e-mail), remote login, etc.

Network bandwidth and latency

• Bandwidth

– Rate at which data is transferred over the network

* Depends on link capacities, error rates and delays

• Latency

– Elapsed time between sending and receiving of a byte

* Processing time in protocol layers and delays due to network congestion contribute to it

* Typically computed for the first of the bytes to be transferred
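
A common first-order way to combine the two quantities is to estimate the transfer time as latency plus size divided by bandwidth. This formula is an assumption added here for illustration and is not stated in the chapter; the function name `transfer_time` and the link figures are hypothetical.

```python
def transfer_time(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Estimated time until the last byte arrives (first-order model)."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


# Hypothetical link: 1 MB/s bandwidth, 50 ms latency.
small = transfer_time(100, 1_000_000, 0.05)         # dominated by latency
large = transfer_time(10_000_000, 1_000_000, 0.05)  # dominated by bandwidth
print(f"{small:.4f} s, {large:.2f} s")
```

Under this model, short messages are limited mainly by latency and long transfers mainly by bandwidth.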

Modeling a distributed system

• A distributed system is modeled as a graph

S = (N, E)

where N is the set of nodes and E is the set of edges

– Two kinds of graph models are used

* In a physical model, each node is a computer system and each edge is a communication link

* In a logical model, each node is a process and each edge is an interprocess communication channel

▪ It is used to model a distributed computation
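
Both kinds of model can be written down as a pair of sets. The sketch below is only an illustration; the node names, the edges, and the helper `neighbours` are assumptions made for the example.

```python
# Physical model: nodes are computer systems, edges are communication links.
physical_nodes = {"A", "B", "C"}
physical_edges = {frozenset({"A", "B"}), frozenset({"B", "C"})}

# Logical model: nodes are processes, edges are IPC channels.
logical_nodes = {"P1", "P2", "P3"}
logical_edges = {frozenset({"P1", "P3"})}


def neighbours(node, edges):
    """Nodes that share an edge with `node` in an undirected graph."""
    return {other for edge in edges if node in edge for other in edge - {node}}


print(sorted(neighbours("B", physical_edges)))
print(sorted(neighbours("P2", logical_edges)))
```

Storing each edge as an unordered pair reflects that a link or channel is modelled here as usable in both directions.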

Uses of system models

• System models are used to determine properties of a system or computation

– Impact of faults

* Minimum number of faults that would partition the system

– Resiliency

* k-resiliency: k is the largest number of faults which a system can withstand without disruption in its functioning

– Latency between nodes

* Minimum latency depends on minimum number of links on a path between two nodes

– Cost of sending information to all nodes

* Number of messages needed

* Used to determine complexity of algorithms
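
Two of these properties can be computed on a small physical model with a breadth-first search. This is a sketch under assumptions: the four-node ring, the adjacency-list representation, and the helper names `hop_counts` and `without_link` are chosen for illustration, and only link faults are considered.

```python
from collections import deque


def hop_counts(adj, start):
    """Breadth-first search: minimum number of links from `start` to each node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def without_link(adj, a, b):
    """Copy of the graph with the link a-b failed."""
    return {n: [m for m in nbrs if {n, m} != {a, b}] for n, nbrs in adj.items()}


# Hypothetical ring of four nodes.
ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}

min_links = hop_counts(ring, "A")["C"]                  # links on a shortest path
one_fault = without_link(ring, "A", "B")
connected_after_one = len(hop_counts(one_fault, "A")) == len(ring)
two_faults = without_link(one_fault, "C", "D")
connected_after_two = len(hop_counts(two_faults, "A")) == len(ring)
print(min_links, connected_after_one, connected_after_two)
```

In this ring a single link fault leaves every node reachable, while this particular pair of link faults partitions the system, so two is the minimum number of link faults that can partition it.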

Design issues in distributed operating systems

• Distributed nature of the computing environment raises five significant issues

– Transparency of resources and services

* Transparency: resource names should not depend on their locations

* It simplifies access from different locations

– Distribution of control functions

* OS functions are performed in many nodes to ensure reliability

– System performance

* Load balancing is performed to obtain resource efficiency

* Special techniques like file caching are used for scalability

Design issues in distributed operating systems

• Design issues (contd)

– Reliability

* Redundancy of resources is exploited to provide fault tolerance

* Special techniques like two phase commit are used

– Security

* An intruder may corrupt interprocess messages over the network

* Special techniques are used for message security and authentication
