Chapter 18 - Recovery and fault tolerance. This chapter discusses recovery and fault tolerance techniques used in a distributed operating system. Resiliency, which is a technique for minimizing the impact of a fault, is also discussed.
OS control functions in a distributed environment
• Special features of distributed OS control functions
– Termination detection
* Check whether all processes of a computation, which may operate
in different computers, have completed
– Election
* Elect a coordinator for a privileged function like resource allocation
Nature of a distributed control algorithm
• A distributed control function offers services to both
system and user processes
– It operates in parallel with its clients
• The following terminology is used to differentiate between
the distributed control algorithm and its clients
– Basic computation: Operation of a client
* Interprocess messages used by it are called basic messages
– Control computation: Operation of the control algorithm
* Interprocess messages exchanged in the control computation are
called control messages
– Basic part and control part of a process
Basic and control parts of a process Pi
• The basic part of Pi interacts with basic parts of other processes through
basic messages; analogously for the control part of Pi
• The control part provides services such as resource allocation to the
basic part
Correctness of a distributed control algorithm
• Processes of a distributed control algorithm exchange
control data and coordinate through control messages
– New correctness issues arise because
* Exchange of control messages incurs delays
Control data used in processes may become stale or may appear inconsistent
– Hence correctness has two new facets
Liveness and safety of distributed control algorithms
• Liveness: the system eventually performs its intended actions, e.g., a process requesting entry to a CS eventually enters it
• Safety: the system never enters an undesirable state, e.g., at most one process is in a CS for a data item at any time
Distributed mutual exclusion algorithms
• At any time, at most one process may be in a CS for a
data item ds
– Permission-based algorithms
* A process seeks the permission of a set of processes and enters a
CS only when all processes in the set have granted the permission
Ricart–Agrawala algorithm
• Steps of the algorithm
1 Process wishing to enter CS sends time-stamped requests to all other processes
2 When a process receives a request
a If it is not interested in entering CS, it sends a ‘go ahead’
immediately
b If it is also interested in entering CS, it sends a ‘go ahead’ only if the received request’s time-stamp < its own time-stamp
c If it is in a CS, it adds the request to the pending list
3 When a process receives n − 1 ‘go ahead’ replies, it enters CS
4 When a process exits a CS, it sends ‘go ahead’ replies to each request in its pending list
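The decision rule in step 2 can be shown as a minimal single-threaded Python sketch; the class and attribute names are illustrative, not from the text, and message passing is replaced by direct method calls.

```python
# Minimal sketch of the Ricart–Agrawala decision rule (steps 2a–2c).

class Process:
    def __init__(self, pid):
        self.pid = pid
        self.wants_cs = False
        self.in_cs = False
        self.req_ts = None          # timestamp of own pending request
        self.pending = []           # deferred requesters (steps 2b, 2c)

    def receive_request(self, ts, sender):
        """Return True to send 'go ahead' now, else defer."""
        if self.in_cs:
            self.pending.append(sender)            # step 2c
            return False
        if self.wants_cs:
            # step 2b: grant only if the request is older (ties broken by pid)
            if (ts, sender) < (self.req_ts, self.pid):
                return True
            self.pending.append(sender)
            return False
        return True                                # step 2a

# P1 (ts=1) and P2 (ts=2) both want the CS; P1's older request wins.
p1, p2 = Process(1), Process(2)
p1.wants_cs, p1.req_ts = True, 1
p2.wants_cs, p2.req_ts = True, 2
print(p2.receive_request(1, 1))   # True: P2 grants P1's older request
print(p1.receive_request(2, 2))   # False: P1 defers P2's newer request
```

Breaking timestamp ties by process id is what makes the request ordering total, so two processes can never each defer the other.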
Basic and control actions in Ricart–Agrawala algorithm
(Figure: steps 1, 2(b), and 3 of the algorithm marked in the basic and control parts of a process)
Maekawa algorithm
• Each process has a request set of processes; it seeks the permission of only processes in the request set (Ri represents the request set of process Pi)
– Correctness is ensured through the following rules:
* For all Pi : Pi is included in Ri
* For all Pi , Pj : Ri ∩ Rj is non-null
– The algorithm requires 2 × √n messages per CS entry
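Grid quorums are one standard way to build request sets satisfying both rules: arrange the n = k × k processes in a grid and let Ri be the union of Pi's row and column. This construction is an illustration (the text does not prescribe it), but the two rules can be checked mechanically:

```python
import math

def grid_request_sets(n):
    """R_i = row(i) ∪ column(i) in a k x k grid, k = sqrt(n).
    |R_i| = 2*sqrt(n) - 1, and any two sets intersect (they always
    share the cell at row(i) x column(j))."""
    k = math.isqrt(n)
    assert k * k == n, "sketch assumes n is a perfect square"
    sets = []
    for i in range(n):
        r, c = divmod(i, k)
        row = {r * k + j for j in range(k)}
        col = {j * k + c for j in range(k)}
        sets.append(row | col)
    return sets

R = grid_request_sets(16)
print(len(R[0]))                                        # 7 = 2*4 - 1
print(all(i in R[i] for i in range(16)))                # rule 1 holds
print(all(R[i] & R[j] for i in range(16) for j in range(16)))  # rule 2 holds
```

The quorum size of about 2√n per process is what keeps the per-entry message count proportional to √n rather than n.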
Token-based algorithm for a ring
• Only the process holding the token can enter a CS
– An abstract ring is superimposed on a system (see next slide)
– A process wishing to enter a CS sends its request along the ring and enters the CS only when it receives the token
* A process not holding the token simply forwards the request to the next process
* If the process holds the token and is not in a CS
It sends the token to the next process
The token travels over the ring until it reaches the requester
* If the process holds the token and is in a CS
The request is entered in a request queue in the token
When the process finishes the CS, it sends token to the next process for delivery to the first process in its request queue
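The ring behavior above can be condensed into a single-threaded sketch in which the ring, the token's position, and the token's request queue are modeled directly (all names are illustrative):

```python
from collections import deque

class Ring:
    def __init__(self, n, holder):
        self.n = n
        self.holder = holder          # process currently holding the token
        self.in_cs = None             # process currently in the CS, if any
        self.queue = deque()          # request queue carried in the token

    def request(self, pid):
        if self.in_cs is not None:    # holder is in a CS: queue the request
            self.queue.append(pid)
        else:
            # token travels along the ring until it reaches the requester
            while self.holder != pid:
                self.holder = (self.holder + 1) % self.n
            self.in_cs = pid

    def exit_cs(self):
        self.in_cs = None
        if self.queue:                # deliver token to first queued requester
            self.request(self.queue.popleft())

ring = Ring(6, holder=4)
ring.request(2); print(ring.in_cs)   # 2: token moved 4->5->0->1->2
ring.request(5)                      # P2 still in CS: P5 queued in the token
ring.exit_cs(); print(ring.in_cs)    # 5: token passed on, P5 enters CS
```

Carrying the request queue inside the token is the detail that lets the exiting process forward the token directly toward the next waiting requester.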
Token-based algorithm for a ring
(a) The system
(b) Abstract ring for the system: P4 has the token, requests by P2 and
P6 exist in the token’s request queue
Raymond’s token-based algorithm
• Features of the algorithm
– The algorithm uses an abstract inverted tree to reduce the
number of messages. It has three invariants:
* Each process in the system belongs to the tree
* The token holder is the root of the tree
* Each process other than the token holder has exactly one out-edge, which points
to its parent in the tree
– Each process has a local request queue
* When it receives a request, it puts the requestor’s id in the queue
* When it makes a request, it puts its own id in the queue
Raymond’s token-based algorithm
(a) A system
(b) Abstract inverted tree for the system: P5 holds the token
Raymond’s token-based algorithm
1 Process wishes to make a request:
a It enters the request in its local queue
b Sends it on the out-edge if it has not sent a request earlier
2 When a process receives a request:
a It enters the request in its local queue
b Sends it on the out-edge if it has not sent a request earlier
3 When a process completes execution of a CS:
a It removes the first requester id from its queue
b Sends the token to that process and inverts the edge to it
4 When a process receives the token:
a It removes the first requester id from its queue
b If it is its own id, it enters a CS; otherwise, it sends the token to that
process and inverts the edge to it
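Steps 1–4 can be sketched as follows; `parent[]` holds each process's out-edge (`None` for the token holder) and `queue[]` the local request queues. Token receipt and CS completion are fused into one `release()` loop for brevity, and all names are illustrative:

```python
class Raymond:
    def __init__(self, parent, holder):
        self.parent = dict(parent)              # process -> parent in tree
        self.holder = holder                    # current token holder (root)
        self.queue = {p: [] for p in self.parent}

    def request(self, pid, child=None):
        """pid enqueues its own request, or one received from a child."""
        entry = pid if child is None else child
        send = not self.queue[pid]              # no request sent earlier
        self.queue[pid].append(entry)
        if self.parent[pid] is not None and send:
            self.request(self.parent[pid], pid) # forward on the out-edge

    def release(self):
        """Holder leaves the CS; the token follows queued requests down,
        inverting edges, until a process finds its own id and enters."""
        pid = self.holder
        while self.queue[pid]:
            nxt = self.queue[pid].pop(0)
            if nxt == pid:                      # its own id: enter the CS
                self.holder = pid
                return pid
            self.parent[pid], self.parent[nxt] = nxt, None  # invert edge
            pid = nxt
        self.holder = pid
        return pid

# P5 holds the token; P3 and then P1 request it
r = Raymond({1: 3, 3: 4, 4: 5, 5: None}, holder=5)
r.request(3); r.request(1)
first, second = r.release(), r.release()
print(first, second)        # 3 1: P3 enters the CS, then P1
```

Note how P1's request stops at P3, whose queue was already non-empty; edge inversion keeps the holder at the root, preserving the invariants above.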
An example of Raymond's algorithm
(a) Process P5 is in CS. Requests made by P3 and P1 have reached it
(b) Process P5 finishes execution of CS and passes token to P4
Distributed deadlock handling
• A distributed computation may wish to use resources
located in many nodes of the system
– Information about allocated resources and pending requests in many nodes has to be collected
– Correctness problems may arise due to
* Delays in obtaining information
* Inconsistency of information
– Consider building a global wait-for graph (WFG) by collecting
information about wait-for relationships in all nodes
* Inconsistent information due to delays may cause phantom
deadlocks, i.e., declaration of deadlock when none exists
Diffusion computation-based deadlock detection
• Diffusion computation: used to collect info about nodes
– Diffusion phase
* A computation that originates in one node spreads to other nodes
A control message called a query is sent along each edge
The first query received by a node is called an engaging query
On receiving it, the node sends queries along all its out-edges
– Information collection phase
* Each node sends information in response to each query
It sends a dummy reply that contains null information for a
non-engaging query
It collects information from all replies it received, adds its own information, and sends the result as the reply to the engaging query
Diffusion computation-based deadlock detection
1 When a process becomes blocked on a resource request: It initiates a diffusion computation as follows:
a Send queries along all out-edges in WFG
b Remember the number of queries sent; await replies
c After receiving a matching number of replies, declare a deadlock if
it has been continuously in blocked state after sending queries
2 When a process receives an engaging query: If it is blocked, it performs the following actions:
a Send queries along all out-edges in WFG
b Remember the number of queries sent; await replies
c After receiving a matching number of replies, it computes and sends
the reply to its engaging query, if it has been continuously blocked after sending queries
3 When a process receives a non-engaging query:
a Send dummy reply if continuously blocked after sending queries
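The steps above can be sketched over a static wait-for graph (WFG). An active process never replies, modeled as a `False` (missing) reply; blocking is frozen for the duration of the check, which stands in for "continuously blocked". All names are illustrative:

```python
def query(wfg, blocked, node, engaged):
    """True iff `node` eventually replies to this query."""
    if node not in blocked:
        return False                   # active process: no reply ever
    if node in engaged:
        return True                    # non-engaging query: dummy reply
    engaged.add(node)                  # engaging query: spread further
    return all(query(wfg, blocked, m, engaged) for m in wfg[node])

def detects_deadlock(wfg, blocked, initiator):
    engaged = {initiator}
    return all(query(wfg, blocked, m, engaged) for m in wfg[initiator])

wfg = {1: [2], 2: [3], 3: [1], 4: []}
d1 = detects_deadlock(wfg, blocked={1, 2, 3}, initiator=1)  # blocked cycle
wfg[3] = [4]                           # P3 now waits for active P4
d2 = detects_deadlock(wfg, blocked={1, 2, 3}, initiator=1)
print(d1, d2)   # True False
```

The second case shows why no phantom deadlock is declared: the query reaching active P4 is never answered, so the initiator never collects a matching number of replies.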
Illustration of diffusion computation-based distributed deadlock detection
• P2, P3 are blocked. P1 becomes blocked and sends a query but does
not receive a reply because P4 is not blocked
• P4 requests a resource held by P1, becomes blocked and sends a query.
It would receive a reply and declare a deadlock
Mitchell–Merritt algorithm for distributed deadlock detection
• It is an edge chasing algorithm—control messages are
sent over WFG edges to detect cycles
– A provision is made to ensure that the cycle has not been
broken before it was detected
* Each process is assigned a public label and a private label
The labels are identical when a process is created
The public label of a process changes when it gets blocked on a resource
Rules of Mitchell–Merritt algorithm
• Block rule changes the labels of a process when it blocks; z = inc(u, x),
where inc generates a unique label larger than both u and x
• The transmit rule changes the public label of a process waiting for a process
that has a larger public label: it copies that label. A deadlock is declared when a
process finds that its own label has returned to it along a cycle in the WFG
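The block and transmit rules can be sketched as follows. `inc()` is modeled with a global counter so each new label exceeds all labels issued so far, and the detect condition is simplified to "public label equals private label and equals the waited-for process's public label"; all names are illustrative:

```python
import itertools

_counter = itertools.count(1)

class Proc:
    def __init__(self):
        self.public = self.private = 0      # labels identical at creation

def inc(u, x):
    return max(u, x) + next(_counter)       # unique label larger than both

def block(p, q):
    """Block rule: p blocks waiting for q and takes a fresh larger label."""
    p.public = p.private = inc(p.public, q.public)

def transmit(p, q):
    """Transmit rule: p, waiting for q, copies q's larger public label.
    Detect: if p's own label (public == private) has come back to it
    through the process it waits for, a cycle exists."""
    if q.public > p.public:
        p.public = q.public
        return False
    return p.public == p.private == q.public

p1, p2, p3 = Proc(), Proc(), Proc()
block(p1, p2); block(p2, p3); block(p3, p1)     # cycle P1->P2->P3->P1
transmit(p1, p2); transmit(p2, p3); transmit(p1, p2)
detected = transmit(p3, p1)
print(detected)     # True: P3's label traveled round the cycle
```

Only the process whose label makes the full round declares the deadlock, which is why a broken cycle is never reported: a label cannot return to its originator unless every edge of the cycle still exists.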
Distributed deadlock prevention
• Cycles are prevented as follows:
– A pair (local time, node id) is used to time-stamp creation of a process
– Resource conflicts are resolved using these timestamps, e.g., through wait–die or wound–wait rules, so that wait-for edges can point in only one direction and cycles cannot form
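A wait-die style rule over such timestamps can be sketched in a few lines (the specific wait-die choice is an assumption for illustration; the text only fixes the timestamp format):

```python
# Wait-die sketch: an older requester may wait for a younger holder;
# a younger requester is aborted ("dies"). Wait-for edges therefore
# always point from older to younger processes, so no cycle can form.

def on_conflict(requester_ts, holder_ts):
    """Both timestamps are (local_time, node_id) pairs; smaller = older.
    Tuple comparison gives a total order even when local times tie."""
    return "wait" if requester_ts < holder_ts else "abort"

print(on_conflict((10, 1), (12, 2)))   # 'wait': older requester waits
print(on_conflict((15, 3), (12, 2)))   # 'abort': younger requester dies
```

Appending the node id to the local time is what makes timestamps unique across nodes without any clock synchronization.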
Distributed scheduling algorithms
• Computational load in nodes is balanced through the
technique of process migration
Distributed scheduling algorithms
• Issues in distributed scheduling
– Kinds of process migration
* Preemptive migration requires transfer of state—hard to implement
* Non-preemptive migration is performed while creating a process—avoids need to transfer state
– Identifying nodes for process migration by quantifying ‘load’
* Heavily loaded nodes become sender nodes, lightly loaded nodes become receiver nodes
Use CPU utilization as the criterion: causes overhead
Use length of ready queue: easier to implement
– Stability of a scheduling algorithm: An algorithm is unstable if, under some conditions, its overhead is unbounded
* Excessive shuffling of processes between nodes causes instability
Distributed scheduling algorithms
• Three kinds of distributed scheduling algorithms
– Sender initiated algorithms
* Thresholds on load are used to identify senders and receivers
* A sender node migrates a process non-preemptively at its creation
Sender node polls other nodes to identify a receiver node
Instability at high load: prevent by limiting the amount of polling
– Receiver initiated algorithm
* When a process completes, the node checks whether it has become a receiver and migrates a process preemptively to itself
* No instability
At high load, senders are easy to find
Distributed scheduling algorithms
• Three kinds of distributed scheduling algorithms (contd)
– Symmetrically initiated algorithms
* Has features of both sender and receiver initiated algorithms
Behaves like sender initiated algorithm at low loads
Behaves like receiver initiated algorithm at high loads
– Outline of a symmetrically initiated algorithm
* Each node maintains lists of senders, receivers and ok nodes
* A sender node polls nodes in the receivers list
If the node is a receiver node, a process is migrated
If the node is not a receiver, it is put into appropriate list
* Analogously, a receiver node polls nodes in the senders list
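The symmetrically initiated outline can be sketched as below: nodes are classified by ready-queue length against two thresholds (the threshold values are assumed for illustration), and a sender polls its receivers list, refiling stale entries. All names are illustrative:

```python
LOW, HIGH = 2, 5        # illustrative ready-queue thresholds

def classify(queue_len):
    if queue_len > HIGH:
        return "sender"
    if queue_len < LOW:
        return "receiver"
    return "ok"

def poll_receivers(loads, receivers):
    """Poll receivers in order; migrate one process to the first node
    that really is a receiver, dropping mis-listed nodes from the list."""
    for node in list(receivers):
        if classify(loads[node]) == "receiver":
            loads[node] += 1                # migrate one process to it
            return node
        receivers.remove(node)              # stale entry: refile the node
    return None

loads = {"A": 7, "B": 4, "C": 1}
receivers = ["B", "C"]
target = poll_receivers(loads, receivers)
print(classify(loads["A"]), target)     # sender C (B turned out 'ok')
```

A symmetric `poll_senders` on a receiver node would mirror this loop; keeping the lists fresh through polling is what lets the algorithm behave sender-initiated at low load and receiver-initiated at high load.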
Performance of distributed scheduling algorithms
Distributed termination detection
• Processes of a distributed computation execute in
different nodes of a distributed system
– These processes perform work assigned to them
* A process is active when it is performing work, and passive when it
has no work
* Work is assigned to a process through a message
Hence it may become active on receiving a message
– The distributed termination condition (DTC) holds when such a
computation has completed. It consists of two parts:
* All processes of the computation are passive
* No basic messages are in transit
Distributed termination detection
• A diffusion computation-based algorithm
– Following assumptions are made
* Processes are not created or destroyed dynamically during operation of the algorithm
* Interprocess communication channels are FIFO
* Processes communicate through synchronous communication, i.e., the process sending a message is blocked until a response is
received
Distributed termination detection—A diffusion computation-based algorithm
1 When a process becomes passive
a Sends “shall I declare termination?” messages on all edges
b After receiving replies to all messages: Declares termination if all replies are “yes”
2 When a process receives an engaging query
a Send queries on all edges, except the one along which it received engaging query
b After receiving replies to all messages: Send a “yes” reply to process from which it received engaging query if all received replies are “yes”; otherwise, send a “no” reply
3 When a process receives a non-engaging query
a Send a “yes” reply
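Under the stated assumptions (synchronous communication, so no basic messages are in transit), the steps above reduce to a recursive reply collection over a static snapshot; all names are illustrative:

```python
def ask(edges, passive, node, visited):
    """Reply to a query: 'yes' (True) or 'no' (False)."""
    if node in visited:
        return True                    # non-engaging query: reply "yes"
    visited.add(node)                  # engaging query: spread further
    if node not in passive:
        return False                   # active process: reply "no"
    return all(ask(edges, passive, m, visited) for m in edges[node])

def declare_termination(edges, passive, initiator):
    visited = {initiator}
    return initiator in passive and all(
        ask(edges, passive, m, visited) for m in edges[initiator])

edges = {1: [2, 3], 2: [3], 3: [1]}
t1 = declare_termination(edges, passive={1, 2, 3}, initiator=1)
t2 = declare_termination(edges, passive={1, 3}, initiator=1)
print(t1, t2)   # True False: P2 being active blocks the second declaration
```

The `visited` set plays the role of the engaging/non-engaging distinction: only the first query a node receives spreads further, so every edge carries at most one query.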
Election algorithms
• Algorithms for a ring topology
– Algorithm 1:
1 Process Pi initiates an election by sending an (“elect me”, Pi) message
2 Every process Pj receiving an (“elect me”, Pi) message sends an
(“elect me”, Pj) message and then forwards Pi’s message
3 Eventually, each process receives the “elect me” messages of every
other process; it elects the highest priority process as leader
a It sends a “new coordinator” message to inform others
– Algorithm 2: Refinement of algorithm 1
* In Step 2, Pj sends only one message: its own message if its
priority is higher than Pi’s; otherwise, it sends Pi’s message
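Algorithm 2 can be compressed into a single pass around the ring: one circulating message whose payload is replaced by a higher-priority id at each hop (priority == pid here, an assumption for illustration):

```python
def ring_election(pids, initiator_idx):
    """One circuit of the ring; the surviving message carries the winner."""
    n = len(pids)
    msg = pids[initiator_idx]            # ("elect me", pid) payload
    for step in range(1, n + 1):
        pj = pids[(initiator_idx + step) % n]
        msg = max(msg, pj)               # Pj forwards the higher-priority id
    return msg                           # message returning to the initiator

leader = ring_election([3, 7, 1, 5], initiator_idx=2)
print(leader)   # 7: highest priority process elected
```

Suppressing lower-priority messages is exactly the refinement: in the worst case (ids in descending ring order) it still costs O(n²) messages, but on average far fewer than Algorithm 1.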
Election algorithms
• Bully algorithm
1 A process Pi initiates an election by sending (“elect me”, Pi) messages
to all higher priority processes and starting a time-out interval T1
a If a time-out occurs, it sends a “new coordinator” message to lower priority processes
b If it receives a “don’t you dare” message from a higher priority
process Pj , it starts another time-out interval T2
i If a time-out occurs, it assumes that all higher priority processes have failed and starts a new election
2 When a process Pj receives an (“elect me”, Pi) message from a lower
priority process
a It sends a “don’t you dare” message to its sender
b Starts a new election by sending (“elect me”, Pj) messages
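The eventual outcome of these time-out driven exchanges can be sketched recursively: the initiator wins only if every higher priority process has failed, otherwise the lowest surviving higher-priority process takes over the election (priority == pid is an assumption for illustration):

```python
def bully(pids, failed, initiator):
    """Outcome of a bully election started by `initiator`."""
    higher = sorted(p for p in pids if p > initiator and p not in failed)
    if not higher:
        return initiator        # time-out T1 expires: no "don't you dare"
    # a higher-priority survivor sends "don't you dare" and restarts
    return bully(pids, failed, higher[0])

pids = {1, 2, 3, 4, 5}
c1 = bully(pids, failed={4, 5}, initiator=1)
c2 = bully(pids, failed=set(), initiator=3)
print(c1, c2)   # 3 5: the highest surviving process always wins
```

The name "bully" comes from this takeover behavior: a live higher-priority process always suppresses the election of any lower-priority one.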
Resource allocation in a distributed system
1. Pi requests the resource allocator for a specific resource
2. The resource allocator consults the name server and finds the id of the resource
3. The resource allocator informs the requester and the resource manager of the allocation
Process migration
• Process migration is performed for load balancing
– Difficulties
* Process state is distributed in various data structures of the OS
* Process id’s may change due to migration
Process id’s are used in interprocess communication
Solution: Use global process ids as in Sun cluster
* Delivery of messages requires a special provision
A node receiving a message would redirect it if the destination process has migrated out of it
» This residual state causes poor reliability
Alternatively, all processes may be informed when a process is migrated
» Requires a complex protocol