Chapter 17 - Distributed control algorithms. This chapter describes the notions of correctness of a distributed control algorithm, and presents algorithms for performing five control functions in a distributed OS - mutual exclusion, deadlock handling, leader election, scheduling, and termination detection.
Slide 2: Theoretical issues in distributed systems
• An OS uses two key notions to organize its operation
– Time and state
* Time is used to keep track of when an event occurred, or the order in which events occurred
* States of processes and resources are used in scheduling and resource allocation
– These notions are hard to use in a distributed system
* Computers have their own clocks, which might show different times
Hence time is not uniquely known
* Computers have their own memories
So states of entities might be spread throughout the system
– We need to develop practical substitutes for these notions
Slide 3: Local and global states
– We depict local and global states as follows:
* Local state of a process P_k at time t: s_k^t
* Global state of the system at time t: S_t
If a system contains n processes, its state is represented as { s_1^t, s_2^t, …, s_n^t }
Slide 4: Change of state
• The state changes as a result of an event
– An event could be the sending or receiving of a message
– We represent an event as follows:
< process id, old state, new state, event description, channel, message >
• For example, a send event e_i is < P_k, s, s’, send, c, m >
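The event tuple above maps naturally onto a small record type. The following is only an illustrative sketch; the field names are assumptions chosen to mirror the slide, not notation from the source.

```python
from typing import NamedTuple

class Event(NamedTuple):
    """One event, in the six-field form shown on the slide (field names are illustrative)."""
    process: str       # process in which the event occurs
    old_state: str     # local state before the event
    new_state: str     # local state after the event
    description: str   # "send" or "receive"
    channel: str       # channel used by the message
    message: str       # the message itself

# The send event from the slide: P_k moves from state s to s' by sending m over c.
e_i = Event("P_k", "s", "s'", "send", "c", "m")
print(e_i.description, e_i.message)
```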
Slide 5: Event precedence
• Event precedence indicates which event occurred before or after another event
– Precedence is used to know the order in which events occurred
* It is called event ordering
* e_i → e_j indicates that event e_i precedes e_j, i.e., occurred before it
– Precedence of events in a process is known
– A causal relationship is a cause-and-effect relationship
* The event corresponding to a cause occurs before that corresponding to its effect, e.g., sending and receiving of a message
* It is used to find precedence of events in different processes
– Event precedence is transitive
* If e_i → e_j and e_j → e_k, then e_i → e_k
Slide 6: Event precedence via timing diagram
• e23 → e12 because e23 is a send event for m1 and e12 is a receive event
• e22 → e23 because both events occur in process P2
• Hence e22 → e12. Similarly e22 → e13, e21 → e12, etc.
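The derived precedences on this slide follow from transitivity. A minimal sketch, assuming that events within a process are numbered in order of occurrence (so e21 → e22 → e23 and e12 → e13); the helper name `transitive_closure` is hypothetical.

```python
from itertools import product

def transitive_closure(direct):
    """Return every pair (a, b) such that a -> b follows from the direct pairs."""
    closure = set(direct)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

direct = {
    ("e21", "e22"), ("e22", "e23"),  # order of events inside P2 (assumed numbering)
    ("e12", "e13"),                  # order of events inside P1 (assumed numbering)
    ("e23", "e12"),                  # m1: the send in P2 precedes the receive in P1
}
precedes = transitive_closure(direct)
print(("e22", "e12") in precedes, ("e22", "e13") in precedes, ("e21", "e12") in precedes)
```

The three derived pairs are exactly those listed on the slide; the reverse pairs, such as e12 → e22, are not in the closure.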
Slide 7: Logical clocks
• Background
– A ‘global clock’ does not exist
– Computers have ‘local clocks’; these clocks can drift apart
– So we have a local clock in each process and synchronize local clocks when needed
* A process P_i has a local clock LC_i
* When P_k receives a message from P_i, synchronization is needed if the time in LC_k is smaller than the time in LC_i when P_i sent the message
» This is so because of the causal relationship
– Such clocks do not show ‘real’ time
* Hence they are called logical clocks
Slide 8
• When an event e_i occurs in a process P_i
* We represent the time-stamp of e_i as ts(e_i)
* The time-stamp of a send event is included in message m
Slide 9: Synchronization of logical clocks
• Clock synchronization rules
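A minimal sketch of such rules, assuming the usual Lamport-style scheme: a process advances its clock by one at every event, and on receiving a message it first moves its clock up to the time-stamp carried by the message if it is behind. The class and method names are illustrative, and the exact increments used in the original slides may differ.

```python
class LogicalClock:
    """Lamport-style logical clock for one process (assumed rules, see lead-in)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local or send event: advance the clock and return the event's time-stamp."""
        self.time += 1
        return self.time

    def receive(self, sent_ts):
        """Receive event: catch up with the sender if behind, then count the event."""
        self.time = max(self.time, sent_ts) + 1
        return self.time

lc_i, lc_k = LogicalClock(), LogicalClock()
for _ in range(4):
    lc_i.tick()              # four local events in P_i
ts_send = lc_i.tick()        # P_i sends m with time-stamp 5
lc_k.tick()                  # one local event in P_k, so LC_k = 1
ts_recv = lc_k.receive(ts_send)
print(ts_send, ts_recv)      # the receive is time-stamped after the send
```

Because LC_k was behind LC_i, the receive forces a synchronization, so the time-stamp of the receive event exceeds that of the send event.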
Slide 10: Synchronization of logical clocks
• The pair associated with an event shows clock times before and after
• Logical clock in P2 is synchronized when it receives message m1
• Logical clock in P1 is synchronized when P1 receives message m3
• Logical clock in P2 is synchronized again when P2 receives message m4
Slide 11: Time-stamps using logical clocks
• Can event precedence be determined by simply comparing time-stamps of events?
– Time-stamps are not unique
* Uniqueness can be obtained by using a pair (local time, process id) as the time-stamp
If local times are identical, process id breaks the tie
– ts(e_i) < ts(e_j) if e_i → e_j
* However, ts(e_i) < ts(e_j) is possible even if e_i does not precede e_j
* Hence mere examination of time-stamps is not adequate for obtaining event precedence
• Vector time-stamps provide a way out of this difficulty
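The pair-based time-stamp can be illustrated with tuples, which Python compares element by element. The three events below are hypothetical and are not taken from the timing diagram.

```python
# Time-stamps as (local time, process id): the process id breaks ties.
ts_a = (4, 1)   # hypothetical event in P1 at local time 4
ts_b = (4, 2)   # hypothetical event in P2 at local time 4
ts_c = (7, 3)   # hypothetical event in P3 at local time 7

ordered = sorted([ts_c, ts_b, ts_a])
print(ordered)
```

Sorting always succeeds because the pairs are unique, but the resulting order says nothing about causality: ts_a < ts_c holds here even if the two events are unrelated by any message.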
Slide 12: Vector clocks
• A vector clock contains n elements, where n is the number of processes
* VC_k[k] is the logical clock of P_k
* VC_k[l], l ≠ k, is the highest value in VC_l[l] that is known to P_k
* A vector time-stamp vts is obtained by copying the value of VC_k
• Clock synchronization rules
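A minimal sketch of vector clock maintenance, assuming the commonly used rules: a process increments its own element at every event, and on a receive it first takes the element-wise maximum with the vector time-stamp carried by the message. Names are illustrative and process indices are 0-based in the code.

```python
class VectorClock:
    """Vector clock VC_k of process P_k in a system of n processes (assumed rules)."""

    def __init__(self, n, k):
        self.k = k            # index of the owning process (0-based here)
        self.vc = [0] * n

    def tick(self):
        """Local or send event: advance VC_k[k]; the time-stamp is a copy of VC_k."""
        self.vc[self.k] += 1
        return list(self.vc)

    def receive(self, vts):
        """Receive event: merge the sender's knowledge, then count the event."""
        self.vc = [max(own, other) for own, other in zip(self.vc, vts)]
        return self.tick()

p1, p2 = VectorClock(3, 0), VectorClock(3, 1)
vts_send = p1.tick()             # P1 sends a message stamped [1, 0, 0]
p2.tick()                        # one local event in P2: [0, 1, 0]
vts_recv = p2.receive(vts_send)  # P2 learns VC_1[1] and counts the receive
print(vts_send, vts_recv)
```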
Slide 13: Synchronization of vector clocks
• Each triple is the vector time-stamp of an event
• P2 updates VC2[1] when it receives messages m1 and m4
• P3 updates VC3[1] and VC3[2] when it receives message m2
Slide 14: Time-stamps using vector clocks
• Precedence between events can be determined by comparing their time-stamps
– e_i precedes e_j
* If for all k: vts(e_i)[k] ≤ vts(e_j)[k] but vts(e_i)[l] ≠ vts(e_j)[l] for some l
– e_i follows e_j
* If for all k: vts(e_i)[k] ≥ vts(e_j)[k] but vts(e_i)[l] ≠ vts(e_j)[l] for some l
– e_i and e_j are concurrent
* If for some k, l: vts(e_i)[k] < vts(e_j)[k] and vts(e_i)[l] > vts(e_j)[l]
• Vector time-stamps have the following property:
– vts(e_i) < vts(e_j) if and only if e_i → e_j
* By contrast, a pair (local time, process id) provides a total order over events, but that order need not reflect precedence
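The comparison of vector time-stamps on this slide can be sketched as a single function. The function name and the returned labels are illustrative.

```python
def compare(vts_i, vts_j):
    """Classify the relation between two events from their vector time-stamps."""
    all_le = all(a <= b for a, b in zip(vts_i, vts_j))
    all_ge = all(a >= b for a, b in zip(vts_i, vts_j))
    if all_le and all_ge:
        return "identical"    # same time-stamp, i.e., the same event
    if all_le:
        return "precedes"     # every element <=, at least one differs
    if all_ge:
        return "follows"      # every element >=, at least one differs
    return "concurrent"       # some element smaller and some element larger

print(compare([1, 0, 0], [1, 2, 0]))
print(compare([1, 2, 0], [1, 0, 0]))
print(compare([2, 0, 0], [0, 1, 0]))
```

The three calls return "precedes", "follows" and "concurrent" respectively; the last pair is concurrent because neither time-stamp dominates the other.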
Slide 15: The state of a distributed system
• The state of a distributed system is the collection of states of individual computer systems, i.e., a collection of local states
– Local states in the collection may be recorded at different times
* Such states may be inconsistent
Q: How to ensure consistency of local states?
Slide 16: The state of a distributed system
• If $100 is transferred from account A to B, which exist in different nodes, the recorded states of A and B could be:
(a) Account A contains 900 dollars and B contains 300 dollars
(b) 800 and 400
(c) 800 and 300 ($100 is in transit)
(d) 900 and 400
Local states in (a), (b) and (c) are mutually consistent. In (d), they are not consistent.
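For this particular example, consistency can be checked by accounting for the money: whatever is missing from the two balances must be in transit, and that amount can never be negative or exceed the transfer. This is a simplified sketch tied to the example, not a general consistency test; the names are illustrative.

```python
TOTAL = 900 + 300   # money in the system before the transfer
TRANSFER = 100      # amount moved from A to B

def is_consistent(balance_a, balance_b):
    """A recording is plausible if the missing money could be in transit."""
    in_transit = TOTAL - (balance_a + balance_b)
    return 0 <= in_transit <= TRANSFER

recordings = {"a": (900, 300), "b": (800, 400), "c": (800, 300), "d": (900, 400)}
verdicts = {name: is_consistent(a, b) for name, (a, b) in recordings.items()}
print(verdicts)
```

Recording (d) fails because it shows 1300 dollars in a system that only ever held 1200: B has recorded the credit while A has not yet recorded the debit.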
Slide 17: Mutual consistency of local states
• Definition
– Local states of processes P_k and P_l are mutually consistent if
* Every message recorded as ‘received from P_l’ in P_k’s state is recorded as ‘sent to P_k’ in P_l’s state
* Every message recorded as ‘received from P_k’ in P_l’s state is recorded as ‘sent to P_l’ in P_k’s state
• An algorithm for recording the state of a system should ensure mutual consistency of local states of all processes
* It is assumed that any message that has been sent but not yet received by the destination process is ‘in the system’
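The definition can be checked mechanically once each local state lists the messages it has sent and received. The sketch below assumes a hypothetical representation in which a local state is a dict with "sent" and "received" sets of (peer, message) pairs.

```python
def mutually_consistent(k, state_k, l, state_l):
    """Check both conditions of the definition for processes k and l."""
    recv_k_from_l = {m for (src, m) in state_k["received"] if src == l}
    sent_l_to_k = {m for (dst, m) in state_l["sent"] if dst == k}
    recv_l_from_k = {m for (src, m) in state_l["received"] if src == k}
    sent_k_to_l = {m for (dst, m) in state_k["sent"] if dst == l}
    # Everything recorded as received must also be recorded as sent.
    return recv_k_from_l <= sent_l_to_k and recv_l_from_k <= sent_k_to_l

# Hypothetical recordings around a single message m1 from A to B.
a_before = {"sent": set(), "received": set()}
a_after = {"sent": {("B", "m1")}, "received": set()}
b_before = {"sent": set(), "received": set()}
b_after = {"sent": set(), "received": {("A", "m1")}}

print(mutually_consistent("A", a_after, "B", b_before))   # message in transit
print(mutually_consistent("A", a_before, "B", b_after))   # received but never sent
```

The first pair is consistent because a message that is sent but not yet received is simply ‘in the system’; the second pair is inconsistent because B records a receipt that A has not recorded as sent.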
Slide 18: A distributed computation for state recording
• Assumptions
– Processes communicate over interprocess channels
– Each channel is unidirectional and has unlimited buffering capacity
Slide 19: A timing diagram for the distributed computation
• States of the processes are recorded at times t_P1, …, t_P4
Q: Are these states mutually consistent?
Slide 20: A cut of a distributed system
• A cut in a timing diagram is a curve that connects the points at which states of processes are recorded
– Points to the left of the cut in the timing diagram are in the past of the cut
– Points to the right are in the future of the cut
– A cut represents a consistent state recording of a distributed system if the future of the cut is closed under the precedes relation on events, i.e., under →
* That is, if a → b and a is an event in the future of the cut, then b also lies in the future of the cut
Slide 21: Consistency of a cut
• Cuts C1 and C2 are consistent; cut C3 is inconsistent
• A cut is inconsistent if some message has a backward intersection with it, i.e., it is sent in the future of the cut but received in its past, e.g., the message from P3 to P1
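The backward-intersection test can be sketched directly. The representation is an assumption: a cut maps each process to the time at which its state is recorded, and a message is a tuple (sender, send time, receiver, receive time). The times used below are hypothetical and are not read off the timing diagram.

```python
def is_consistent_cut(cut, messages):
    """A cut is inconsistent if any message is sent in its future but received in its past."""
    for sender, t_send, receiver, t_recv in messages:
        received_in_past = t_recv <= cut[receiver]
        sent_in_future = t_send > cut[sender]
        if received_in_past and sent_in_future:
            return False      # backward intersection
    return True

messages = [("P3", 5, "P1", 7)]       # hypothetical message from P3 to P1

bad_cut = {"P1": 8, "P3": 4}          # P1 recorded after the receive, P3 before the send
good_cut = {"P1": 6, "P3": 6}         # the message crosses the cut forwards (in transit)
print(is_consistent_cut(bad_cut, messages), is_consistent_cut(good_cut, messages))
```

Checking messages is enough in this representation, because a cut with one recording point per process already respects the order of events inside each process.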
Slide 22: Chandy–Lamport algorithm for state recording
• Features of the algorithm
– Channels are assumed to be FIFO
* This property is used to identify messages that are in transit over a channel
– The state of a process indicates the messages sent and received by it
– A special message called a marker is sent to ask a process to record its state
* A process receives markers over all channels incident on it
It records the state of the channel over which it received the marker
If it is the first marker received by it, it also records its own state
Slide 23: Chandy–Lamport algorithm
• Steps in the algorithm
– When a process wishes to initiate the state recording
* It records its own state and sends markers over all outgoing channels
– When a process P_j receives a marker over an incoming channel Ch_ij
* If it is the first marker
Record its own state
Record state of Ch_ij as empty
Send markers over all outgoing channels
* Else record the state of the channel Ch_ij to contain messages (Received_ij – Recorded_received_ij)
Received_ij: messages received over Ch_ij until this instant
Recorded_received_ij: messages recorded as received over Ch_ij in P_j’s recorded state
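The steps above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions: the class `System`, its method names, the channel topology and the delivery order are all hypothetical and are not taken from the example in the slides.

```python
from collections import deque

MARKER = object()   # the special marker message

class System:
    def __init__(self, names, links):
        self.chan = {link: deque() for link in links}   # FIFO channel per (src, dst)
        self.sent = {p: [] for p in names}
        self.received = {link: [] for link in links}    # Received_ij
        self.rec_state = {}      # recorded process states
        self.rec_received = {}   # Recorded_received_ij
        self.rec_chan = {}       # recorded channel states

    def send(self, src, dst, msg):
        self.sent[src].append((dst, msg))
        self.chan[(src, dst)].append(msg)

    def record(self, p):
        """Record p's own state and send markers over all its outgoing channels."""
        self.rec_state[p] = {
            "sent": list(self.sent[p]),
            "received": {i: list(r) for (i, j), r in self.received.items() if j == p},
        }
        for (i, j) in self.chan:
            if j == p:   # remember what was already received on each incoming channel
                self.rec_received[(i, j)] = list(self.received[(i, j)])
            if i == p:
                self.chan[(i, j)].append(MARKER)

    def deliver(self, src, dst):
        """Deliver the oldest message waiting on channel (src, dst)."""
        msg = self.chan[(src, dst)].popleft()
        if msg is not MARKER:
            self.received[(src, dst)].append(msg)
        elif dst not in self.rec_state:          # first marker seen by dst
            self.record(dst)
            self.rec_chan[(src, dst)] = []
        else:                                    # Received_ij - Recorded_received_ij
            already = len(self.rec_received[(src, dst)])
            self.rec_chan[(src, dst)] = self.received[(src, dst)][already:]

# Hypothetical topology and schedule.
s = System(["P1", "P2", "P3"], [("P1", "P2"), ("P1", "P3"), ("P3", "P2")])
s.send("P1", "P3", "m1")
s.send("P3", "P2", "m2")
s.record("P1")            # P1 initiates the state recording
s.deliver("P1", "P2")     # marker: P2 records its state before receiving m2
s.deliver("P3", "P2")     # m2 arrives at P2
s.deliver("P1", "P3")     # m1 arrives at P3
s.deliver("P1", "P3")     # marker: P3 records its state
s.deliver("P3", "P2")     # marker: P2 records m2 as the state of the channel
print(s.rec_chan)
```

In this run m2 is recorded as being in transit on the channel from P3 to P2: P3's recorded state shows it as sent, P2's recorded state does not show it as received, and the FIFO property guarantees it reaches P2 ahead of P3's marker.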
Slide 24: Example of the Chandy–Lamport algorithm
• P1 sends m1 to P3 at time 0, P3 sends m2 to P2 at time 1
• P1 decides to record state of the system at time 2
• (a), (b), (c) are states at 0, 2+, 5+
Slide 25: Recorded states of processes and channels
Slide 26: Properties of the recorded state
• The system may not have been in the recorded state at any time
– Example: A recorded state of the system on slide 24 that does not match any actual state of the system
Slide 27: Uses of the recorded state
• Q: What use is the recorded state if it does not match any of the actual states of the system?
– A stable property can be detected by using the algorithm repeatedly
* A stable property is one that remains true once it becomes true
* For example, cycles in a wait-for graph (WFG) or a resource request and allocation graph (RRAG)
– It may not be detected in a state recorded after it became true
– However, it would be detected in some future state recording
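As an illustration of repeated detection, the sketch below looks for a cycle in a wait-for graph taken from each of two successive recordings. The graphs are hypothetical, and the WFG is assumed to be a dict mapping each process to the processes it waits for.

```python
def has_cycle(wfg):
    """Depth-first search for a cycle in a wait-for graph."""
    visiting, done = set(), set()

    def visit(p):
        if p in done:
            return False
        if p in visiting:
            return True          # reached a process already on the current path
        visiting.add(p)
        found = any(visit(q) for q in wfg.get(p, ()))
        visiting.discard(p)
        done.add(p)
        return found

    return any(visit(p) for p in wfg)

# Two successive recordings: the cycle forms between them and then persists.
recordings = [
    {"P1": ["P2"], "P2": []},
    {"P1": ["P2"], "P2": ["P1"]},
]
detected = [has_cycle(wfg) for wfg in recordings]
print(detected)
```

The first recording shows no cycle and the second one does; because the property is stable, every later recording would show it as well.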