Chapter 11
Distributed Programming
A distributed system is a set of computers that are linked together by a network.
Distributed systems are ubiquitous in modern society. The canonical example of such a system, the Internet, has been growing exponentially ever since its inception in the late 1970's. The number of host computers that are part of it has been doubling each year since 1980. The question of how to program a distributed system is therefore of major importance.

This chapter shows one approach to programming a distributed system. For the rest of the chapter, we assume that each computer has an operating system that supports the concept of process and provides network communication. Programming a distributed system then means to write a program for each process such that all processes taken together implement the desired application. For the operating system, a process is a unit of concurrency. This means that if we abstract away from the fact that the application is spread over different processes, this is just a case of concurrent programming. Ideally, distributed programming would be just a kind of concurrent programming, and the techniques we have seen earlier in the book would still apply.
Distributed programming is complicated
Unfortunately, things are not so simple. Distributed programming is more complicated than concurrent programming for the following reasons:

• Each process has its own address space. Data cannot be transferred from one process to another without some translation.

• The network has limited performance. Typically, the basic network operations are many orders of magnitude slower than the basic operations inside one process. At the time of publication of this book, network transfer time is measured in milliseconds, whereas computational operations are done in nanoseconds or less. This enormous disparity is not projected to change for the foreseeable future.
• Some resources are localized. There are many resources that can only be used at one particular computer due to physical constraints. Localized resources are typically peripherals such as input/output (display screen, keyboard/mouse, file system, printer). They can be more subtle, such as a commercial application that can only be run on a particular computer because it is licensed there.

• The distributed system can fail partially. The system consists of many components that are only loosely connected. It might be that part of the network stops working or that some of the computers stop working.

• The distributed system is open. Independent users and computations cohabit the system. They share the system's resources and they may compete or collaborate. This gives problems of security (protection against malicious intent) and naming (finding one another).
How do we manage this complexity? Let us attempt to use the principle of separation of concerns. According to this principle, we can divide the problem into an ideal case and a series of non-ideal extensions. We give a solution for the ideal case and we show how to modify the solution to handle the extensions.
The network transparency approach
In the ideal case, the network is fast, resources can be used everywhere, all computers are up and running, and all users trust one another. In this case there is a solution to the complexity problem: network transparency. That is, we implement the language so that a program will run correctly independently of how it is partitioned across the distributed system. The language has a distributed implementation to guarantee this property. Each language entity is implemented by one or more distribution protocols, which are all carefully designed to respect the language semantics. For example, the language could provide the concept of an object. An object can be implemented as a stationary object, which means that it resides on one process and other processes can invoke it with exactly the same syntax as if it were local. The behavior will be different in the nonlocal case (there will be a round trip of network messages), but this difference is invisible from the programmer's point of view.

Another possible distribution protocol for an object is the cached object. In this protocol, any process invoking the object will first cause the object to become local to it. From then on, all invocations from that process will be local ones (until some other process causes the object to move away). The point is that both stationary and cached objects have exactly the same behavior from the language point of view.
With network transparency, programming a distributed system becomes simple. We can reuse all the techniques of concurrent programming we saw throughout the book. All the complexity is hidden inside the language implementation. This is a real complexity, but given the conditions of the ideal case, it can be realistically implemented. It provides all the distribution protocols. It translates data between the address spaces of the processes. Translating to serial form is called marshaling and translating back is called unmarshaling. The term serialization is also used. It does distributed garbage collection, i.e., not reclaiming a local entity if there is still some remote reference.
The idea of making a distributed language operation similar to a local language operation has a long history. The first implementation was the Remote Procedure Call (RPC), done in the early 1980's [18]. A call to a remote procedure behaves in the same way, under ideal conditions, as a local procedure. Recently, the idea has been extended to object-oriented programming by allowing methods to be invoked remotely. This is called Remote Method Invocation (RMI). This technique has been made popular by the Java programming language [186].
Beyond network transparency
Network transparency solves the problem in the ideal case. The next step is to handle the non-ideal extensions. Handling all of them at the same time while keeping things simple is a research problem that is still unsolved. In this chapter we only show the tip of the iceberg of how it could be done. We give a practical introduction to each of the following extensions:
• Network awareness (i.e., performance). We show how choosing the distribution protocol lets us tune performance without changing the correctness of the program.

• Openness. We show how independent computations can connect together. In this we are aided because Oz is a dynamically-typed language: all type information is part of the language entities. This makes connecting independent computations relatively easy.

• Localized resources. We show how to package a computation into a component that knows what localized resources it needs. Installing this component in a process should connect it to these resources automatically. We already have a way to express this, using the concept of functor. A functor has an import declaration that lists what modules it needs. If resources are visible as modules, then we can use this to solve the problem of linking to localized resources.

• Failure detection. We show how to detect partial failure in a way usable to the application program. The program can use this information to do fault confinement and possibly to repair the situation and continue working. While failure detection breaks transparency, doing it in the language lets us build abstractions that hide the faults, e.g., using redundancy to implement fault tolerance. These abstractions, if desired, could be used to regain transparency.
This brief introduction leaves out many issues such as security, naming, resource management, and building fault tolerance abstractions. But it gives a good overview of the general issues in the area of distributed programming.
Structure of the chapter
This chapter consists of the following parts:
• Sections 11.1 and 11.2 set the stage by giving a taxonomy of distributed systems and by explaining our distributed computation model.

• Sections 11.3–11.6 show how to program in this distribution model. We first show how to program with declarative data and then with state. We handle state separately because it involves more sophisticated and expensive distribution protocols. We then explain the concept of network awareness, which is important for performance reasons. Finally, we show some common distributed programming patterns.

• Section 11.7 explains the distributed protocols in more detail. It singles out two particularly interesting protocols, the mobile state protocol and the distributed binding protocol.

• Section 11.8 introduces partial failure. It explains and motivates the two failures we detect, permanent process failure and temporary network inactivity. It gives some simple programming techniques including an abstraction to create resilient server objects.

• Section 11.9 briefly discusses the issue of security and how it affects writing distributed applications.

• Section 11.10 summarizes the chapter by giving a methodology for how to build distributed applications.
11.1 Taxonomy of distributed systems
This chapter is mainly about a quite general kind of distributed system, the open collaborative system. The techniques we give can also be used for other kinds of distributed system, such as cluster computing. To explain why this is so, we give a taxonomy of distributed systems that situates the different models. Figure 11.1 shows four types of distributed system. For each type, there is a simple diagram to illustrate it. In these diagrams, circles are processors or computers, the rectangle is memory, and connecting lines are communication links (a network).

[Figure 11.1: A simple taxonomy of distributed systems. Starting from a shared-memory multiprocessor, adding distribution gives a distributed-memory multiprocessor (high performance, "cluster computing"); adding partial failure gives a distributed-memory multiprocessor with partial failure; adding openness gives an open distributed system with naming and security (collaboration, "Internet computing").]

The figure starts with a shared-memory multiprocessor, which is a computer that consists of several processors attached to a memory that is shared between all of them. Communication between processors is extremely fast; it suffices for one processor to write a memory cell and another to read it. Coordinating the processors, so that, e.g., they all agree to do the same operation at the same time, is efficient.
Small shared-memory multiprocessors with one to eight processors are commodity items. Larger scalable shared-memory cache-coherent multiprocessors are also available but are relatively expensive. A more popular solution is to connect a set of independent computers through their I/O channels. Another popular solution is to connect off-the-shelf computers with a high-speed network. The network can be implemented as a shared bus (similar to Ethernet) or be point-to-point (separately connecting pairs of processors). It can be custom or use standard LAN (local-area network) technology. All such machines are usually called clusters or distributed-memory multiprocessors. They can have partial failure, i.e., one processor can fail while the others continue. In the figure, a failed computer is a circle crossed with a large X. With appropriate hardware and software the cluster can keep running, albeit with degraded performance, even if some processors have failed. That is, the probability that the cluster continues to provide its service is close to 1 even if part of the cluster has failed. This property is called high availability. A cluster with the proper hardware and software combines high performance with high availability.
In the last step, the computers are connected through a wide-area network (WAN) such as the Internet. This adds openness, in which independent computations or computers can find each other, connect, and collaborate meaningfully. Openness is the crucial difference between the world of high-performance computing and the world of collaborative computing. In addition to partial failure, openness introduces two new issues: naming and security. Naming is how computations or computers find each other; it is usually supported by a special part of the system called the name server. Security is how computations or computers protect themselves from each other.

[Figure 11.2: The distributed computation model. Threads, ports, cells, and variables are localized to home processes (a, b, ..., n); values are not localized.]
11.2 The distribution model
We consider a computation model with both ports and cells, combining the models of Chapters 5 and 8. We refine this model to make the distribution model, which defines the network operations done for language entities when they are shared between Oz processes [71, 197, 72, 201, 73]. If distribution is disregarded (i.e., we do not care how the computation is spread over processes) and there are no failures, then the computation model of the language is the same as if it executes in one process.

We assume that any process can hold a reference to a language entity on any other process. Conceptually, there is a single global computation model that encompasses all running Mozart processes and Mozart data world-wide (even those programs that are not connected together!). The global store is the union of all the local stores. In the current implementation, connected Mozart processes primarily use TCP to communicate. To a first approximation, all data and messages sent between processes travel through TCP.
Figure 11.2 shows the computation model. To add distribution to this global view, the idea is that each language entity has a distribution behavior, which defines how distributed references to the entity interact. In the model, we annotate each language entity with a process, which is the "home process" of that entity. It is the process that coordinates the distribution behavior of the entity. Typically, it will be the process at which the entity was first created.1 We will sometimes use the phrase consistency protocol to describe the distribution behavior of an entity. The distribution behavior is implemented by exchanging messages between Mozart processes.

1. In Mozart, the coordination of an entity can be explicitly moved from one process to another. This issue will not be discussed in this introductory chapter.
What kinds of distribution behavior are important? To see this, we first distinguish between stateful, stateless, and single-assignment language entities. Each of them has a different distribution behavior:
• Stateful entities (threads, cells, ports, objects) have an internal state. The distribution behavior has to be careful to maintain a globally coherent view of the state. This puts major constraints on the kinds of efficient behavior that are possible. The simplest kind of behavior is to make them stationary. An operation on a stationary entity will traverse the network from the invoking process and be performed on the entity's home process. Other kinds of behavior are possible.

• Single-assignment entities (dataflow variables, streams) have one essential operation, namely binding. Binding a dataflow variable will bind all its distributed references to the same value. This operation is coordinated from the process on which the variable is created.

• Stateless entities, i.e., values (procedures, functions, records, classes, functors), do not need a process annotation because they are constants. They can be copied between processes.
Figure 11.3 shows a set of processes with localized threads, cells, and unbound dataflow variables. In the stateful concurrent model, the other entities can be defined in terms of these and procedure values. These basic entities have a default distributed behavior, but this behavior can be changed without changing the language semantics. For example, a remote operation on a cell could force the cell to migrate to the calling process, and thereafter perform the operation locally.

For all derived entities except ports, the distributed behavior of the defined entities can be seen as derived from the distributed behavior of their parts. In this respect ports are different: their default distribution behavior is asynchronous (see Section 5.1), and this behavior cannot be derived from the definition of ports in terms of cells. This means that ports are basic entities in the distribution model, just like cells.
The model of this section is sufficient to express useful distributed programs, but it has one limitation: partial failures are not taken into account. In Section 11.8 we will extend the basic model to overcome this limitation.

Depending on the application's needs, entities may be given different distributed behaviors. For example, "mobile" objects (also known as "cached" objects) move to the process that is using them. These objects have the same language semantics but a different distributed behavior. This is important for tuning network performance.
[Figure 11.3: Process-oriented view of the distribution model. Each process has its own threads, together with localized cells (e.g., c1:X) and dataflow variables (e.g., W, Y).]
11.3 Distribution of declarative data
Let us show how to program with the distribution model. In this section we show how distribution works for the declarative subset of the stateful concurrent model. We start by explaining how to get different processes to talk to each other.

We say a distributed computation is open if a process can connect independently with other processes running a distributed computation at run time, without necessarily knowing beforehand which process it may connect with nor the type of information it may exchange. A distributed computation is closed if it is arranged so that a single process starts and then spawns other processes on various computers it has access to. We will talk about closed distribution later.
An important issue in open distributed computing is naming. How do independent computations avoid confusion when communicating with each other? They do so by using globally-unique names for things. For example, instead of using print representations (character strings) to name procedures, ports, or objects, we use globally-unique names instead. The uniqueness should be guaranteed by the system. There are many possible ways to name entities:

• References. A reference is an unforgeable means to access any language entity. To programs, a reference is transparent, i.e., it is dereferenced when needed to access the entity. References can be local, to an entity on the current process, or remote, to an entity on a remote process. For example, a thread can reference an entity that is localized on another process. The language does not distinguish local from remote references.

• Names. A name is an unforgeable constant that is used to implement abstract data types. Names can be used for different kinds of identity and authentication abilities (see Sections 3.7.5 and 6.4). All language entities with token equality, e.g., objects, classes, procedures, functors, etc., implement their identity by means of a name embedded inside them (see Chapter 13).
• Tickets. A ticket, in the terminology of this chapter, is a global means to access any language entity. A ticket is similar to a reference, except that it is valid anywhere, including outside a Mozart process. It is represented by an ASCII string, it is explicitly created and dereferenced, and it is forgeable. A computation can get a reference to an independent computation by getting a ticket from that computation. The ticket is communicated using any communication protocol between the processes (e.g., TCP, IP, SMTP, etc.) or between the users of these processes (e.g., sneakernet, telephone, PostIt notes, etc.). Usually, these protocols can only pass simple datatypes, not arbitrary language references. But in almost all cases they support passing information coded in ASCII form. If they do, then they can pass a ticket.

• URLs (Uniform Resource Locators). A URL is a global reference to a file. The file must be accessible by a World Wide Web server. A URL encodes the hostname of a machine that has a Web server and a file name on that machine. URLs are used to exchange persistent information between processes. A common technique is to store a ticket in a file addressed by a URL.
Within a distributed computation, all four of these kinds of names can be passed between processes. References and names are pure names, i.e., they do not explicitly encode any information other than being unique. They can be used only inside a distributed computation. Tickets and URLs are impure names, since they explicitly encode the information needed to dereference them: they are ASCII strings and can be read as such. Since they are encoded in ASCII, they can be used both inside and outside a distributed computation. In our case we will connect different processes together using tickets.
Tickets are created and used with the Connection module. This module has three basic operations:

• {Connection.offer X} creates a ticket for the reference X. The ticket can be taken just once. Attempting to take a ticket more than once will raise an exception.

• {Connection.offerUnlimited X} creates a ticket for the reference X. The ticket can be taken any number of times.

• {Connection.take T} takes the ticket T and returns a reference X. The X refers to exactly the same language entity as the original reference that was offered when the ticket was created. A ticket can be taken at any process. If taken at a different process than where the ticket was offered, then network communication is initiated between the two processes.
From the programmer's point of view, using tickets is extremely simple. The system does a great deal of work to give this simple view. It implements the connection protocol, transparent marshaling and unmarshaling, distributed garbage collection, and a carefully-designed distribution protocol for each language entity.

Sharing records

As a first example, assume the first process has a record that it wants to make globally accessible:
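(The code of this first example is not preserved in this extraction; the following is a minimal sketch whose record contents are illustrative.)

declare
X=the_novel(text:"It was a dark and stormy night. ..."
            author:"E.G.E. Bulwer-Lytton"
            year:1803)
{Show {Connection.offerUnlimited X}}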
This example creates the ticket with Connection.offerUnlimited and displays it in the Mozart emulator window (with Show). Any other process that wants to get a reference to X just has to know the ticket. Here is what the other process does:
X2={Connection.take 'ticket comes here'}
(To make this work, you have to replace the text 'ticket comes here' by what was displayed by the first process.) That's it. The operation Connection.take takes the ticket and returns a language reference, which we put in X2. Because of network transparency, both X and X2 behave identically.
Sharing functions
This works for other data types as well. Assume the first process has a function instead of a record:
fun {MyEncoder X} (X*4449+1234) mod 33667 end
{Show {Connection.offerUnlimited MyEncoder}}
The second process can get the function easily:
E2={Connection.take 'MyEncoders ticket'}
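(A usage note, assumed rather than taken from the original: calling the taken function behaves exactly like calling MyEncoder locally.)

{Show {E2 10000}}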
2. Here, as in the subsequent examples, we leave out declare for brevity, but we keep declare ... in for clarity.
Sharing dataflow variables
In addition to records and functions, tickets can also be used to pass unbound variables. Any operation that needs the value will wait, even across the network. This is how we do distributed synchronization [71]. The first process creates the variable and makes it globally accessible:
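(The code of this example is not preserved in this extraction; here is a minimal sketch. The first process offers the unbound variable:)

declare X in
{Show {Connection.offerUnlimited X}}

The second process takes the ticket and immediately uses the variable, e.g., in a multiplication:

X2={Connection.take 'ticket comes here'}
{Browse X2*2}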
The multiplication blocks until X is bound. Try doing X=111 in the first process. The binding will become visible on all processes that reference the variable.
In the above examples we copied and pasted tickets between windows of the interactive environment. A better way to distribute tickets is to store them in a file. To successfully connect with the ticket, the destination process just has to read the file. This not only makes the distribution easier, it also can distribute over a larger part of the network. There are two basic ways:
• Local distribution. This uses the local file system. The destination process has to be connected to the same file system as the source process. This works well on a LAN where all machines have access to the same file system.

• Global distribution. This uses the global Web infrastructure. The file can be put in a directory that is published by a Web server, e.g., in a ~/public_html directory on Unix. The file can then be accessed by URL.
Using a URL to make connections is a general approach that works well for collaborative applications on the Internet. To implement these techniques we need an operation to store and load a ticket from a file or URL. This is already provided by the Pickle module as described in Chapter 2. Any stateless value can be stored in a file and retrieved by any other process that has read access to the file. The two basic operations are:

• {Pickle.save X FN} saves the value X in the file FN.

• X={Pickle.load FNURL} loads into X the value stored at FNURL, which can be a file name or a URL.
Pickle can store any stateless entity. For example, it can be used to store records, procedures, classes, and even functors, all of which are pure values. An attempt to save stateful data in a pickle will raise an exception. An attempt to save a partial value in a pickle will block until the partial value is complete. The following code:

{Pickle.save MyEncoder '~/public_html/encoder'}

saves the function MyEncoder in a file. Files in the ~/public_html directory are often publicly accessible by means of URLs. Anyone who needs the MyEncoder function can just load the file by giving the right URL:
MyEnc={Pickle.load 'http://www.info.ucl.ac.be/~pvr/encoder'}
The loaded function MyEnc behaves exactly like the original MyEncoder; there is no way to distinguish them. The ability to store, transfer, and then execute procedure values across a network is the essential property of what nowadays is known as "applets". A value saved in a pickle continues to be valid even if the process that did the save is no longer running. The pickle itself is a file that contains complete information about the value. This file can be copied and transferred at will, and it will continue to give the same value when loaded with Pickle.load.
The main limitation of pickles is that only values can be saved. One way to get around this is to make a snapshot of stateful information, i.e., make a stateless data structure that contains the instantaneous states of all relevant stateful entities. This is more complex than pickling since the stateful entities must be locked and unlocked, and situations such as deadlock must be avoided. However, we do not need such a complete solution in this case. There is a simple technique for getting around this limitation that works for any language entity, as long as the process that did the save is still running. The idea is to store a ticket in a pickle. This works since tickets are strings, which are values. This is a useful technique for making any language entity accessible worldwide; the URL is the entry point for the entity addressed by the ticket.

Let us define two convenience operations Offer and Take that implement this technique. These operations are available in the module Distribution, which can be found on the book's Web site. The procedure Offer makes language entity X available through file FN:
proc {Offer X FN}
{Pickle.save {Connection.offerUnlimited X} FN}
end
The function Take gets a reference to the language entity by giving the file name or URL FNURL:
fun {Take FNURL}
{Connection.take {Pickle.load FNURL}}
end
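(A brief usage sketch, assumed rather than taken from the original; the file name is illustrative.)

{Offer MyEncoder "encoder.tick"}   % in the first process
E={Take "encoder.tick"}            % in another process with read access to the file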
The Take function uses Pickle.load, which can load any stateless data from a file. The argument FNURL can be either a file name or a URL.
Stream communication

Declarative programming with streams, as in Chapter 4, can be made distributed simply by starting the producer and consumer in different processes. They only have to share a reference to the stream.
Eager stream communication
Let us first see how this works with an eager stream. First create the consumer in one process and create a ticket for its stream:
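(The consumer's code is not preserved in this extraction; here is a minimal sketch in which the consumer simply displays each element that arrives.)

declare Xs in
{Offer Xs tickfile}
for X in Xs do {Show X} end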
Then create the producer in another process. It takes the ticket to get a reference to Xs and then creates the stream:
declare Xs Generate in
Xs={Take tickfile}
fun {Generate N Limit}
if N<Limit then N|{Generate N+1 Limit} else nil end
end
Xs={Generate 0 150000}
This creates the stream 0|1|2|3|... and binds it to Xs. This sends the stream across the network from the producer process to the consumer process. This is efficient since stream elements are sent asynchronously across the network. Because of thread scheduling, the stream is created in "batches" and each batch is sent across the network in one message.
Lazy stream communication
We can run the same example with lazy stream communication. Take the examples with programmed triggers (Section 4.3.3) and implicit triggers (Section 4.5) and run them in different processes, as we showed above with eager streams.
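Here is a minimal sketch of the implicit-trigger variant (our code, not the original's; it assumes the same tickfile convention). The producer declares the stream with a lazy function, so elements are computed only when the consumer demands them, and the demand travels across the network:

declare Xs Generate in
fun lazy {Generate N} N|{Generate N+1} end
Xs={Generate 0}
{Offer Xs tickfile}

The consumer takes the stream and touches elements to demand them:

declare Xs in
Xs={Take tickfile}
{Show Xs.1}   % demanding the first element triggers the remote producer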
Message passing and ports
A port is a basic data type; it is a FIFO channel with an asynchronous send operation. Asynchronous means that the send operation completes immediately, without waiting for the network. FIFO means that successive sends in the same thread will appear in the same order in the channel's stream. Ports generalize streams by allowing many-to-one communication. This is the difference between the declarative concurrent model and the message-passing model. Let us create a port, make it globally accessible, and display the stream contents locally:
declare S P in
{NewPort S P}
{Offer P tickfile}
for X in S do {Browse X} end
The for loop causes dataflow synchronization to take place on elements appearing on the stream S. Each time a new element appears, an iteration of the loop is done. Now we can let a second process send to the port:
P={Take tickfile}
{Send P hello}
{Send P 'keep in touch'}
Since the Send operation is asynchronous, it sends just a single message on the network.
11.4 Distribution of state
Stateful entities are more expensive to distribute than stateless entities in a network-transparent system. This is because changes in state have to be made visible to all processes that use the entity.
Sharing cells
The simplest way to share state between processes is by sharing a cell. This can be done in exactly the same way as for the other types (a sketch is shown below). Knowing what the network operations are is important for efficiency (e.g., network hops). We will see later how the global coherence is maintained and how the programmer can control what the network operations are.
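Here is a minimal sketch (our code, not from the original): the first process creates and offers the cell,

declare C in
C={NewCell 0}
{Offer C tickfile}

and a second process takes it and updates the content:

C2={Take tickfile}
C2:=@C2+1

Both processes now see the new content; the system keeps the cell's state globally coherent.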
Now that we know the distributed behavior of cells, let us use it to implement a well-known distributed locking algorithm, also known as distributed mutual exclusion using token passing.3 When locking a critical section, multiple requests should all correctly block and be queued, independent of whether the threads are on the same process or on another process. We show how to implement this concisely and efficiently in the language. Figure 11.4, taken from Section 8.3, shows one way to implement a lock that handles exceptions correctly;4 a sketch is given below. If multiple threads attempt to access the lock body, then only one is given access, and the others are queued. The queue is a sequence of dataflow variables. Each thread suspends on one variable in the sequence, and will bind the next variable after it has executed the lock body. Each thread desiring the lock therefore references two variables: one to wait for the lock and one to pass the lock to the next thread. Each variable is referenced by two threads.
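Figure 11.4 itself is not reproduced in this extraction; the following sketch is reconstructed from the description above and the design of Section 8.3, so the exact code may differ from the figure:

fun {NewLock}
   Token={NewCell unit}
in
   proc {$ P}
      Old New in
      {Exchange Token Old New}       % enter the queue: grab the previous thread's variable
      {Wait Old}                     % suspend until the previous thread releases the lock
      try {P} finally New=unit end   % run the body; release even if P raises an exception
   end
end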
When the threads are on different processes, the definition of Figure 11.4 implements distributed token passing, a well-known distributed locking algorithm [33]. We explain how it works. When a thread tries to enter the lock body, the Exchange gives it access to the previous thread's New variable. The previous thread's process is New's owner. When the previous thread binds New, the owner sends the binding to the next thread's process. This requires a single message.

3. The built-in distributed locks of Mozart use this algorithm.
4. For simplicity, we leave out reentrancy since it involves only local execution.
Sharing objects and other data types
A more sophisticated way to share state is to share objects. In this way, we encapsulate the shared state and control what the possible operations on it are. Here is an example:
class Coder
   attr seed
   meth init(S) seed:=S end
   meth get(X)
      X=@seed
      seed:=(@seed*1234+4449) mod 33667
   end
end
C={New Coder init(100)}
{Offer C tickfile}
This defines the class Coder and an object C. Any process that takes the object's ticket will reference it. The Mozart system guarantees that the object will behave exactly like a centralized object. For example, if the object raises an exception, then the exception will be raised in the thread calling the object.
One of the important properties of network transparency is distributed lexical scoping: a procedure value that references a language entity will continue to reference that entity, independent of where the procedure value is transferred across the network. This causes remote references to be created implicitly, by the simple act of copying the procedure value from one process to another. For example:
declare
C={NewCell 0}
fun {Inc X} X+@C end
{Offer C tickfile1}
{Offer Inc tickfile2}
Inc will always reference C, no matter from which process it is called. A third process can take C's ticket and change the content. This will change the behavior of Inc. The following scenario can happen: (1) process 1 defines C and Inc, (2) process 2 gets a reference to Inc and calls it, and (3) process 3 gets a reference to C and changes its content. When process 2 calls Inc again it will use the new content of C. Semantically, this behavior is nothing special: it is a consequence of using procedure values with network transparency. But how is it implemented? In particular, what network operations are done to guarantee it? We would like the network behavior to be simple and predictable. Fortunately, we can design the system so that this is indeed the case, as Section 11.7 explains.
Distributed lexical scoping, like lexical scoping, is important for reasons of software engineering. Entities that are moved or copied between processes will continue to behave according to their specification, and not inadvertently change because some local entity happens to have the same name as one of their external references. The importance of distributed lexical scoping was first recognized by Luca Cardelli and realized in Obliq, a simple functional language with object-based extensions that was designed for experimenting with network-transparent distribution [28].
11.5 Network awareness
With these examples it should be clear how network transparency works. Before going further, let us give some insight into the distributed behavior of the various language entities. The distributed implementation of Mozart does many different things. It uses a wide variety of distributed algorithms to provide the illusion of network transparency (e.g., see Section 11.7). At this point, you may be getting uneasy about all this activity going on behind the scenes. Can it be understood, and controlled if need be, to do exactly what we want it to do? There are in fact two related questions:

• What network operations does the system do to implement transparency?

• Are the network operations simple and predictable, i.e., is it possible to build applications that communicate in predictable ways?

As we will see, the network operations are both few and predictable; in most cases they are exactly what would be achieved by explicit message passing. This property of the distribution subsystem is called network awareness.
We now give a quick summary of the network operations done by the most-used language entities; later on, in Section 11.7, we will define the distributed algorithms that implement them in more detail.5 The basic idea is the following: stateless entities are copied, bindings of dataflow variables are multicast, ports are stationary with FIFO sends, and other stateful entities use a consistency protocol. Here is a more detailed description:
• Numbers, records, and literals are stateless entities with structure equality. That is, separately-created copies cannot be distinguished. The values are copied immediately whenever a source process sends a message to a target process. This takes one network hop. Many copies can exist on the target process, one for each message sent by the source process.

• Procedures, functions, classes, and functors are stateless entities with token equality. That is, each entity when created comes with a globally unique name. The default protocol needs one message, possibly followed by one additional round trip. In the first message, just the name is sent. If this name already exists in the target process, then the value is already present and the protocol is finished. If not, then the value is immediately requested with a round trip. This means that at most one copy of these entities can exist on each process.

5. The distribution behavior we give here is the default behavior. It is possible to change this default behavior by explicit commands to the Mozart implementation; we will not address this issue in this chapter.
• Dataflow variables. When binding the variable, one message is sent to the process coordinating the variable protocol. This process then multicasts one message to each process that references the variable. This means that the coordinating process implicitly knows all processes that reference the variable.

• Objects, cells, and locks. Objects are a particular case of distributed state. There are many ways to implement distributed state in a network-transparent way. This chapter singles out three protocols in particular: mobile cached objects (the default), stationary objects, and asynchronous objects. Each protocol gives good network behavior for a particular pattern of use. The Coder example we gave before defines a mobile cached object. A mobile cached object is moved to each process that uses it. This requires a maximum of three messages for each object move. Once the object has moved to a process, no further messages are needed for invocations in that process. Later on, we will redo the Coder example with a stationary object (which behaves like a server) and an asynchronous object (which allows message streaming).

• Streams. A stream is a list whose tail is an unbound variable. Sending on a stream means adding elements by binding the tail. Receiving means reading the stream's content. Sending on a stream will send stream elements asynchronously from a producer process to a consumer process. Stream elements are batched when possible for best network performance.

• Ports. Sending to a port is both asynchronous and FIFO. Each element sent causes one message to be sent to the port's home process. This kind of send is not possible with RMI, but it is important to have: in many cases, one can send things without having to wait for a result.
It is clear that the distributed behavior of these entities is both simple and well-defined. To a first approximation, we recommend that a developer just ignore it and assume that the system is essentially as efficient as a human programmer doing explicit message passing. There are no hidden inefficiencies.
11.6 Common distributed programming patterns
In the Coder example given above, the object is mobile (i.e., cached). This gives good performance when the object is shared between processes, e.g., in a collaborative application. The object behaves like a cache. On the other hand, it is often useful to have an object that does not move, i.e., a stationary object. For example, the object might depend on some external resources that are localized to a particular process. Putting the object on that process can give orders of magnitude better performance than putting it elsewhere. A stationary object is a good way to define a server, since servers often use localized resources.
Whether an object is mobile or stationary is defined independently of the object's class. It is defined when the object is created. Using New creates a mobile cached object and using NewStat creates a stationary object. The system supports NewStat in the implementation as a basic primitive, e.g., all objects that are system resources (such as file descriptor objects) are stationary. NewStat has actually already been defined, in Chapter 7, when we introduced active objects. We give the definition again, because we will use it to understand the distribution behavior:
fun {NewStat Class Init}
   P Obj={New Class Init} in
   thread S in
      {NewPort S P}
      for M#X in S do
         try {Obj M} X=normal
         catch E then X=exception(E) end
      end
   end
   proc {$ M}
      X in
      {Send P M#X}
      case X of normal then skip
      [] exception(E) then raise E end end
   end
end
Let us see how the distributed behavior of NewStat is derived from the default behavior of its component parts. Let us make a stationary version of the Coder object:
C={NewStat Coder init(100)}
{Offer C tickfile}
This creates a thread and a port situated on the home process. C is a reference to a one-argument procedure. Now assume another process gets a reference to C:
C2={Take tickfile}
This will transfer the procedure to the second process. That is, the second process now has a copy of the procedure C, which it references by C2. Now let the second process call C2:
local A in
{C2 get(A)} {Browse A}
end
This creates a dataflow variable A and then calls C2 locally. This does a port send {Send P get(A)#X}. The references to A and X are transferred to the first process. At the home process, {Obj get(A)} is executed, where Obj is the actual Coder object. If this execution is successful, then A is bound to a result and X is bound to normal. If the execution had raised an exception E, then it would have been transferred back as the tuple exception(E).
What we have described is a general algorithm for remote method invocation of a stationary object. For example, the object on the home process, while serving a request, could call another object at a third process, and so on. Exceptions will be passed correctly.
We can also see that remote calls to stationary objects are more complex than calls to mobile cached objects! Mobile objects are simpler, because once the object arrives at the caller process, the subsequent execution will be local (including any raised exceptions).
How many network hops does it take to call a remote stationary object? Once the caller has a reference to the object, it takes two hops: one to send the tuple M#X and one to send the results back. We can see the difference in performance between mobile and stationary objects. Do the following in a second process, preferably on another machine:
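(The measurement code is not preserved in this extraction; the following is a minimal sketch. It takes the stationary Coder and calls it in a loop; each iteration costs a network round trip. With the mobile version of C, the same loop is fast, since after the first call the object is local.)

C2={Take tickfile}
for I in 1..10000 do
   local A in {C2 get(A)} {Wait A} end
end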
We have seen how to share an object among several processes. Because of network transparency, these objects are synchronous. That is, each object call waits until the method completes before continuing. Both mobile and stationary objects are synchronous. Calling a stationary object requires two messages: first from the caller to the object, and then from the object back to the caller. If the network is slow, this can take long.

One way to get around the network delay is to do an asynchronous object call: first do the send without waiting for the result, then wait for the result later when it is needed. This technique works very well together with dataflow variables. The result of the send is an unbound variable that is automatically bound as soon as its value is known. Because of dataflow, the variable can be passed around, stored in data structures, used in calculations, etc., even before its value is known. Let us see how this works with the Coder object. First, create an asynchronous object:
C={NewActive Coder init(100)}
{Offer C tickfile}
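In another process, a call returns immediately and its result variable is bound when the reply arrives (a minimal usage sketch, not from the original, using the same tickfile convention):

C2={Take tickfile}
local A in
   {C2 get(A)}    % asynchronous send: returns immediately
   {Browse A+1}   % dataflow: blocks only here, when the value is needed
end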