

Ninf-G: a GridRPC system on the Globus Toolkit

Hidemoto Nakada,1,2 Yoshio Tanaka,1 Satoshi Matsuoka,2 and Satoshi Sekiguchi1

1 National Institute of Advanced Industrial Science and Technology (AIST), Grid Technology Research Center, Tsukuba, Ibaraki, Japan
2 Tokyo Institute of Technology, Global Scientific Information and Computing Center, Tokyo, Japan

25.1 INTRODUCTION

Recent developments in high-speed networking enable the collective use of globally distributed computing resources as a single, huge problem-solving environment, also known as the Grid.

The Grid not only presents a new, more difficult degree of inherent challenges in distributed computing, such as heterogeneity, security, and instability, but also requires the constituent software substrates to be seamlessly interoperable across the network.

As such, software layers are constructed so that higher-level middleware sits on some common, lower-level software layer, just as with single-box computers, where most applications running on top share a common operating system and standard libraries.

Grid Computing – Making the Global Infrastructure a Reality. Edited by F. Berman, A. Hey and G. Fox.
© 2003 John Wiley & Sons, Ltd. ISBN: 0-470-85319-0


Currently, the Globus Toolkit [1] serves as the ‘de facto standard’ lower-level substrate for the Grid. Globus provides important Grid features such as authentication, authorization, secure communication, directory services, and so on, which the software above can utilize so as not to replicate the lower-level programming effort, as well as to provide interoperability between the middleware mentioned above. However, the Globus Toolkit alone is insufficient for programming on the Grid. The abstraction level of Globus, being a lower-level substrate, is terse and primitive in a sense; this is fine for the intended use of the Toolkit, but higher-level programming layers are nevertheless absolutely necessary for most users. This is analogous to the situation in, say, parallel programming on a Linux cluster: one would certainly program with higher-level parallel programming systems such as MPI and OpenMP, rather than with the raw TCP/IP socket interface of the operating system.

Over the years there have been several active research programs in programming middleware for the Grid environment, including MPICH-G [2], Nimrod/G [3], Ninf [4], and NetSolve [5]. MPICH-G and MPICH-G2 are MPI systems achieving security and cross-organization MPI interoperability using the Globus Toolkit, although as a programming model for the Grid their utility is somewhat less significant compared to tightly coupled environments. Nimrod/G is a high-throughput computing system implemented on the Globus Toolkit. It automatically invokes existing applications on many computing resources based on deadline scheduling. While the system is quite useful for some applications, its application area is largely limited to parameter-sweep applications and not general task-parallel programming.

Ninf and NetSolve are implementations of the GridRPC [6] programming model, providing a simple yet powerful server-client framework for programming on the Grid. GridRPC offers an easy-to-use set of APIs, allowing easy construction of globally distributed parallel applications without a steep learning curve. Both systems have seen successful use in various Grid application projects.

On the other hand, while Ninf and NetSolve are mutually interoperable, they do not interoperate well with other Grid tools on Globus, such as GridFTP. The reason is that, because the first versions of both Ninf and NetSolve were developed essentially at the same time as Globus, both systems were built rather independently, without fully utilizing the Globus features or taking their mutual interoperability into account.

To resolve the situation, we redesigned the whole GridRPC system in collaboration with the NetSolve team at the University of Tennessee, Knoxville. The redesign has been extensive, covering the software architecture so as to fully utilize the Globus features, user-level API specifications that retain the simplicity but generalize call contexts for more flexible Grid-level programming, various changes in the protocols, and so on. The redesign, especially the API specification, has been done carefully so that each group can independently produce its own implementation of GridRPC.

Our implementation is Ninf-G, which is in effect a full reimplementation of Ninf on top of Globus. Compared to older implementations, Ninf-G fully utilizes and effectively sits on top of Globus. The result is that applications constructed with Ninf-G can take advantage of any modules implemented for the Globus Toolkit, such as special job managers or GridFTP servers.


In the remainder of this chapter, we discuss the implementation of the Ninf-G system and demonstrate its usage with a simple parallel programming example, along with its performance. In Section 25.2 we briefly introduce the Globus Toolkit; in Section 25.3 we discuss the overall design of GridRPC and how each feature can be mapped onto lower-level Globus Toolkit features. Section 25.4 outlines the implementation of Ninf-G, while Sections 25.5 and 25.6 illustrate a typical usage scenario along with its performance evaluation. Section 25.7 concludes and hints at future directions.

25.2 GLOBUS TOOLKIT

The Globus Toolkit is a collection of modules that provides standardized lower-level features for implementing a distributed system on the Grid. Table 25.1 covers the services provided by Globus, each of which can be used independently when needed. We give brief descriptions of the modules most relevant to GridRPC.

GSI: Grid Security Infrastructure (GSI) serves as the common authentication facility underlying all the features of Globus. It enables single sign-on using certificate delegation based on PKI.

GRAM: The Globus Resource Allocation Manager (GRAM) is a ‘secure inetd’ that authenticates clients using GSI-based certificates, maps them to local user accounts, and invokes executable files.

MDS: The Monitoring and Discovery Service (MDS) is a directory service that provides resource information within the Grid. MDS consists of two layers of Lightweight Directory Access Protocol (LDAP) servers: the Grid Index Information Service (GIIS), which manages project-wide information, and the Grid Resource Information Service (GRIS), which is responsible for site-local information.

Globus-I/O: Globus-I/O enables secure communication between Grid peers using GSI, providing standard read/write APIs plus nonblocking I/O that integrates with the Globus Threads library.

GASS: Global Access to Secondary Storage (GASS) provides an easy-to-use file transfer facility across Grid peers. GASS can be used in conjunction with GRAM to stage client-side files.

Table 25.1 Globus services

Information infrastructure    MDS/GRIS/GIIS    Information service


25.3 DESIGN OF NINF-G

25.3.1 GridRPC system

GridRPC is a programming model based on client-server remote procedure call (RPC), with features added to allow easy programming and maintenance of code for scientific applications on the Grid. Application programmers write task-parallel client programs using simple and intuitive GridRPC APIs that hide most of the complexities of Grid programming. As a result, programmers lacking experience in parallel programming, let alone the Grid, can still construct Grid applications effortlessly.

At the server side, computation routines are encapsulated into an executable component called the Remote Executable. The client program basically invokes Remote Executables to request computation to be done on the server. Asynchronous parallel execution of multiple Remote Executables on different servers results in simple fork-join parallel execution, while nested GridRPC invocations with complex synchronizations, resulting in generalized task-parallel computations, are also possible if the programmers have control over the Remote Executables on all the constituent nodes.
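The fork-join structure described above can be pictured, by analogy, with ordinary threads standing in for Remote Executables. The sketch below is not Ninf-G code; it only illustrates the asynchronous invoke/wait-all pattern (the worker function and its squaring computation are invented for the example):

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4

/* Stand-in for a Remote Executable: squares its input slot
 * ("computation done on the server"). */
static void *worker(void *arg) {
    long *slot = (long *)arg;
    *slot = (*slot) * (*slot);
    return NULL;
}

/* Fork-join: launch one worker per slot (cf. grpc_call_async),
 * then join them all (cf. grpc_wait_all). n must be <= NUM_WORKERS. */
void run_forkjoin(long *data, int n) {
    pthread_t tids[NUM_WORKERS];
    for (int i = 0; i < n; i++)
        pthread_create(&tids[i], NULL, worker, &data[i]);
    for (int i = 0; i < n; i++)
        pthread_join(tids[i], NULL);
}
```

In GridRPC the "join" happens across machines and organizations rather than threads, but the client-side control flow has the same shape.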

A GridRPC system generally consists of the following four components:

Client component: Client components are programs that issue requests for GridRPC invocations. Each consists of the user's main program and the GridRPC library.

Server component: The server component invokes Remote Executables as described below.

Remote Executable: Remote Executables perform the actual computation at the server. Each consists of a user-supplied server-side compute routine, a system-generated stub main program, and a system-supplied communication library.

Information service: The information service provides the various information needed for the client component to invoke and communicate with the Remote Executable component.

25.3.2 Ninf-G implementation on the Globus toolkit

As mentioned earlier, in contrast to previous incarnations of Ninf, in which the Globus interface was added as an afterthought, Ninf-G integrates Globus components directly to implement the GridRPC features. More specifically, Ninf-G employs the following components from the Globus Toolkit, as shown in Figure 25.1:

GRAM: Serves the role of the server in the old Ninf system.
MDS: Publishes interface information and pathnames of GridRPC components.
Globus-I/O: The client and remote executable communicate with each other using Globus-I/O.
GASS: Redirects stdout and stderr of the GridRPC component to the client console.

25.3.3 API of Ninf-G

Ninf-G has two categories of APIs. One is the Ninf-compatible ‘legacy’ API, and the other is the new (low-level) GridRPC API that has been subjected to collaborative standardization activities at the Global Grid Forum with the NetSolve team from UTK. Ninf-G serves as one of the reference implementations of the GridRPC API. The principal functions of the GridRPC API are shown in Table 25.2.

Figure 25.1 Overview of Ninf-G. (On the server, the IDL compiler generates the remote library executable and an LDIF file of interface information from the IDL file and the numerical library; the client (1) sends an interface request to the MDS, (2) receives the interface reply, (3) invokes the executable via GRAM, which forks the remote library executable, and (4) the executable connects back using Globus I/O.)

Table 25.2 GridRPC API principal functions

int grpc_function_handle_init(
        grpc_function_handle_t *handle,
        char *host_name,
        int port,
        char *func_name);
    Initializes the function handle using the provided information.

int grpc_call(grpc_function_handle_t *, ...);
    Performs a blocking RPC with the function handle and arguments. This call blocks until the RPC completes.

int grpc_call_async(grpc_function_handle_t *, ...);
    Performs a nonblocking (asynchronous) RPC with the function handle and arguments. Returns a session ID as a future reference to the session.

int grpc_wait(int sessionID);
    Waits for completion of the session specified by sessionID.

int grpc_wait_any(int *idPtr);
    Waits for completion of any one of the RPCs previously invoked by grpc_call_async.

25.3.4 Server side IDL

In order to ‘gridify’ a library, the Ninf library provider describes the interface of the library function using the Ninf IDL in order to publish the library function; the interface is manifested and handled only at the server side. The Ninf IDL supports datatypes mainly tailored for serving numerical applications: for example, the basic datatypes are largely scalars and their multidimensional arrays. On the other hand, there are special provisions, such as support for expressions involving input arguments to compute array sizes, designation of temporary array arguments that need to be allocated on the server side but not transferred, and so on.

Module sample;
Define mmul(IN int N,
            IN double A[N*N],
            IN double B[N*N],
            OUT double C[N*N])
Required "mmul_lib.o"
Calls "C" mmul(N, A, B, C);

Figure 25.2 An example of a Ninf IDL file.

This allows direct ‘gridifying’ of existing libraries that assume array arguments to be passed by call-by-reference (thus requiring shared-memory support across nodes via the software), supplementing the information lacking in the C- and Fortran-type systems regarding array sizes, array stride usage, array sections, and so on.

As an example, the interface description for matrix multiply is shown in Figure 25.2, in which the access specifiers IN and OUT specify whether the argument is read or written within the gridified library. Other IN arguments can specify array sizes, strides, and so on with size expressions. In this example, the value of N is referenced to calculate the size of the array arguments A, B, and C. In addition to the interface definition of the library function, the IDL description contains the information needed to compile and link the necessary libraries. Ninf-G tools allow the IDL files to be compiled into stub main routines and makefiles, which automates the compilation, linkage, and registration of gridified executables.
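For reference, a server-side routine matching the mmul signature declared in the IDL might look like the plain-C function below. This is a sketch of one plausible implementation; the chapter does not show the contents of the actual library linked via "mmul_lib.o":

```c
/* Matrix multiply matching the IDL of Figure 25.2: C = A * B,
 * all matrices N x N, stored row-major in arrays of length N*N
 * and passed by reference, as the IDL assumes. */
void mmul(int n, double *a, double *b, double *c) {
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

Note that the routine itself knows nothing about the Grid; the generated stub main program handles argument marshalling, using the IDL size expression N*N to determine how much data to transfer.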

25.4 NINF-G IMPLEMENTATION

We now describe the implementation of Ninf-G in more detail.

25.4.1 ‘Gridifying’ a library or an application using GridRPC

Using Ninf-G, a user merely needs to take a few simple steps, in the following manner, to make an application ‘gridified’ on the server side (we note again that no IDL handling is necessary on the client side, as opposed to traditional RPC systems such as CORBA):

1. Describe the interface of a library function or an application with the Ninf IDL.
2. Process the Ninf IDL description file, generating a stub main routine for the remote executable and a makefile as described above.
3. Link the stub main routine with the remote library, obtaining the ‘gridified’ executable.


4. Register the ‘gridified’ executable into the MDS. (Steps 3 and 4 are automated by the makefile.)

To register information into the MDS, the program that acts as the information provider outputs data complying with the LDAP Data Interchange Format (LDIF); moreover, the program itself is registered within the MDS setup file. Ninf-G places all such relevant files under a designated directory and registers a filter program that performs the appropriate filtering as described below. For this purpose, Ninf-G adds the lines shown in Figure 25.3 to the ‘grid-info-resource-ldif.conf’ file.

25.4.1.1 Generating the LDIF file

The Ninf-G IDL compiler also generates interface-information source files, which are used upon ‘make’ to automatically generate interface-information files in XML as well as in LDIF format. Both embody the interface information (in XML), the pathname of the remote executable, and the signature of the remote library.

Figure 25.4 shows a sample LDIF file generated from the IDL file in Figure 25.2. (Note that the XML-represented interface information is base64-encoded because of its length.) Here, ‘__ROOT_DN__’ in the first line will be replaced with the proper root distinguished name by the information-provider filter program described above.

25.4.1.2 Registration into the MDS

In order to register the LDIF files generated in the current directory into the MDS, they are placed into the directory registered above; this step is also automated by the makefile.

dn: Mds-Software-deployment=GridRPC-Ninf-G, Mds-Host-hn=brain-n.a02.aist.go.jp, \
    Mds-Vo-name=local, o=grid
objectclass: GlobusTop
objectclass: GlobusActiveObject
objectclass: GlobusActiveSearch
type: exec
path: /usr/local/globus/var/gridrpc
base: catldif
args: -devclassobj -devobjs \
    -dn Mds-Host-hn=brain-n.a02.aist.go.jp, Mds-Vo-name=local, o=grid \
    -validto-secs 900 -keepto-secs 900
cachetime: 60
timelimit: 20
sizelimit: 10

Figure 25.3 Addition to the grid-info-resource-ldif.conf.


dn: GridRPC-Funcname=sample/mmul, Mds-Software-deployment=GridRPC-Ninf-G, __ROOT_DN__
objectClass: GlobusSoftware
objectClass: MdsSoftware
objectClass: GridRPCEntry
Mds-Software-deployment: GridRPC-Ninf-G
GridRPC-Funcname: sample/mmul
GridRPC-Module: sample
GridRPC-Entry: mmul
GridRPC-Path: /home/ninf/tests/sample/_stub_mmul
GridRPC-Stub:: PGZ1bmN0aW9uICB2ZXJzaW9uPSIyMjEuMDAwMDAwMDAwIiA+PGZ1bmN0aW9uX2
PSJwZXJmIiBlbnRyeT0icGluZ3BvbmciIC8+IDxhcmcgZGF0YV90eXBlPSJpbnQiIGlvZGVf
··· (The rest is omitted)

Figure 25.4 LDIF file for matrix multiply.

25.4.2 Performing GridRPC

Now we are ready to make the actual Ninf-G GridRPC call, which can be broken down into the steps shown in Figure 25.1:

1. The client requests interface information and the executable pathname from the MDS.
2. The MDS sends back the requested information.
3. The client requests that the GRAM gatekeeper invoke the ‘gridified’ remote executable, passing on the necessary information described below.
4. The remote executable connects back to the client using Globus-I/O for subsequent parameter transfers, and so on.

25.4.2.1 Retrieval of interface information and executable pathname

The client retrieves the interface information and executable pathname registered within the MDS, using the library signature as a key. The retrieved information is cached in the client program to reduce the MDS retrieval overhead.
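The client-side cache can be pictured as a small table keyed by the library signature. The sketch below is purely illustrative (the names, fixed-size table, and linear search are assumptions, not the Ninf-G internals, and the cached interface information is reduced to just the executable path for brevity):

```c
#include <string.h>
#include <stdio.h>

#define CACHE_SIZE 16

/* One cached MDS answer: library signature -> executable path. */
struct mds_entry {
    char signature[64];
    char path[128];
};

static struct mds_entry cache[CACHE_SIZE];
static int cache_len = 0;

/* Return the cached path for a signature, or NULL on a miss
 * (a miss would trigger an actual MDS query). */
const char *cache_lookup(const char *signature) {
    for (int i = 0; i < cache_len; i++)
        if (strcmp(cache[i].signature, signature) == 0)
            return cache[i].path;
    return NULL;
}

/* Record an MDS reply so later calls skip the directory query. */
void cache_store(const char *signature, const char *path) {
    if (cache_len < CACHE_SIZE) {
        snprintf(cache[cache_len].signature,
                 sizeof cache[cache_len].signature, "%s", signature);
        snprintf(cache[cache_len].path,
                 sizeof cache[cache_len].path, "%s", path);
        cache_len++;
    }
}
```

The point of the cache is that an MDS (LDAP) lookup is a network round-trip, so amortizing it across repeated calls to the same remote function matters for RPC latency.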

25.4.2.2 Invoking the remote executable

The client invokes the remote executable (done by Ninf-G via the Globus GRAM), specifying the remote executable path obtained from the MDS and a port address that accepts the callback from the remote executable. Here, the accepting port authenticates its peer using Globus-I/O, preventing malicious third-party attacks; this is because only a party that owns the proper Globus proxy certificate derived from the client's user certificate can connect to the port.

25.4.2.3 Remote executable callbacks to the client

The remote executable obtains the client address and port from its argument list and connects back to the client using Globus-I/O. Subsequent communication between the remote executable and the client uses this port.
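The argument-parsing half of this step can be sketched in plain C. The "host:port" format below is an assumption for illustration only; the chapter does not specify the actual argument format Ninf-G passes to the remote executable, and the real connection goes through Globus-I/O rather than anything shown here:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse a hypothetical "host:port" callback argument, such as a
 * remote executable might receive on its command line.
 * Returns 0 on success, -1 on a malformed argument. */
int parse_hostport(const char *arg, char *host, size_t host_len, int *port) {
    const char *colon = strrchr(arg, ':');
    if (colon == NULL || colon == arg)
        return -1;                       /* no host or no separator */
    size_t n = (size_t)(colon - arg);
    if (n >= host_len)
        return -1;                       /* host name would not fit */
    memcpy(host, arg, n);
    host[n] = '\0';
    *port = atoi(colon + 1);
    return (*port > 0) ? 0 : -1;         /* reject non-numeric ports */
}
```

After parsing, the executable would open an authenticated Globus-I/O channel back to that address, which then carries all argument and result traffic for the session.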


25.5 USAGE SCENARIO

Here, we show a sample deployment of a distributed parallel application using Ninf-G. As a sample application, consider computing an approximation of the value of pi using a simple Monte Carlo method.1 The Monte Carlo method generates a huge number of random numbers, converts them into meaningful input parameters, performs some calculation using the parameters, and statistically processes the results of the calculations to arrive at a meaningful result. Because each computation is largely independent, as a result of independent processing of each randomly generated number, Monte Carlo is known to be quite suitable for distributed computing.

For this example, we assume the environment shown in Figure 25.5. There are four Ninf-G server hosts, named Host0, Host1, Host2, and Host3, that run the Globus GRAM. A GIIS server runs on HostIS. All the GRIS servers on the other hosts register themselves with the GIIS. The client program looks up the GIIS server as an MDS server, and performs GridRPC onto the GRAM server on each host.

25.5.1 Setup remote executable

To set up the servers, the Monte Carlo calculation is defined as a standard C function, and an IDL description of the function is also defined. Figures 25.6 and 25.7 show the function and its IDL description, respectively. The function receives a seed for random-number generation and a number of Monte Carlo trials, and returns the count of how many points have fallen within the circle.

25.5.2 Client program

The client program performs GridRPCs onto the servers in parallel using the asynchronous invocation API. The core fragment of the client program is shown in Figure 25.8, demonstrating the relative ease of parallel application construction in Ninf-G.

Figure 25.5 Usage scenario environment. (The Ninf-G client looks up the GIIS on HostIS; the GRIS servers on Host0 through Host3 register with the GIIS; the client invokes the GRAM on each of Host0 through Host3.)

1 This is one of the simplest Monte Carlo applications. Define a circle and a square that encompasses the circle. A randomly generated point in the square falls inside the circle with probability π/4. In reality, problems are much more complex.

Trang 10

long pi_trial(int seed, long times) {
    long i, counter = 0;
    srandom(seed);
    for (i = 0; i < times; i++) {
        double x = (double)random() / RAND_MAX;
        double y = (double)random() / RAND_MAX;
        if (x * x + y * y < 1.0) counter++;
    }
    return counter;
}

Figure 25.6 Monte Carlo PI trials.

Module pi;
Define pi_trial(IN int seed, IN long times,
                OUT long *count)
"monte carlo pi computation"
Required "pi_trial.o"
{
    long counter;
    counter = pi_trial(seed, times);
    *count = counter;
}

Figure 25.7 IDL for the PI trial function.

/* Initialize handles */
for (i = 0; i < NUM_HOSTS; i++) {
    if (grpc_function_handle_init(&handles[i],
                                  hosts[i], port, "pi/pi_trial")
        == GRPC_ERROR) {
        grpc_perror("handle_init");
        exit(2);
    }
}
/* Parallel non-blocking remote function invocation */
for (i = 0; i < NUM_HOSTS; i++) {
    if ((ids[i] = grpc_call_async(&handles[i], i, times, &count[i]))
        == GRPC_ERROR) {
        grpc_perror("pi_trial");
        exit(2);
    }
}
/* Synchronize on the result return */
if (grpc_wait_all() == GRPC_ERROR) {
    grpc_perror("wait_all");
    exit(2);
}

Figure 25.8 Client PI program.
