Integrated Research in GRID Computing - P14



6 Conclusions, related and future work

The paper presented an initial solution for the integration of P-GRADE Portal and GRID superscalar. The solution is based on the generation of a GRID superscalar application from a P-GRADE workflow. The GS deployment center is also used to automatically deploy the application on the local and server hosts.

Concerning future work, the prototype must be finalized first. Then conditional and loop constructs, as well as support for parameter study applications at workflow level, can be added in order to obtain high-level control mechanisms similar to those of UNICORE [13].

This will bring us closer to a new toolset that can assist system administrators, programmers, and end-users at each stage of software development, deployment and usage of complex workflow-based applications on the Grid.

The integrated GRID superscalar - P-GRADE Portal system shows many similarities with the GEMLCA [12] architecture. The aim of GEMLCA is to make pre-deployed legacy applications available as unified Grid services. Using the GS deployment center, components of P-GRADE Portal workflows can be published in the Grid for execution as well. However, while GEMLCA expects compiled and already tested executables, GRID superscalar is capable of publishing components from source code.

Acknowledgments

This work has been partially supported by NoE CoreGRID (FP6-004265) and by the Ministry of Science and Technology of Spain under contract TIN2004-07739-C02-01.

References

[1] G. Sipos, P. Kacsuk. Classification and Implementations of Workflow-Oriented Grid Portals. Proc. of High Performance Computing and Communications (HPCC 2005), Lecture Notes in Computer Science 3726, pp. 684-693, 2005.

[2] R. Lovas, et al. Application of P-GRADE Development Environment in Meteorology. Proc. of DAPSYS'2002, Linz, pp. 30-37, 2002.

[3] T. Tannenbaum, D. Wright, K. Miller, M. Livny. Condor - A Distributed Job Scheduler. Beowulf Cluster Computing with Linux, The MIT Press, MA, USA, 2002.

[4] I. Foster, C. Kesselman. Globus: A Toolkit-Based Grid Architecture. In I. Foster, C. Kesselman (eds.), The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 1999, pp. 259-278.

[5] GRID superscalar Home Page, http://www.bsc.es/grid/

[6] R. M. Badia, J. Labarta, R. Sirvent, J. M. Perez, J. M. Cela, R. Grima. Programming Grid Applications with GRID Superscalar. Journal of Grid Computing, 1(2): 151-170, 2003.

[7] R. Raman, M. Livny, M. Solomon. Matchmaking: Distributed Resource Management for High Throughput Computing. Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing, July 28-31, 1998, Chicago, IL.

[8] I. Foster, C. Kesselman. Globus: A Metacomputing Infrastructure Toolkit. Int. Journal of Supercomputer Applications, 11(2): 115-12

[9] Y. Tanaka, H. Nakada, S. Sekiguchi, T. Suzumura, S. Matsuoka. Ninf-G: A Reference Implementation of RPC-based Programming Middleware for Grid Computing. Journal of Grid Computing, 1(1): 41-51, 2003.

[10] uDraw(Graph), http://www.informatik.uni-bremen.de/davinci/

[11] PARAVER, http://www.cepba.upc.edu/paraver/

[12] T. Delaittre, T. Kiss, A. Goyeneche, G. Terstyanszky, S. Winter, P. Kacsuk. GEMLCA: Running Legacy Code Applications as Grid Services. Journal of Grid Computing, Vol. 3, No. 1-2, pp. 75-90, 2005.

[13] Dietmar W. Erwin. UNICORE - A Grid Computing Environment. Concurrency and Computation: Practice and Experience, Vol. 14, Grid Computing Environments Special Issue 13-14, 2002.

[14] Jason Novotny, Michael Russell, Oliver Wehrens. GridSphere: a portal framework for building collaborations. Concurrency and Computation: Practice and Experience, Volume 16, Issue 5, pp. 503-513, 2004.

[15] Baude P., Baduel L., Caromel D., Contes A., Huet P., Morel M., Quilici R. Programming, Composing, Deploying for the Grid. In "GRID COMPUTING: Software Environments and Tools", Jose C. Cunha and Omer F. Rana (Eds), Springer Verlag, January 2006.

[16] Rob V. van Nieuwpoort, Jason Maassen, Gosia Wrzesinska, Rutger Hofman, Ceriel Jacobs, Thilo Kielmann, Henri E. Bal. Ibis: a Flexible and Efficient Java-based Grid Programming Environment. Concurrency and Computation: Practice and Experience, Vol. 17, No. 7-8, pp. 1079-1107, 2005.

[17] N. Furmento, A. Mayer, S. McGough, S. Newhouse, T. Field, J. Darlington. ICENI: Optimisation of Component Applications within a Grid Environment. Parallel Computing, 28(12), 2002.


ENVIRONMENT: A CASE STUDY OF USING

MEDIATOR COMPONENTS

Thilo Kielmann and Gosia Wrzesinska

Dept of Computer Science

Vrije Universiteit

Amsterdam, The Netherlands

kielmann@cs.vu.nl

gosia@cs.vu.nl

Natalia Currle-Linde and Michael Resch

High Performance Computing Center (HLRS)

University of Stuttgart

Germany

linde@hlrs.de

resch@hlrs.de

Abstract The Science Experimental Grid Laboratory (SEGL) problem solving environment allows users to describe and execute complex parameter study workflows in Grid environments. Its current implementation provides much high-level functionality for executing complex parameter-study workflows. Alternatively, using a toolkit of mediator components that integrate system-component capabilities into application code would allow building a system like SEGL from existing, more generally applicable components, simplifying its implementation and maintenance. In this paper, we present the given design of the SEGL PSE, analyze the provided functionality, and identify a set of mediator components that can generalize the functionality required by this challenging application category.

Keywords: Grid component model, mediator components, SEGL


1 Introduction

The SEGL problem solving environment [9] allows end-user programming of complex, computation-intensive simulation and modeling experiments for science and engineering. Experiments are complex workflows consisting of domain-specific or general-purpose simulation codes, referred to as tasks. For each experiment, the tasks are invoked with input parameters that are varied over given parameter spaces, together describing individual parameter studies.

SEGL allows users to program so-called applications using a graphical user interface. An application consists of several tasks, the control flow of their invocation, and the dataflow of input parameters and results. For the parameters, the user can describe iterations for parameter sweeps; conditional dependencies on result values can also be part of the control flow. Using such a user application program, SEGL can execute the tasks, provide them with their respective input parameters, and collect the individual results in an experiment-specific database.
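As a rough illustration of such a parameter sweep, the following Python sketch enumerates every point of a small parameter space and collects one result per task invocation. The task function `simulate`, the parameter names, and the in-memory result list are hypothetical stand-ins chosen for this example; they are not part of SEGL, which stores results in an experiment-specific database.

```python
from itertools import product


def simulate(viscosity, mesh_size):
    """Hypothetical task: stands in for a real simulation code."""
    return viscosity * mesh_size


def run_parameter_study(task, parameter_space):
    """Invoke `task` once for every point of the parameter space."""
    names = sorted(parameter_space)
    results = []
    # The Cartesian product yields one tuple of values per sweep point.
    for values in product(*(parameter_space[name] for name in names)):
        point = dict(zip(names, values))
        results.append((point, task(**point)))
    return results


# Assumed example space: 2 x 3 = 6 task invocations.
space = {"viscosity": [0.5, 1.0], "mesh_size": [10, 20, 40]}
study = run_parameter_study(simulate, space)
```

Each entry of `study` pairs one parameter combination with the result of the corresponding task invocation, which is the kind of record an experiment database would hold.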

SEGL's current implementation allows executing complex parameter study workflows. It involves a GUI-based frontend, an execution engine that schedules and monitors the progress of the experiment, and a database server using an experiment-specific schema. By following this design, much high-level functionality has been implemented on top of existing Grid middleware, however in a way that is specific to SEGL.

Alternatively, using a toolkit of mediator components that integrate system-component capabilities into application code would allow building a system like SEGL from existing, more generally applicable components, simplifying its implementation and maintenance. In this paper, we propose a redesign of SEGL based on such mediator components. Important insights are (a) the necessity to integrate components with (legacy) Web-service based middleware, and (b) the requirement of a persistent application-execution service.

In the following, we revisit our view of component-based Grid application environments (Section 2), present SEGL's current architecture and functionality (Section 3), and identify a set of mediator components that can generalize the functionality required by this challenging application category (Section 4). Ongoing work related to the development of such mediator components is presented in Section 5.

2 Component-based Grid application environments

A technological vision is to build Grid software such that applications and middleware are united into a single system of components [7]. This can be accomplished by designing a toolkit of components that mediate between applications and system components. The goal is to integrate system-component capabilities into application code, achieving both steering of the application and performance adaptation by the application, so as to obtain the most efficient execution on the available resources offered by the Grid.

By introducing such a set of components, resources and services in the Grid get integrated into one overall system with homogeneous component interfaces. The advantage of such a component system is that it abstracts from the many software architectures and technologies used underneath. Both the strength and the challenge of such a component-based approach is that it provides a homogeneous set of well-defined (component-level) interfaces to and between all software systems in a Grid platform, ranging from portals and applications, via mediator components, to the underlying middleware and system software.

As outlined in [16], both components and Web services parallel traditional objects by encapsulating state from their clients behind well-defined interfaces. They differ, however, in their applicability within given environments. Objects allow client/server communication within a single application process. With components, client and server can be distributed across different processes; however, they have to share the same execution environment, which is the component model and one or more interoperable implementations of this model. Web services, finally, allow the distribution of client and server across different processes and execution environments, allowing the loosely-coupled integration of heterogeneous clients, resources, and services.

Components are to be preferred over Web services as they provide higher execution performance, however at the price of reduced interoperability. Besides better performance, components also allow reflective behavior and recomposition of application software at run time, opening the path to fault-tolerant and behavior-adaptive Grid applications [8]. The limitation to a single execution environment, however, contradicts the idea of Grid computing, where interoperability plays a central role for the integration of independently created and maintained resources and services. In consequence, we have to treat existing Web-service based middleware as legacy systems that have to be integrated into a component-based Grid software platform.

A possible rendering of the envisioned mediator components, along with their embedding into a generic component platform, is shown in Figure 1. This diagram is based on our previous work described in [6]. Boxes in grey are examples of external services that are integrated into the overall platform.

The upper part of Figure 1 outlines a component-based Grid application, where we distinguish between three layers. The lowest layer, the runtime environment, provides the interface of the application with external (Web-service based) resources and services. The middle layer in the application stack consists of an extensible set of mediator components that provide higher-level functionality to the application. The topmost layer consists of the application components themselves, possibly enriched by a so-called Integrated Toolkit that provides Grid-unaware programming abstractions to the application. In the following, we present the envisioned components individually.

Figure 1. Envisioned generic component platform. (Elements shown: Grid-unaware application, integrated toolkit, Grid-aware application, steering component, tuning component, application manager, application-level information cache, and runtime environment with steering interface and security context; external entities: PSE, user portal, resource services, information services, monitoring services, application repository.)

Runtime Environment The runtime environment implements a set of component interfaces to various kinds of Grid services and resources, like job schedulers, file systems, etc. It implements a delegation mechanism that forwards invocations to service providers. In doing so, the runtime environment provides an interface layer between application components and both system components and middleware services. Examples of such runtime environments are the GAT [2] and GGF's SAGA [12]. By providing dynamic bindings to the various service providers, the runtime environment bridges the gap between components and services, and allows system services with either type of interface to be used side by side at the same time.

Security Context As the runtime environment implements the application's interface to services and resources outside its own scope, care has to be taken of authentication and authorization mechanisms each time external entities get involved. For this purpose, the security context forms an integral part of the runtime environment.

Steering Interface A dedicated part of the runtime environment is the steering interface. It is supposed to make applications accessible to system entities and user interfaces (like portals or PSEs) like any other component in the system. This interface, at the border between component-based applications and external services and components, is supposed to relay to (and protect) internal component interfaces. Access control to the steering interface is subject to the security context.

Application-level meta-data repository This repository is supposed to store meta-data about a specific application, e.g., timing or resource requirements from previous, related runs. The collected information will be used by other components to support resource management (location and selection) and to optimize further runs of the application automatically.

Application-level information cache This component is supposed to provide a unified interface to deliver all kinds of meta-data (e.g., from a Grid information service (GIS), a monitoring system, or from application-level meta-data) to the application. Its purpose is twofold. First, it is supposed to provide a unifying component interface to all data (independent of its actual storage), including mechanisms for service and information discovery. Second, this application-level cache is supposed to deliver the information quickly, cutting the access times of current implementations like Globus GIS (up to multiple seconds) down to the order of a single method invocation.

Steering Components Controlling and steering of applications by the user, e.g., via application managers, user portals, and PSEs, requires a component-level interface that gives external entities access to the application. From outside the application, the steering components will be accessible via the steering interface. For example, we envision steering components with the following kinds of interfaces:

steering controller - for modifying application parameters

persistence controller - for externally triggering checkpoints

distribution strategy controller - for changing the data distribution

component explorer - for exploring (and modifying) the current component composition

Tuning Components Tuning components can be used to optimize the application's runtime behavior, based on observed behavior of the application itself and on external status information, as provided by the application-level information cache component. Tuning components can be either passive or active, in the latter case carrying their own threads of activity.

Application Manager An application manager establishes a pro-active user interface, in charge of tracking an application from submission to successful completion. It will be in charge of guaranteeing such successful completion in spite of temporary error conditions or performance limitations. A persistent service will become an integral part of this functionality.
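To make two of the component roles above more concrete, the following Python sketch shows a runtime environment that delegates invocations to dynamically bound providers, fronted by an application-level information cache. All class, method, and provider names are hypothetical illustrations; they are not the GAT or SAGA interfaces, and the fixed time-to-live policy is an assumption of this sketch rather than something the design prescribes.

```python
import time


class RuntimeEnvironment:
    """Delegates component-level invocations to bound service providers."""

    def __init__(self):
        self._providers = {}

    def bind(self, capability, provider):
        # Dynamic binding: the provider behind a capability can be replaced.
        self._providers[capability] = provider

    def invoke(self, capability, operation, *args):
        provider = self._providers.get(capability)
        if provider is None:
            raise LookupError(f"no provider bound for {capability!r}")
        # Forward the call to whichever provider is currently bound.
        return getattr(provider, operation)(*args)


class InformationCache:
    """Answers repeated queries locally instead of asking the slow service."""

    def __init__(self, runtime, ttl_seconds=30.0, clock=time.monotonic):
        self._runtime = runtime
        self._ttl = ttl_seconds      # assumed freshness policy
        self._clock = clock
        self._entries = {}

    def lookup(self, key):
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        value = self._runtime.invoke("information", "query", key)
        self._entries[key] = (now, value)
        return value


class SlowInformationService:
    """Hypothetical stand-in for a remote Grid information service."""

    def __init__(self):
        self.calls = 0

    def query(self, key):
        self.calls += 1
        return f"{key}=available"


service = SlowInformationService()
runtime = RuntimeEnvironment()
runtime.bind("information", service)
cache = InformationCache(runtime)
answers = [cache.lookup("cluster-a") for _ in range(3)]
```

In this example the three lookups reach the underlying service only once; the remaining two are answered from the cache at the cost of a local method invocation.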

3 The SEGL system architecture

Figure 2. Current SEGL architecture. (Elements shown: User Workstation with Experiment designer and Exp Monitor VIS; Exp Engine, Resource Monitor, Exp Monitor Supervisor, and Grid Adapter; File Server; Sub Servers on Target Machines A to K; Exp DB Server.)

Figure 2 shows the current system architecture of SEGL. It consists of three main components: the User Workstation (Client), the Experiment Application Server (ExpApplicationServer), and the Experiment database server (ExpDBServer). Client and ExpApplicationServer communicate with each other using a traditional client/server architecture, based on J2EE middleware. The interaction between ExpApplicationServer and the Grid resources is done through a Grid Adaptor, interfacing to Globus [11] and UNICORE [15] middleware.

The client on the user's workstation is composed of the graphical experiment designer tool (ExpDesigner) and the experiment process monitoring and visualization tool (ExpMonitorVIS). The ExpDesigner is used to design, verify and generate the experiment's program, organize the data repository and prepare the initial data, using a simple graphical language.

Each experiment is described at three levels: control flow, data flow and the data repository. The control flow level describes which code blocks will be executed in which order, possibly augmented by parameter iterations and conditional branches. Each block can be represented as a simple parameter study. An example is shown in Fig. 3. The data flow level describes the flow of parameter data between the individual code blocks. On the data repository level, a common description of the metadata repository is created for the given experiment. The repository is an aggregation of data from the blocks at the data flow level.
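The three description levels can be sketched as a small data structure. The Python example below is only an illustration under assumed names: the `Block` class, its fields, and the block names are hypothetical and do not reflect SEGL's graphical language or its generated code. Control flow is modeled as predecessor lists, data flow as a mapping from parameter names to producing blocks, and the repository description as the aggregation of all parameters.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Block:
    name: str
    # Control flow level: blocks that must finish before this one starts.
    after: list = field(default_factory=list)
    # Data flow level: parameter name -> name of the block producing it.
    inputs: dict = field(default_factory=dict)


def execution_order(blocks):
    """Derive one valid execution sequence from the control flow."""
    graph = {block.name: set(block.after) for block in blocks}
    return list(TopologicalSorter(graph).static_order())


def repository_schema(blocks):
    """Repository level: aggregate the parameters of all blocks."""
    return sorted({param for block in blocks for param in block.inputs})


experiment = [
    Block("mesh"),
    Block("solve", after=["mesh"], inputs={"grid": "mesh"}),
    Block("report", after=["solve"], inputs={"pressure": "solve"}),
]
order = execution_order(experiment)
schema = repository_schema(experiment)
```

The sketch leaves out parameter iterations and conditional branches; it only shows how an execution order and a repository description can both be derived from one experiment description.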

Figure 3. Example experiment control flow. (Elements shown: numbered Solver blocks, a Branch block, and Wait blocks.)

After completing the graphical design of the experiment program, it is "compiled" to the container application. This creates the experiment-specific parts for the ExpApplicationServer as well as the experiment's database schema. The container application of the experiment is transferred to the ExpApplicationServer and the schema descriptions are transferred to the server database. Here, the metadata repository is created.


The ExpApplicationServer consists of the experiment engine (ExpEngine), the container application (Task), the controller component (ExpMonitorSupervisor) and the ResourceMonitor. The ResourceMonitor holds information about the available resources in the Grid environment. The MonitorSupervisor controls the work of the runtime system and informs the Client about the current status of the jobs and the individual processes. The ExpEngine executes the application Task, so it is responsible for the actual data transfers and program executions on and between server machines in the Grid.

The final component of SEGL is the database server (ExpDBServer). The automatic creation of the experiment is done according to the structure designed by the user. All data produced during the experiment, such as input data for the parameter study, parameterization rules, etc., are kept in the ExpDBServer.

As SEGL parameter studies may run for significant amounts of time, application progress monitoring becomes necessary. The MonitorSupervisor, being part of the experiment application server, monitors the work of the runtime system and notifies the client about the current status of the jobs and the individual processes. The ExpEngine is the actual controller of the SEGL runtime system. It consists of three subsystems: the TaskManager, the JobManager and the DataManager. The TaskManager is the central dispatcher of the ExpEngine. It coordinates the work of the DataManager and the JobManager as follows:

1. It organizes and controls the execution sequence of the program blocks. It starts the execution of the program blocks according to the task flow and the conditions within the experiment program.

2. It activates a particular block according to the task flow, selects the necessary computer resources for the execution of the program and deactivates the block when this section of the program has been executed.

3. It informs the MonitorSupervisor about the current status of the program.
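The three duties listed above can be sketched as a simple dispatch loop. This Python example is a minimal illustration under stated assumptions, not SEGL's TaskManager: blocks are given as hypothetical `(name, condition, action)` tuples already in task-flow order, resources are a plain list of machine names, and `notify` stands in for the channel to the MonitorSupervisor.

```python
def run_experiment(blocks, resources, notify):
    """Dispatch blocks in task-flow order and report every status change."""
    results = {}
    for name, condition, action in blocks:
        # Duty 1: follow the task flow and the conditions of the program.
        if not condition(results):
            notify(name, "skipped")
            continue
        # Duty 2: activate the block on a selected resource ...
        resource = resources.pop(0)
        notify(name, f"active on {resource}")
        results[name] = action(results)
        # ... and deactivate it, releasing the resource afterwards.
        resources.append(resource)
        # Duty 3: inform the supervisor about the current status.
        notify(name, "done")
    return results


events = []
blocks = [
    ("solve", lambda r: True, lambda r: 3),
    ("refine", lambda r: r["solve"] > 5, lambda r: r["solve"] * 2),
    ("report", lambda r: True, lambda r: sorted(r)),
]
results = run_experiment(
    blocks, ["machine_a"], lambda name, status: events.append((name, status))
)
```

Here the conditional block `refine` is skipped because the result of `solve` does not satisfy its condition, and the event list records the status updates a monitoring component would receive.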

The DataManager organizes data exchange between the ExpApplicationServer and the FileServer, and between the FileServer and the ExpDBServer. Furthermore, it provides the task processes with their input parameter data. For progress monitoring, the MonitorSupervisor tracks the status of the ExpEngine and its subcomponents. It forwards status update events to the ExpMonitorVIS, closing the loop to the user. SEGL's progress monitoring is currently split in two parts:

1. The experiment monitoring and visualization on the client side (ExpMonitorVIS). It is designed for visualizing the execution of the experiment and its computation processes. The ExpMonitorVIS allows the user to start and stop the experiment, to change the input data, and subsequently to restart the experiment or some part of it.
