Design and Evaluation of a Resource Selection Framework for Grid Applications




A thesis submitted in partial satisfaction of the requirements for the degree Master of Science in Computer Science

By

Chuang Liu

Committee in Charge

Professor Ian Foster

Professor Michael J O'Donnell

Professor Jennifer M Schopf

University of Chicago


April 12, 2002


While distributed, heterogeneous collections of computers (“Grids”) can in principle be used as a computing platform, in practice the problems of first discovering and then configuring resources to meet application requirements are difficult. We present a general-purpose resource selection framework that addresses these problems by defining a resource selection service for locating Grid resources that match application requirements. At the heart of this service is a simple but powerful declarative language based on a technique called set matching, which extends the Condor matchmaking framework to support both single-resource and multiple-resource selection. This framework also provides an open interface for loading application-specific mapping modules to personalize the resource selector. We present results obtained when this framework is applied in the context of a computational astrophysics application, Cactus. These results demonstrate the effectiveness of our technique.


This thesis would not have been possible without the help of the following people.

Ian Foster, my advisor, who gave me the chance to be involved in this interesting project and guided me in finishing this work with his insightful feedback.

Lingyun Yang, my wife and my teammate, who helped me out when things were frustrating. As my teammate, she helped me finish most of the experiments and checked every sentence in this thesis.

Dave Angulo, my teammate, who gave us a lot of help with Cactus and Globus.

Jennifer M Schopf, Alain Roy, and Michael J O'Donnell, who reviewed my thesis and provided valuable insight in areas that needed improvement and correction.

All the people in the GrADS group who provided the test bed for our experiments.

The Condor Group and the Cactus Group also deserve big thanks for providing me with wonderful software packages and prompt responses to my questions.

This work was supported by the Grid Application Development Software (GrADS) project of the NSF Next Generation Software program, under Grant No 9975020.


1 Introduction
2 Set-Extended ClassAds and Set Matching
2.1 An Overview of Condor ClassAds and Matchmaking
2.2 Set-Extended ClassAds Syntax and Set Request
2.2.1 Set-Extended ClassAds Syntax
2.3 Set-Matching Algorithm
3 Resource Selection Framework
3.1 System Architecture
3.2 Resource Request
3.3 Resource Selection Result
4 Cactus Application
4.1 Performance Model
4.2 Mapping Algorithm
5 Experimental Results
5.1 Execution Time Prediction Test
5.1.1 Computation Time Prediction Test
5.1.2 Computation Time and Communication Time Prediction Test
5.2 Mapping Strategy Test
5.3 Resource Selection Algorithm Test
6 Conclusion and Future Work

1 Introduction

The development of high-speed networks (10 Gb/s Ethernet, optical networking) makes it feasible, in principle, to execute even communication-intensive applications on distributed computation and storage resources. However, the discovery and configuration of suitable resources for applications in heterogeneous environments remain challenging problems. Like others [1-6], we postulate the existence of a Resource Selector Service (RSS) responsible for selecting Grid resources appropriate for a particular problem run based on that run’s characteristics, organizing those resources into a virtual machine with an appropriate topology, and potentially also assisting with the mapping of the application workload to virtual machine resources. These three steps (selection, configuration, and mapping) can be interrelated, as it is only after a mapping has been determined that the selector can determine whether one selection is better than another.

Many projects have addressed the resource selection problem. Systems such as NQE [7], PBS [8], LSF [9], I-SOFT [10], and LoadLeveler [11] process user-submitted jobs by finding resources that have been identified either explicitly through a job control language or implicitly by submitting the job to a particular queue that is associated with a set of resources. Such manually configured queues hinder dynamic resource discovery. Globus [12] and Legion [13], on the other hand, present resource management architectures that support resource discovery, dynamic resource status monitoring, resource allocation, and job control. These architectures make it easy to create a high-level scheduler. Legion also provides a simple, generic default scheduler, but Dail et al. [14] show that this default scheduler can easily be outperformed by a scheduler with special knowledge of the application.

The AppLeS framework [2] guides the implementation of application-specific scheduler logic, which determines and actuates a schedule customized for the individual application and the target computational Grid at execution time. Dongarra et al. developed a more modular resource selector for a ScaLAPACK application [1]. Since they embed the application-specific detail in the resource selection module, however, their tools cannot be used easily for other applications. Systems such as MARS [15], DOME [16], and SEA [17] target particular classes of application (MARS and SEA target applications that can be represented by a dataflow-style program graph, and DOME targets SIMD applications). Furthermore, neither the user nor the owner of resources can control the resource selection process in these systems.

Condor [3] provides a general resource selection mechanism based on the ClassAds language [18], which allows users to describe arbitrary resource requests and resource owners to describe their resources. A matchmaker [19] is used to match user requests with appropriate resources. When multiple resources satisfy a request, a ranking mechanism sorts available resources based on user-supplied criteria and selects the best match. Because the ClassAds language and the matchmaker were designed for selecting a single machine on which to run a job, however, they have limited applicability in situations where a job requires multiple resources.

To address these problems, we define a set-extended ClassAds language that allows users to specify aggregate resource properties (e.g., total memory, minimum bandwidth). We also present an extended set-matching matchmaking algorithm that supports one-to-many matching of set-extended ClassAds with resources. Based on this technique, we present a general-purpose resource selection framework that can be used by different kinds of applications. Within this framework, both application resource requirements and application performance models are specified declaratively, in the ClassAds language, while mapping strategies can be determined by user-supplied code. (An open interface is provided that allows users to load an application-specific mapping module to customize the resource selector.) The resource selector locates sets of resources that meet user requirements, evaluates them based on the specified performance model and mapping strategies, and returns a suitable collection of resources, if any are available. We also present results obtained when this technique was applied in the context of a nontrivial application, Cactus [20, 21].


This paper is organized as follows. In Section 2, we present the set-extended ClassAds language and the set-matching mechanism. In Section 3, we describe the resource selector framework. In Section 4, we describe a performance model and mapping strategy for the Cactus application used in our case study. Experimental results are presented in Section 5. Finally, in Section 6, we summarize our work and briefly discuss future activities.

2 Set-Extended ClassAds and Set Matching

We describe here our set-extended ClassAds language and set-matching algorithm.

2.1 An Overview of Condor ClassAds and Matchmaking

The ClassAd/Matchmaking formalism comprises three principal components [19]:

1. The ClassAd specification, which defines a language for expressing properties of an entity and any constraints placed on a matching entity, and a semantics for evaluating these attributes;

2. The advertising protocol, which defines basic conventions regarding what a matchmaker expects to find in a ClassAd if the ad is to be included in the matchmaking process, and how the matchmaker expects to receive the ad from the advertiser;

3. The matchmaking algorithm, which defines how the contents of ads relate to the outcome of the matchmaking process.

The ClassAd language [22] is a simple expression-based language. The central construct of the language is the ClassAd (Classified Advertisement), which is a record-like structure composed of a finite number of distinctly named expressions. ClassAds are used as attribute lists by entities to describe their characteristics, constraints, and preferences. Attribute expressions can be simple constants or functions of other attributes.

The ClassAd language differentiates between expressions and values: expressions are evaluable language constructs obtained by parsing valid expression syntax, whereas values are the results of evaluating expressions. The ClassAd language employs dynamic typing (or latent typing), so only values (and not expressions) have types. The language has a rich set of types and values, which includes many traditional values (numeric, string, boolean), non-traditional values (timestamps, time intervals), and some esoteric values, such as undefined and error. Undefined is generated when an attribute reference cannot be resolved, and error is generated when there are type errors. In a sense, all ClassAd operators are total functions, since they have a defined semantics for every possible operand value, facilitating robust evaluation semantics in an uncertain, semi-structured environment. The operators are essentially those of the C language, with certain operators excluded (e.g., pointer and dereference operators) and others added (e.g., non-strict comparison). Thus, a rich set of arithmetic, logic, bit-wise, and comparison operators is defined. The set of supported operators and their relative precedence are summarized in [22]. Figure 1 shows a ClassAd that describes a resource request and two ClassAds that describe two resources.

Request=[ requirements = other.type=="machine"
          && other.cpuspeed > 500M && other.memory > 100M;
          rank = other.memory + other.cpuspeed
]

ResourceA=[ name="foo"; type="machine";


In the matchmaking framework, customers and providers describe themselves with ClassAds. The advertising protocol assigns particular meaning to some attributes of these ClassAds: for example, “requirements” in a ClassAd states what it requires of a matching ClassAd, and “rank” indicates the quality of the match. The matchmaking mechanism is built on the evaluation mechanism of ClassAds. Two ClassAds match if the expressions named “requirements” in both ClassAds evaluate to true. If no “requirements” expression is given explicitly in a ClassAd, the matchmaker assumes a “requirements” expression that evaluates to true. An expression named “rank” is evaluated to a numerical value representing the quality of the match. To perform the match, the matchmaker evaluates expressions in an environment that allows each ClassAd to access attributes of the other. An attribute reference of the form “self.attribute-name” or “attribute-name” refers to another attribute in the same ClassAd that contains the reference, while “other.attribute-name” refers to an attribute of the other ClassAd. For example, in Figure 1, the sub-expression “other.memory > 100M” in the Request ClassAd represents the user’s requirement for a machine with more than 100M of memory. It evaluates to true if “other” refers to the ad ResourceA, because the sub-expression “other.memory” is replaced by the value of the attribute “memory” in ResourceA, which is 512M.

When matchmaking is used for resource selection, the matchmaker evaluates a ClassAd request against every available resource ClassAd and then selects a resource that both matches the request and returns the highest rank. For example, in Figure 1, ClassAd A represents a request, while ClassAds B and C represent resources. ClassAd A will match both ClassAd B and ClassAd C because both have a cpuspeed faster than 500M and a memory size larger than 100M. But machine “foo”, described by ClassAd B, is better than “bar” because “foo” has more memory and a faster cpuspeed, and thus a higher rank.
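The bilateral evaluation described above can be sketched in a few lines of Python. This is only an illustration, not the Condor matchmaker: ClassAds are modeled as plain dictionaries, the "requirements" and "rank" expressions as Python callables, and the helper names (requirements_ok, match) are hypothetical. The attribute values of the two resources are made up for the example, apart from the 512M memory of "foo" mentioned in the text.

```python
# Illustrative sketch only: ClassAds are plain dicts, and "requirements" /
# "rank" are callables taking (self_ad, other_ad). Not the Condor implementation.

M = 10**6  # assumed meaning of the "M" suffix used in Figure 1


def requirements_ok(ad, other):
    """An ad with no 'requirements' expression is treated as always satisfied."""
    req = ad.get("requirements")
    return True if req is None else bool(req(ad, other))


def match(request, resources):
    """Return the matching resource with the highest rank, or None if none match."""
    best, best_rank = None, float("-inf")
    for res in resources:
        # Both ads' requirements must evaluate to true for a match.
        if not (requirements_ok(request, res) and requirements_ok(res, request)):
            continue
        rank_fn = request.get("rank")
        rank = rank_fn(request, res) if rank_fn else 0
        if rank > best_rank:
            best, best_rank = res, rank
    return best


request = {
    "requirements": lambda self, other: (
        other.get("type") == "machine"
        and other.get("cpuspeed", 0) > 500 * M
        and other.get("memory", 0) > 100 * M
    ),
    "rank": lambda self, other: other["memory"] + other["cpuspeed"],
}

# Hypothetical resource ads; only foo's 512M of memory comes from the text.
foo = {"name": "foo", "type": "machine", "cpuspeed": 800 * M, "memory": 512 * M}
bar = {"name": "bar", "type": "machine", "cpuspeed": 600 * M, "memory": 256 * M}

print(match(request, [bar, foo])["name"])
```

Both machines satisfy the request, and "foo" is selected because its rank (memory plus cpuspeed) is higher. The real ClassAd language additionally propagates undefined and error values, which this sketch replaces with dictionary defaults.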

So-called gang matching [22] extends the basic matchmaking algorithm to allow two or more ClassAds to be specified in one request; a successful match must then return a match for each of the supplied ClassAds. However, gang matching does not address our need to locate a set of resources that satisfies some collective criteria.

2.2 Set-Extended ClassAds Syntax and Set Request

In set matching, a successful match is defined as occurring between a single set request and a resource set. The essential idea is as follows. The set request is expressed in set-extended ClassAds syntax, which is identical to that of a normal ClassAd except that it can indicate both set expressions, which place constraints on the collective properties of an entire resource ClassAd set (e.g., total memory size), and individual expressions, which must apply individually to each resource in the set (e.g., individual per-resource memory size). The set-matching algorithm attempts to construct a resource set that satisfies both individual and set constraints. This set of resources is returned if the set match is successful.

2.2.1 Set-Extended ClassAds Syntax

The set-extended ClassAd language, as currently defined, extends ClassAds as follows:

• A Type specifier is supplied for identifying set-extended ClassAds: the expression Type="Set" identifies a set-extended ClassAd.

• Three aggregation functions, Max, Min, and Sum, are provided to specify aggregate properties of resource sets.

• A Boolean function suffix(V, L), where V is a string and L is a string list [22], is defined that returns true if a member of list L is a suffix of string V.

• A function SetSize is defined that can be used to refer to the number of elements within the current resource set.


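The three aggregation functions are as follows.

• Max(expression) returns the maximum value of the expression evaluated on each of the ClassAds in a set.

• Min(expression) returns the minimum value of the expression evaluated on each of the ClassAds in a set.

• Sum(expression) returns the sum of the values of the expression evaluated on each of the ClassAds in a set. For example, Sum(other.memory)>5G means the total memory of the set of resources selected should be greater than 5G.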
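As a rough illustration of how such aggregates behave, the following Python sketch evaluates an expression over every ClassAd in a candidate set. The function names set_max, set_min, set_sum, and set_size are hypothetical stand-ins for Max, Min, Sum, and SetSize; ClassAds are modeled as dictionaries, and the resource values are invented for the example.

```python
# Sketch under stated assumptions: a resource set is a list of dicts, and an
# "expression" is a callable evaluated once per ClassAd in the set.

G = 10**9  # assumed meaning of the "G" suffix


def set_max(expr, resource_set):
    """Largest value of expr over the ClassAds in the set (stand-in for Max)."""
    return max(expr(ad) for ad in resource_set)


def set_min(expr, resource_set):
    """Smallest value of expr over the ClassAds in the set (stand-in for Min)."""
    return min(expr(ad) for ad in resource_set)


def set_sum(expr, resource_set):
    """Sum of expr over the ClassAds in the set (stand-in for Sum)."""
    return sum(expr(ad) for ad in resource_set)


def set_size(resource_set):
    """Number of ClassAds in the current resource set (stand-in for SetSize)."""
    return len(resource_set)


# Hypothetical resource set.
resource_set = [
    {"name": "n1", "memory": 2 * G},
    {"name": "n2", "memory": 4 * G},
]

# Counterpart of the set expression Sum(other.memory) > 5G.
total_memory_ok = set_sum(lambda other: other["memory"], resource_set) > 5 * G
print(total_memory_ok)
```

Here the two machines together provide 6G of memory, so the set constraint holds even though neither machine satisfies it alone.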

Aggregation functions might be used as follows. If a job consists of several independent subtasks that run in parallel on different machines, its execution time on a resource set is determined by the subtask that ends last. If these subtasks have the same performance model that can be described by an expression named

A user can use the suffix function to constrain the resources considered when performing set matching to those within particular domains. For example, suffix(H, {"ucsd.edu", "utk.edu"}) returns true for H="torc1.cs.utk.edu" because "utk.edu" is a suffix of "torc1.cs.utk.edu".
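The behavior of suffix can be mimicked with a one-line Python function. This is a sketch of the semantics stated above, not the ClassAd implementation; the second host name in the usage lines is hypothetical.

```python
def suffix(v, suffixes):
    """Return True if any member of `suffixes` is a suffix of the string `v`."""
    return any(v.endswith(s) for s in suffixes)


domains = ["ucsd.edu", "utk.edu"]
print(suffix("torc1.cs.utk.edu", domains))    # host from the example in the text
print(suffix("host1.example.org", domains))   # hypothetical host outside both domains
```

Note that this is a plain string-suffix test, so a host such as "xutk.edu" would also pass; a stricter domain check would compare whole dot-separated labels.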

2.3 Set-Matching Algorithm

The set-matching algorithm evaluates a set-extended ClassAd request against a set of resource ClassAds and returns a resource set that has the highest rank. It comprises two phases.

In the filtering phase, individual resources are removed from consideration based on individual expressions in the request. For example, individual expressions "other.os==redhat6.1 &&

with less than 100 MB of memory. A suffix expression can also be used in this phase, as discussed above. A set-matching implementation can index ClassAds to accelerate such filtering operations.
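A minimal sketch of the filtering phase is shown below, with ClassAds modeled as dictionaries and the individual expression as a Python predicate. The function name filter_resources and the resource values are hypothetical. The memory condition in the predicate is an assumption inferred from the "less than 100 MB" remark, since the original expression is cut off in the text.

```python
MB = 2**20  # assumed unit for the memory attribute


def filter_resources(resources, individual_expr):
    """Keep only the resources for which the individual expression holds."""
    return [ad for ad in resources if individual_expr(ad)]


def individual_expr(other):
    # The os test follows the text; the memory threshold is an assumed
    # completion of the truncated expression.
    return other.get("os") == "redhat6.1" and other.get("memory", 0) >= 100 * MB


# Hypothetical resource pool.
pool = [
    {"name": "n1", "os": "redhat6.1", "memory": 512 * MB},
    {"name": "n2", "os": "redhat6.1", "memory": 64 * MB},
    {"name": "n3", "os": "solaris8", "memory": 1024 * MB},
]

remaining = filter_resources(pool, individual_expr)
print([ad["name"] for ad in remaining])
```

Only "n1" survives: "n2" has too little memory and "n3" runs a different operating system. Because each resource is tested independently, this phase is linear in the number of resources.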

CandidateSet = NULL;
BestSetFound = False;
LastRank = -∞; Rank = -∞;
while (ResourceSet > NULL) {

if (!BestSetFound) return failure
else return BestSet

Figure 2: The Set Match algorithm


In the set construction phase, the algorithm seeks to identify a resource set that best meets application requirements. As the number of possible resource sets is large (exponential in the number of resources available), it is not typically feasible to evaluate all possible combinations. Instead, we use the following greedy heuristic algorithm to construct a resource set from the resources remaining after Phase 1 filtering.

In narrative form, the algorithm repeatedly removes the “best” resource remaining in the resource pool (with “best” being determined by the rank of the resulting resource set formed) and adds it to the “candidate set.” If this “candidate set” has a higher rank than the “best set” so far, the “candidate set” becomes the new “best set.” This process stops when the set of resources in the resource pool is exhausted. The algorithm returns the “best set” that satisfies the user’s request, or failure if no such resource set is found.
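The narrative above can be turned into a short Python sketch. It is a reading of the prose, not a transcription of Figure 2: the function name greedy_set_match, the point at which the set requirements are checked, and the example rank function are all assumptions made for illustration.

```python
def greedy_set_match(resources, requirements, rank):
    """Greedy set construction: returns the best set found, or None on failure.

    `requirements(resource_set)` and `rank(resource_set)` stand in for the
    "requirements" and "rank" expressions of the request ClassAd.
    """
    pool = list(resources)          # resources remaining after filtering
    candidate = []                  # the growing "candidate set"
    best_set, best_rank = None, float("-inf")
    while pool:
        # "Best" remaining resource: the one whose addition yields the
        # highest-ranked candidate set.
        nxt = max(pool, key=lambda r: rank(candidate + [r]))
        pool.remove(nxt)
        candidate.append(nxt)
        # Assumption: only candidate sets that satisfy the request's
        # requirements may become the best set.
        if requirements(candidate):
            r = rank(candidate)
            if r > best_rank:
                best_set, best_rank = list(candidate), r
    return best_set


# Hypothetical resources (memory in GB, speed in arbitrary units).
resources = [
    {"name": "a", "memory": 4, "speed": 300},
    {"name": "b", "memory": 2, "speed": 200},
    {"name": "c", "memory": 1, "speed": 100},
    {"name": "d", "memory": 3, "speed": 250},
]


def requirements(rs):
    """Set constraint: total memory of the set must be at least 5 GB."""
    return sum(r["memory"] for r in rs) >= 5


def rank(rs):
    """Made-up rank: aggregate speed minus a fixed per-resource overhead."""
    return sum(r["speed"] for r in rs) - 180 * len(rs)


best = greedy_set_match(resources, requirements, rank)
print([r["name"] for r in best])
```

With these made-up numbers the candidate set grows as a, d, b, c; the set {a, d, b} has the highest rank among the candidates that meet the memory constraint, and adding "c" lowers the rank, so the first three resources are returned.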

This algorithm can adapt to different kinds of resource requests. It checks whether the candidate resource ClassAds fulfill the requirements expressed in the resource request and calculates the rank of the resource set based on the evaluation of the two expressions named “requirements” and “rank” in the request ClassAd. Thus, through these two expressions, the user can instruct the matching algorithm to select a resource set with particular characteristics (as long as these characteristics can be described by expressions). This algorithm can also help the user choose the ClassAd set on which an application achieves a preferred level of performance, for example, one on which the application can finish its work before a deadline.

The greedy nature of our algorithm means that it is not guaranteed to find the best solution even if one exists. The set-matching problem can be modeled as an optimization problem under some constraints. Since this problem is NP-complete in some situations, it is difficult to find a general algorithm that solves the problem efficiently, especially when the number of resources is large. Our work provides an efficient algorithm with complexity O(N^2), with rank computation as the basic operation, where N is the number of ClassAds after the filtering phase.
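The quadratic bound can be checked with a short count, assuming that each greedy step evaluates the rank once for every resource still in the pool. The step index k is introduced here only for illustration: at step k the pool holds N - k resources.

```latex
\[
\sum_{k=0}^{N-1} (N-k) \;=\; \sum_{j=1}^{N} j \;=\; \frac{N(N+1)}{2} \;=\; O(N^{2})
\]
```

By contrast, exhaustively ranking every non-empty subset would require 2^N - 1 rank computations.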

3 Resource Selection Framework

We have implemented a general-purpose resource selection framework based on the set-matching technique. It accepts user resource requests and finds a set of resources with the highest rank based on the resource information provided by the Grid Information Service. It also provides an open interface for users to specify an application-specific mapping module to customize the resource selector.

Figure 3: Architecture of Resource Selector

The Grid Information Service is provided by MDS [23] and NWS [24-26]. The Meta Directory Service (MDS) is a component of the Globus Toolkit [27]. It provides a uniform framework for discovering and
