
University of California Grid Project
Prepared by the UC Research Computing Group

I. Summary

This white paper has been developed by the UC Research Computing Group (UCRCG) in response to the directive by the Information Technology Guidance Committee's (ITGC) High-Performance Computing Work Group that a secure Computing Grid be developed to link together hardware resources and autonomous organizations at the University of California. The Grid is to provide networked resources: computing, storage, and network technology resources in support of research.

The first task is to provide a Grid infrastructure in order to expose existing computing resources to the UC research community and to facilitate the use of those resources as appropriate to existing research needs and funding. The UCRCG proposes to provide this infrastructure by:

1. Creating a Campus Grid at each University of California campus. We propose to do this by deploying the UCLA Grid Portal (UGP) software and applying the UCLA Grid architecture, which integrates computing resources into a Grid through the attachment of Grid Appliances. The use of Grid Appliances allows independently owned compute clusters to be attached to the Grid without changing the way the administrators of those clusters do business. UGP provides a single intuitive web-based interface to all of the resources in a Grid. UGP and the Grid Appliances were developed by UCLA Academic Technology Services (ATS), which has been successfully running a Grid at UCLA since June 2004. Each Campus Grid will expose independently operated campus resources to all researchers belonging to that campus.

2. Creating a UC-wide Grid called the UC Grid. The UC Grid will allow for the sharing of resources from different campuses. Every user of a Campus Grid will also be able to access the UC Grid to use multi-campus resources. The UC Grid will deploy the same UGP software as the Campus Grids, thus providing the same user interface as the Campus Grids do. It will connect to the Grid Appliances already installed on the campuses as part of the Campus Grids; additional Grid Appliances will not be required.

3. Deploying the Grid Certificate Authority (CA) for all the Campus Grids and the UC Grid at the UC Grid level. This will provide each user with a single credential that will be recognized by all the Grids, campus and UC, thus making the sharing of computing resources between campuses possible with single sign-on. Grid certificates meet the X.509 certificate standard [1]. Adoption of the Grid CA and certificates will not prevent the adoption of another UC-wide authentication service at a later date.

4. Using resource pools to provide researchers with the most appropriate compute resources and software anywhere in the UC system according to their compute requirements. This will allow each Grid Portal to manage resource allocation within its Grid in order to optimize performance and utilization.

This initial deployment will fulfill the following mandates:

 To develop a secure Grid of computing and storage resources in support of research

 To augment UC's research technology infrastructure in an interoperable fashion to facilitate the sharing of resources

 To optimize performance and utilization

 To deliver services that address researchers’ needs while encouraging behavior that benefits the common good

It will also provide easy access to a very large number of users without having to create individual user login ids for them on all of the clusters.

After deployment, parameters and codes can be optimized and tuned to maximize performance, stability, security, and resource utilization while at the same time ensuring the fastest turnaround for users.

Funding models and infrastructure at the campus level will have to be addressed in order to create the Campus Grids. Currently, different funding models for computing are in use at the different campuses. The creation of the Campus Grids will require each campus to provide:

 A Grid Administrator to install and maintain the UGP software; install and provide the Grid Appliances to the administrators of the compute resources on that campus; and provide some user services, such as adding and approving users, keeping the Grid software infrastructure up to date, etc.

 At least 3 computers and a Storage Area to act as the Campus Grid Portal. Additional computers will be required to provide load balancing when usage increases.

 One computer (the Grid Appliance) for each computational resource (usually a compute cluster) that joins the Campus Grid

In addition to leveraging short-term funding opportunities initially, funding models will have to be developed in the long term that can sustain the Grid services and the technologies behind them.

Currently the emphasis in developing the UCLA Grid Architecture and the UGP software has been on joining computational clusters together into a Grid. Extending the UC Grid concept to enable the creation of a California Grid for use within K-12 education will require that Grid services be expanded to provide other services in addition to batch computing. This will require an assessment of needs as well as the development necessary to a) connect the kinds of compute resources that meet those needs to the Grid and b) add the user interfaces for those kinds of resources to the UGP software. This will be addressed in a later phase of Grid development.


II. Grids

A Grid is a collection of independently owned and administered resources which have been joined together by a software and hardware infrastructure that interacts with the resources and the users of the resources to provide coordinated, dynamic resource sharing in a dependable and consistent way, according to policies that have been agreed to by all parties. Because of the large number of resources available on a Grid, at any given time an individual researcher can always be provided with the best resources for his/her needs, and overall resource utilization can be distributed for maximum efficiency.

In 1969 Leonard Kleinrock, the UCLA professor who was one of the founders of the ARPAnet (now the Internet), stated: "We will probably see the spread of 'computer utilities', which, like present electric and telephone utilities, will service individual homes and offices across the country." A Grid is such a "computer utility": it presents a uniform interface to resources that exist within different autonomous administrative domains.

The Globus Alliance [2], a consortium with contributing members from universities, research centers, and government agencies, conducts research and software development to provide the fundamental technologies behind the Grid. The Globus Toolkit [3] software, developed by the Globus Alliance, forms the underpinning of most Grids. This toolkit implements a command-line interface and is thus not recommended for end users because of its detailed command syntax and long learning time. The UCLA Grid Portal (UGP) software, built on top of Globus Toolkit 4.0 and GridSphere [4], uses Java portlets and AJAX technology [5] to provide an easy-to-use web-based interface for Grids.

III. The UCLA Grid Architecture and the UCLA Grid Portal (UGP) Software

UGP and the UCLA Grid Architecture bring computational clusters together into a Grid. The hardware resources making up the Grid consist entirely of computational clusters, each of which consists of a head node, compute nodes, storage nodes, network resources, software, and data resources. Individual computational clusters can be quite large, containing hundreds of nodes.

By incorporating the concepts of pooled resources and Pool Users, UGP facilitates the sharing of resources among users. Administrative overhead is reduced because there is no longer a need to add individual user login ids on multiple clusters.

UGP:

 Provides a single through-the-web interface to all of the clusters in a Grid. This interface hides user interface and scheduler differences among clusters and makes it easy to work with multiple clusters at once.


 Provides a single login for users. A user logs into the Grid Portal, not into each of the individual clusters that the user will use.

 Provides resources both to users who have login ids on individual clusters (Cluster Users) and to users who do not (Pool-Only Users). Any person with campus affiliation can easily gain access to resources throughout the Grid by becoming a Pool-Only User.

 Is secure to the extent possible with up-to-date technology. Clusters can sit behind firewalls if that is their policy. A Grid Appliance is open only to the cluster to which it is attached and to the Grid Portal (see the sketch after this list). Proxy certificates are used for authentication at every step of the way (between Grid Portal and Grid Appliances). Users never handle their certificates.
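To make the appliance's restricted exposure concrete, here is a minimal sketch of a source-address allowlist in Python. Everything in it (the addresses, the port, and the check itself) is hypothetical; the real appliances enforce this at the network and firewall level, not in application code like this.

```python
import socket

# Hypothetical placeholder addresses: the only two machines a Grid
# Appliance talks to are the Grid Portal and its own cluster head node.
ALLOWED_ADDRS = {"192.0.2.10", "192.0.2.20"}   # portal, cluster head

def handle_request(conn: socket.socket) -> None:
    """Placeholder for the real service handling on the appliance."""
    conn.sendall(b"OK\n")
    conn.close()

def accept_loop(port: int = 2119) -> None:
    """Accept connections, refusing any peer not on the allowlist."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", port))
    srv.listen()
    while True:
        conn, (peer_addr, _) = srv.accept()
        if peer_addr not in ALLOWED_ADDRS:
            conn.close()          # unknown peer: drop the connection
        else:
            handle_request(conn)  # portal or cluster head: serve it
```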

At the same time as UGP presents a uniform appearance to users, the UCLA Grid Architecture provides for a Grid made up of diverse computing environments (hardware, operating systems, job schedulers) and autonomous administrative domains. Local organizations own and maintain control of the resources involved, and local administrative procedures and security solutions take precedence. Each of the participating clusters is attached to the Grid Portal via a Grid Appliance, provided by the organization that administers the Grid and maintained by the Grid administrator, which serves as the gateway between the Grid Portal and that cluster. The addition of a Grid Appliance to a cluster in no way modifies policy decisions at the cluster level. Any participating cluster can also be used directly by users who log in to the cluster head node, without going through the Grid Portal.

A. Architecture

The UCLA Grid Architecture is depicted in Figure 1.


Figure 1. UCLA Grid Architecture

In Figure 1 a user connects, via a web browser, to a Grid Portal. Three additional machines are joined to the Grid Portal to provide 1) storage for user certificates, 2) storage space for user files, and 3) through-the-web visualization of users' data. Two computational clusters are depicted at the right side of Figure 1. Each cluster consists of compute nodes and a head node. (In the absence of a Grid Portal, users normally log on to a cluster via its head node.) The Grid Appliance, which acts like an alternative head node (and submission host for the job scheduler) for the Grid Portal only, connects the Grid Portal to the cluster. Users' home directories from the cluster it is attached to must be cross-mounted on the Grid Appliance. Both the Grid Portal and the Grid Appliances run the Globus Toolkit (which has been customized on the Appliances). The Grid Portal additionally runs the Apache Tomcat web server, MySQL [6], GridSphere, and the UGP software (by ATS).

B. User Types on the Campus Grids

Two types of users are supported by UGP:

 Cluster Users – A cluster user has a login id on one or more of the clusters participating in the Grid. A cluster user can get this login id by being a member of a research group that owns one of the clusters. Someone with computational needs can normally also apply for a login id on any cluster that is provided as a campus service.

o Cluster users have home directories on each of the clusters they can access. They use their home directories to store files.

o Cluster users can use the Grid Portal to access files on and submit jobs to the clusters they have access to.

o Cluster users can also submit jobs to resource pools as Pool Users.

 Pool-Only Users – Students, staff, and faculty members who do not have login ids on any of the clusters can easily sign up on the Grid Portal to be Pool-Only Users.

o Each Pool-Only User is assigned a storage area on the Storage Server connected to the Grid Portal.

o The Pool-Only User can submit jobs to resource pools.

C. The Resource Pool

Clusters which have contractual obligations with their granting agencies (NSF, NIH, etc.) to provide a fixed percentage of their compute cycles to the campus can share those cycles with the campus community by joining the campus resource pool. Clusters that are provided solely as campus resources can also join the resource pool. Clusters contribute both cycles and applications to the resource pool, and a cluster administrator can determine which of the applications available on that cluster to contribute to the pool. (The Grid administrator does not take any responsibility for application updates or maintenance on individual clusters. That is the responsibility of each cluster administrator.)

Pooled resources are available for use by anyone who can log in to the Grid Portal. Currently pooled resources run applications only. When a user submits a pool job, UGP selects the cluster which will give that job the best turnaround from among the clusters that contribute the requested application to the pool.
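The selection step can be pictured with a short Python sketch. The cluster fields and the turnaround estimate below are invented stand-ins: the paper does not publish UGP's actual scoring formula, so this only illustrates the shape of the decision.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    pooled_apps: set[str]   # applications this cluster contributes to the pool
    queued_jobs: int        # current scheduler backlog
    free_cores: int         # idle cores available to pool jobs

def estimated_turnaround(c: Cluster, cores_needed: int) -> float:
    """Hypothetical score: prefer idle capacity, penalize backlog."""
    if c.free_cores >= cores_needed:
        return c.queued_jobs          # can start soon; backlog is the wait
    return 1000 + c.queued_jobs       # must queue; large penalty

def select_cluster(clusters: list[Cluster], app: str, cores: int) -> Cluster:
    """Pick the pool cluster offering `app` with the best expected turnaround."""
    candidates = [c for c in clusters if app in c.pooled_apps]
    if not candidates:
        raise LookupError(f"no pool cluster contributes {app!r}")
    return min(candidates, key=lambda c: estimated_turnaround(c, cores))

# Example with placeholder cluster names; only two contribute "amber".
pool = [
    Cluster("hoffman", {"amber", "matlab"}, queued_jobs=12, free_cores=0),
    Cluster("shaver",  {"amber"},           queued_jobs=2,  free_cores=64),
    Cluster("keck",    {"gaussian"},        queued_jobs=0,  free_cores=128),
]
print(select_cluster(pool, "amber", cores=16).name)   # -> "shaver"
```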

D. Services provided by UGP

Services currently provided by UGP include:

 Resources – Allows one to see at a glance the status of all the clusters. Both summarized status information and detailed status information are provided.

 Data Manager – Allows one to:

o List and manage files on the clusters and the Storage Server, including all services normally provided for files in compute environments: create, remove, move, copy, change permissions, create directory, compress/uncompress, etc.

o View and edit text files, view image files, visualize computational results

o Copy files and directories between a cluster or the Storage Server and the user’s local machine (upload/download)

o Copy files and directories between clusters or between a cluster and the Storage Server

 Job Services – Allows one to submit a job and view the job status and results. Special application services provide easy access to all popular applications. Cluster users can submit jobs to specific clusters. All users can submit jobs to the resource pool. When a job is submitted to the resource pool, UGP selects the cluster to run it using a best-fit algorithm and stages all the input files to that cluster from any accessible cluster or Storage Server. Once the job has completed, it is the user's responsibility to transfer the output files from the cluster on which the job ran to a more permanent location. (A schematic sketch of this workflow follows this list.)

 Other Grids – Provides Data Manager and Job Services for clusters that are not part of the Grid connected to the Grid Portal the user is using, but which are part of other Grids that are open and not behind firewalls. The MyProxy Server [7] for the other Grid must also be available to UGP. Currently service is provided to several clusters that are part of the TeraGrid. To use a cluster on another Grid, a user must enter his/her certificate username/passphrase on that Grid into a form provided by UGP. UGP then retrieves the user proxy certificate from a MyProxy server on that other Grid and uses that proxy certificate to access the requested outside cluster.

 Grid Development Environment – Provides a development environment for the editing and compilation of C/C++ and Fortran source codes on the clusters.
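As promised above, the following sketch walks a pool job through end to end. The stage-in, submit, and poll functions are placeholders, not real Globus Toolkit calls; only the ordering (select a cluster, stage inputs, submit, poll, leave output retrieval to the user) reflects the behavior described in this section.

```python
import time

# Placeholder operations standing in for the real staging and scheduler
# calls, which this paper does not specify.
def stage_in(path: str, cluster: str) -> None:
    print(f"staging {path} -> {cluster}")

def submit(cluster: str, app: str, cores: int) -> str:
    print(f"submitting {app} ({cores} cores) on {cluster}")
    return "job-0001"

_states = iter(["PENDING", "RUNNING", "DONE"])   # canned status sequence
def poll(cluster: str, job_id: str) -> str:
    return next(_states)

def run_pool_job(cluster: str, app: str, inputs: list[str], cores: int) -> str:
    """Stage inputs, submit, and wait; fetching outputs is the user's job."""
    for path in inputs:
        stage_in(path, cluster)       # inputs come from any accessible source
    job_id = submit(cluster, app, cores)
    while (state := poll(cluster, job_id)) not in ("DONE", "FAILED"):
        time.sleep(0)                 # portal polls the scheduler periodically
    return state

print(run_pool_job("shaver", "amber", ["input.prmtop", "input.inpcrd"], cores=16))
```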

IV. Expanding the UCLA Grid Architecture to Encompass the University of California

The diagram in Figure 2 is a simplified version of the architecture shown in Figure 1. In it, a box labeled "C" represents an entire cluster and a box labeled "A" represents a Grid Appliance. The ION Visualization Server is not shown because not all campuses may have an ION Visualization Server. The campus Grid Portal includes a CA (Certificate Authority) for the Grid. This is the way the Grid Portal at UCLA is currently configured. With the advent of the UC Grid, the Grid Portal at each campus will no longer include a CA, as the CA for all of the University of California Grids, both Campus Grids and the UC Grid, will be at the UC Grid Portal.

Figure 2. Single-Campus Architecture

Figure 3 depicts the multi-campus Grid architecture for the University of California. This figure depicts the Campus Grids for three campuses and the UC Grid. The Campus Grid shown for each campus is identical to the one shown in Figure 2 except that a CA is not included at the campus level. A single CA for the Grid is included as part of the UC Grid, and a special service, the UC Register Service, has been added to the UC Grid Portal.


Note also that each Grid Appliance must be open to both the Campus Grid Portal and the UC Grid Portal.

Figure 3. Multi-Campus Architecture for the University of California

This design allows:

 Each user of a Campus Grid to also use the UC Grid Portal

 Each Cluster User to access the Campus Grid Portal of each campus whose Grid includes clusters on which that user has a login id

 The UC Grid Portal to access every cluster that belongs to each of the Campus Grids, i.e., every cluster that participates in a Campus Grid also participates in the UC Grid

 Clusters at the campus level to contribute both cycles and applications to the UC resource pool in addition to the campus resource pool of the local campus. The clusters that contribute cycles to the UC resource pool, and the applications they contribute to that pool, do not have to be the same as the ones that contribute to the Campus resource pool. Contributing to the resource pools is not a requirement for a cluster to join the Grid.

When a cluster administrator wants to join his/her cluster to the Campus Grid, he/she must also join that cluster to the UC Grid. This is a requirement.


A. Grid Certificate Authority, Grid Certificates, and MyProxy Servers

The Globus Toolkit uses public-key cryptography. The UC Grid Portal has a Simple Certificate Authority for the Grid (Grid CA) associated with it. When a user requests a username in order to access one of the Campus Grids in the UC system, that user will be issued a certificate signed by the UC Grid's CA. The certificate consists of two parts, the public key and the private key. With the UCLA Grid Architecture, these are never returned to the user. Instead, the certificate is automatically digitally signed by the CA, and the public and private keys are stored in two MyProxy servers, one at the UC Grid Portal and the other at the Campus Grid Portal. The digital signature of the CA guarantees that the certificate has not been tampered with. The user never handles the certificate and may not even know that a certificate exists.
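As a concrete illustration of the issuance step, here is a minimal sketch using the Python `cryptography` package: a CA key signs a user's public key into an X.509 certificate. The names and validity periods are placeholders, and this is not the SimpleCA implementation that the Globus Toolkit actually ships.

```python
from datetime import datetime, timedelta
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

# CA key pair (in practice created once and kept at the UC Grid Portal).
ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
ca_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "UC Grid CA")])

# User key pair, generated when the user is approved; the user never sees it.
user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
user_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "jdoe")])

# The CA signs the user's public key into a long-lived X.509 certificate.
user_cert = (
    x509.CertificateBuilder()
    .subject_name(user_name)
    .issuer_name(ca_name)
    .public_key(user_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.utcnow())
    .not_valid_after(datetime.utcnow() + timedelta(days=365))
    .sign(ca_key, hashes.SHA256())   # CA signature: proves the cert is untampered
)
```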

To use a Grid Portal, a user must log in by providing his/her username and passphrase. This provides access to the user's private key. Once the UC Grid has been set up, when a user logs into the UC Grid Portal, that portal will look up the user in its MyProxy Server; when a user logs into a Campus Grid Portal, that Grid Portal will look up the user in its own MyProxy Server. If for some reason its MyProxy Server is unavailable or the user is not found there, the Campus Grid Portal can look for the user in the MyProxy Server belonging to the UC Grid. Once the user has been validated, UGP will retrieve a proxy certificate for the user from the MyProxy Server. The proxy certificate has a limited lifespan, normally one day. The Grid Portal uses that proxy certificate on the user's behalf every time it contacts one of the clusters, via its Grid Appliance, to perform a service for that user. The proxy certificate is destroyed once the user logs out.
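Continuing the sketch above (it reuses `user_name`, `user_key`, and the same imports), a proxy certificate is just another certificate, signed by the user's long-lived credential rather than by the CA, with a validity window of roughly a day. This is an illustration of the concept, not MyProxy's actual protocol.

```python
# Short-lived proxy credential, derived from the user's long-lived one.
proxy_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
proxy_cert = (
    x509.CertificateBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "jdoe proxy")]))
    .issuer_name(user_name)                     # issued by the user credential,
    .public_key(proxy_key.public_key())         # not directly by the CA
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.utcnow())
    .not_valid_after(datetime.utcnow() + timedelta(days=1))  # ~one-day lifespan
    .sign(user_key, hashes.SHA256())
)

def proxy_is_valid(cert: x509.Certificate) -> bool:
    """A proxy is usable only inside its validity window."""
    now = datetime.utcnow()
    return cert.not_valid_before <= now <= cert.not_valid_after
```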

B. User Types on the UC Grid

The UC Grid will have two types of users:

 Cluster Users – A Cluster User is a user who has a login id on at least one cluster on at least one campus

 Pool-Only Users – There is no need to assign a storage area on the Storage Server connected to the UC Grid Portal; the UC Grid Portal can access the user’s files that are on the Storage Server at the user’s local Campus Grid Portal

Use of the UC Grid Portal is the best choice for Cluster Users with access to clusters on different campuses, as all clusters UC-wide that the user can access will be accessible from the UC Grid Portal. Use of the UC Grid Portal will be advantageous for Pool-Only Users only to the extent that the UC Portal can solicit cluster administrators UC-wide to contribute to its resource pool. The UC Portal is the only Grid Portal from which users can submit jobs to the UC resource pool.

C. Workflow to add a User


The workflow required to add a user to the Grid always begins at the Campus Grid Portal, because it is at the Campus Grid level where the user has the strongest affiliation and is known. The workflow, depicted in Figures 4 and 5, always results in a user who has been added to both his/her Campus Grid Portal and the UC Grid Portal.

Figure 4. Workflow to add a User, Part 1
