University of California Grid Project
Prepared by the UC Research Computing Group
I. Summary
This white paper has been developed by the UC Research Computing Group (UCRCG) in response to the directive by the Information Technology Guidance Committee’s (ITGC) High-Performance Computing Work Group that a secure Computing Grid be developed to link together hardware resources and autonomous organizations at the University of California. The Grid is to provide networked resources (computing, storage, and network technology) in support of research.
The first task is to provide a Grid infrastructure in order to expose existing computing resources to the UC research community and to facilitate the use of those resources as appropriate to existing research needs and funding. The UCRCG proposes to provide this infrastructure by:
1. Creating a Campus Grid at each University of California campus. We propose to do this by deploying the UCLA Grid Portal (UGP) software and applying the UCLA Grid architecture, which integrates computing resources into a Grid through the attachment of Grid Appliances. The use of Grid Appliances allows independently owned compute clusters to be attached to the Grid without changing the way the administrators of those clusters do business. UGP provides a single, intuitive web-based interface to all of the resources in a Grid. UGP and the Grid Appliances were developed by UCLA Academic Technology Services (ATS), which has been successfully running a Grid at UCLA since June 2004. Each Campus Grid will expose independently operated campus resources to all researchers belonging to that campus.
2. Creating a UC-wide Grid called the UC Grid. The UC Grid will allow for the sharing of resources from different campuses. Every user of a Campus Grid will also be able to access the UC Grid to use multi-campus resources. The UC Grid will deploy the same UGP software as the Campus Grids, thus providing the same user interface as the Campus Grids do. It will connect to the Grid Appliances already installed on the campuses as part of the Campus Grids; additional Grid Appliances will not be required.
3. Deploying the Grid Certificate Authority (CA) for all the Campus Grids and the UC Grid at the UC Grid level. This will provide each user with a single credential recognized by all the Grids, campus and UC, making it possible to share computing resources between campuses with single sign-on. Grid certificates meet the X.509 certificate standard [1]; a minimal example of reading such a certificate appears after this list. Adoption of the Grid CA and certificates will not prevent the adoption of another UC-wide authentication service at a later date.
4. Using resource pools to provide researchers with the most appropriate compute resources and software anywhere in the UC system according to their compute requirements. This will allow each Grid Portal to manage resource allocation within its Grid in order to optimize performance and utilization.
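As a concrete illustration of item 3, the following minimal sketch reads the fields of an X.509 certificate using Python’s third-party cryptography package. The file name is hypothetical and the snippet is purely illustrative; it is not part of UGP.

```python
# Minimal sketch: inspecting an X.509 certificate, the standard [1] that
# Grid certificates follow. "usercert.pem" is a hypothetical file path.
from cryptography import x509

with open("usercert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

print("Subject:", cert.subject.rfc4514_string())  # the holder's identity (DN)
print("Issuer: ", cert.issuer.rfc4514_string())   # the CA that signed it
print("Valid:  ", cert.not_valid_before, "to", cert.not_valid_after)
```

The subject and issuer distinguished names are what a Grid maps to a user and a trusted CA, respectively; single sign-on works because every Grid in the UC system trusts the same issuer.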
This initial deployment will fulfill the following mandates:
• To develop a secure Grid of computing and storage resources in support of research
• To augment UC’s research technology infrastructure in an interoperable fashion to facilitate the sharing of resources
• To optimize performance and utilization
• To deliver services that address researchers’ needs while encouraging behavior that benefits the common good
It will also provide easy access for a very large number of users without having to create individual user login ids for them on all of the clusters.
After deployment, parameters and codes can be optimized and tuned to maximize performance, stability, security, and resource utilization while at the same time ensuring the fastest turnaround for users.
Funding models and infrastructure at the campus level will have to be addressed in order to create the Campus Grids. Currently, different funding models for computing are in use at the different campuses. The creation of the Campus Grids will require each campus to provide:
• A Grid Administrator to install and maintain the UGP software; install and provide the Grid Appliances to the administrators of the compute resources on that campus; and provide some user services, such as adding and approving users, keeping the Grid software infrastructure up to date, etc.
• At least three computers and a Storage Area to act as the Campus Grid Portal. Additional computers will be required to provide load balancing when usage increases.
• One computer (the Grid Appliance) for each computational resource (usually a compute cluster) that joins the Campus Grid.
In addition to leveraging short-term funding opportunities initially, funding models will have to be developed in the long term that can sustain the Grid services and the technologies behind them.
Currently, the emphasis in developing the UCLA Grid Architecture and the UGP software has been on joining computational clusters together into a Grid. Extending the UC Grid concept to enable the creation of a California Grid for use within K-12 education will require that Grid services be expanded to provide other services in addition to batch computing. This will require an assessment of needs as well as the development necessary to a) connect the kinds of compute resources that meet those needs into the Grid and b) add the user interfaces for those kinds of resources into the UGP software. This will be addressed in a later phase of Grid development.
II. Grids
A Grid is a collection of independently owned and administered resources that have been joined together by a software and hardware infrastructure which interacts with the resources and their users to provide coordinated, dynamic resource sharing in a dependable and consistent way, according to policies that have been agreed to by all parties. Because of the large number of resources available on a Grid, an individual researcher can at any given time be provided with the best resources for his/her needs, and overall resource utilization can be distributed for maximum efficiency.
In 1969 Leonard Kleinrock, the UCLA professor who was one of the founders of the ARPAnet (now the Internet), stated: "We will probably see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country." A Grid is such a "computer utility": it presents a uniform interface to resources that exist within different autonomous administrative domains.
The Globus Alliance [2], a consortium with contributing members from universities, research centers, and government agencies, conducts research and develops software to provide the fundamental technologies behind the Grid. The Globus Toolkit [3], developed by the Globus Alliance, forms the underpinning of most Grids. The toolkit implements a command-line interface and is thus not recommended for end users because of its detailed command syntax and long learning time. The UCLA Grid Portal (UGP) software, built on top of Globus Toolkit 4.0 and GridSphere [4], uses Java portlets and AJAX technology [5] to provide an easy-to-use web-based interface for Grids.
III. The UCLA Grid Architecture and the UCLA Grid Portal (UGP) Software
UGP and the UCLA Grid Architecture bring computational clusters together into a Grid. The hardware resources making up the Grid consist entirely of computational clusters, each of which comprises a head node, compute nodes, storage nodes, network resources, software, and data resources. Individual computational clusters can be quite large, containing hundreds of nodes.
By incorporating the concepts of pooled resources and Pool Users, UGP facilitates the sharing of resources among users. Administrative overhead is reduced because there is no longer a need to add individual user login ids on multiple clusters.
UGP:
• Provides a single through-the-web interface to all of the clusters in a Grid. This interface hides user-interface and scheduler differences among clusters and makes it easy to work with multiple clusters at once.
• Provides a single login for users. A user logs into the Grid Portal, not into each of the individual clusters that the user will use.
• Provides resources both to users who have login ids on individual clusters (Cluster Users) and to users who do not (Pool-Only Users). Any person with a campus affiliation can easily gain access to resources throughout the Grid by becoming a Pool-Only User.
• Is as secure as up-to-date technology allows. Clusters can sit behind firewalls if that is their policy. A Grid Appliance is open only to the cluster to which it is attached and to the Grid Portal. Proxy certificates are used for authentication at every step of the way (between the Grid Portal and the Grid Appliances). Users never handle their certificates.
While UGP presents a uniform appearance to users, the UCLA Grid Architecture provides for a Grid made up of diverse computing environments (hardware, operating systems, job schedulers) and autonomous administrative domains. Local organizations own and maintain control of the resources involved, and local administrative procedures and security solutions take precedence. Each of the participating clusters is attached to the Grid Portal via a Grid Appliance, provided by the organization that administers the Grid and maintained by the Grid administrator, which serves as the gateway between the Grid Portal and that cluster. The addition of a Grid Appliance to a cluster in no way modifies policy decisions at the cluster level. Any participating cluster can always also be used directly by users who log in to the cluster head node, without going through the Grid Portal.
A. Architecture
The UCLA Grid Architecture is depicted in Figure 1.
Figure 1. UCLA Grid Architecture
In Figure 1 a user connects, via a web browser, to a Grid Portal. Three additional machines are joined to the Grid Portal to provide 1) storage for user certificates, 2) storage space for user files, and 3) through-the-web visualization of users’ data. Two computational clusters are depicted at the right side of Figure 1. Each cluster consists of compute nodes and a head node. (In the absence of a Grid Portal, users normally log on to a cluster via its head node.) The Grid Appliance, which acts like an alternative head node (and submission host for the job scheduler) for the Grid Portal only, connects the Grid Portal to the cluster. Users’ home directories from the cluster it is attached to must be cross-mounted on the Grid Appliance. Both the Grid Portal and the Grid Appliances run the Globus Toolkit (which has been customized on the Appliances). The Grid Portal additionally runs the Apache Tomcat web server, MySQL [6], GridSphere, and the UGP software (by ATS).
B. User Types on the Campus Grids
Two types of users are supported by UGP:
• Cluster Users – A cluster user has a login id on one or more of the clusters participating in the Grid. A cluster user can get this login id by being a member of a research group that owns one of the clusters. Someone with computational needs can normally also apply for a login id on any cluster that is provided as a campus service.
o Cluster users have home directories on each of the clusters they can access. They use their home directories to store files.
o Cluster users can use the Grid Portal to access files on, and submit jobs to, the clusters they have access to.
o Cluster users can also submit jobs to resource pools as Pool Users.
• Pool-Only Users – Students, staff, and faculty members who do not have login ids on any of the clusters can easily sign up on the Grid Portal to be Pool-Only Users.
o Each Pool-Only User is assigned a storage area on the Storage Server connected to the Grid Portal.
o The Pool-Only User can submit jobs to resource pools.
C. The Resource Pool
Clusters that have contractual obligations with their granting agencies (NSF, NIH, etc.) to provide a fixed percentage of their compute cycles to the campus can share those cycles with the campus community by joining the campus resource pool. Clusters that are provided solely as campus resources can also join the resource pool. Clusters contribute both cycles and applications to the resource pool, and a cluster administrator can determine which of the applications available on that cluster to contribute to the pool. (The Grid administrator does not take any responsibility for application updates or maintenance on individual clusters; that is the responsibility of each cluster administrator.)
Pooled resources are available for use by anyone who can log in to the Grid Portal. Currently, pooled resources run applications only. When a user submits a pool job, UGP selects, from among the clusters contributing the requested application to the pool, the cluster that will give that job the best turnaround.
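The selection step just described can be pictured with a short sketch. The cluster fields and the turnaround estimate below are simplifying assumptions for illustration; this paper does not specify UGP’s actual best-fit algorithm.

```python
# Illustrative sketch of pool-job cluster selection: among the clusters
# contributing the requested application, pick the best estimated turnaround.
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    pool_apps: set        # applications this cluster contributes to the pool
    free_cpus: int        # idle CPUs right now
    queued_jobs: int      # jobs already waiting in the scheduler

def estimated_wait(c, cpus_needed):
    # Crude estimate: no wait if enough idle CPUs, otherwise a wait
    # proportional to the existing backlog.
    return 0 if c.free_cpus >= cpus_needed else c.queued_jobs + 1

def pick_cluster(clusters, app, cpus):
    candidates = [c for c in clusters if app in c.pool_apps]
    if not candidates:
        raise LookupError(f"no pool cluster contributes {app!r}")
    return min(candidates, key=lambda c: estimated_wait(c, cpus))

pool = [Cluster("chem", {"amber"}, 0, 12), Cluster("phys", {"amber"}, 16, 3)]
print(pick_cluster(pool, "amber", 8).name)   # -> "phys"
```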
D. Services provided by UGP
Services currently provided by UGP include:
• Resources – Allows one to see at a glance the status of all the clusters. Both summarized and detailed status information is provided.
• Data Manager – Allows one to:
o List and manage files on the clusters and the Storage Server, including all operations normally provided for files in compute environments: create, remove, move, copy, change permissions, create directory, compress/uncompress, etc.
o View and edit text files, view image files, and visualize computational results.
o Copy files and directories between a cluster or the Storage Server and the user’s local machine (upload/download).
o Copy files and directories between clusters, or between a cluster and the Storage Server.
• Job Services – Allows one to submit a job and view the job’s status and results. Special application services provide easy access to all popular applications. Cluster users can submit jobs to specific clusters; all users can submit jobs to the resource pool. When a job is submitted to the resource pool, UGP selects the cluster to run it using a best-fit algorithm and stages all the input files to that cluster from any accessible cluster or Storage Server (see the sketch after this list). Once the job has completed, it is the user’s responsibility to transfer the output files from the cluster on which the job ran to a more permanent location.
• Other Grids – Provides Data Manager and Job Services for clusters that are not part of the Grid connected to the Grid Portal the user is using, but that are part of other Grids that are open and not behind firewalls. The MyProxy Server [7] for the other Grid must also be available to UGP. Currently, service is provided to several clusters that are part of the TeraGrid. To use a cluster on another Grid, a user must enter his/her certificate username/passphrase for that Grid into a form provided by UGP. UGP then retrieves the user’s proxy certificate from a MyProxy server on that other Grid and uses that proxy certificate to access the requested outside cluster.
• Grid Development Environment – Provides a development environment for editing and compiling C/C++ and Fortran source code on the clusters.
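To tie the Job Services description together, here is a self-contained sketch of the pool-job flow: best-fit selection, input staging, submission, and outputs left on the cluster for the user to move. All names and data structures are illustrative assumptions, not UGP’s API.

```python
# Illustrative pool-job flow: select a cluster, stage inputs, submit.
def submit_pool_job(app, input_files, clusters):
    # 1. Best-fit selection among clusters contributing the application
    #    (cf. the sketch in section III.C).
    cluster = min((c for c in clusters if app in c["apps"]),
                  key=lambda c: c["queued"])
    # 2. Stage every input file to the chosen cluster (a print stands in
    #    for the real file transfer).
    for path in input_files:
        print(f"staging {path} -> {cluster['name']}")
    # 3. Submit; afterwards the output files remain on the cluster, and
    #    moving them somewhere permanent is the user's responsibility.
    print(f"submitted {app} to {cluster['name']}")

clusters = [{"name": "clusterA", "apps": {"blast"}, "queued": 5},
            {"name": "clusterB", "apps": {"blast"}, "queued": 1}]
submit_pool_job("blast", ["query.fa"], clusters)   # chooses clusterB
```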
IV. Expanding the UCLA Grid Architecture to Encompass the University of California
The diagram in Figure 2 is a simplified version of the architecture shown in Figure 1. In it, a box labeled “C” represents an entire cluster and a box labeled “A” represents a Grid Appliance. The ION Visualization Server is not shown because not all campuses may have one. The campus Grid Portal includes a CA (Certificate Authority) for the Grid; this is the way the Grid Portal at UCLA is currently configured. With the advent of the UC Grid, the Grid Portal at each campus will no longer include a CA, as the CA for all of the University of California Grids, both Campus Grids and the UC Grid, will be at the UC Grid Portal.
Figure 2. Single-Campus Architecture
Figure 3 depicts the multi-campus Grid architecture for the University of California, showing the Campus Grids for three campuses and the UC Grid. The Campus Grid shown for each campus is identical to the one shown in Figure 2, except that a CA is not included at the campus level. A single CA for the Grid is included as part of the UC Grid, and a special service, the UC Register Service, has been added to the UC Grid Portal.
Note also that each Grid Appliance must be open to both the Campus Grid Portal and the UC Grid Portal.
Figure 3. Multi-Campus Architecture for the University of California
This design allows:
• Each user of a Campus Grid to also use the UC Grid Portal.
• Each Cluster User to access the Campus Grid Portal of each campus whose Grid includes clusters on which that user has a login id.
• The UC Grid Portal to access every cluster that belongs to each of the Campus Grids; i.e., every cluster that participates in a Campus Grid also participates in the UC Grid.
• Clusters at the campus level to contribute both cycles and applications to the UC resource pool in addition to the campus resource pool of the local campus. The clusters that contribute cycles to the UC resource pool, and the applications they contribute to that pool, do not have to be the same as those that contribute to the campus resource pool. Contributing to the resource pools is not a requirement for a cluster to join the Grid.
When a cluster administrator wants to join his/her cluster to the Campus Grid, he/she must also join it to the UC Grid; this is a requirement.
A. Grid Certificate Authority, Grid Certificates, and MyProxy Servers
The Globus Toolkit uses public-key cryptography. The UC Grid Portal has a Simple Certificate Authority for the Grid (Grid CA) associated with it. When a user requests a username in order to access one of the Campus Grids in the UC system, that user will be issued a certificate signed by the UC Grid’s CA. The certificate consists of two parts, the public key and the private key. With the UCLA Grid Architecture, these are never returned to the user. Instead, the certificate is automatically digitally signed by the CA, and the public and private keys are stored in two MyProxy servers, one at the UC Grid Portal and the other at the Campus Grid Portal. The digital signature of the CA guarantees that the certificate has not been tampered with. The user never handles the certificate and may not even know that a certificate exists.
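The following minimal sketch, using Python’s cryptography package, shows what it means for the CA to sign a user’s certificate. A production Grid CA (such as the Globus SimpleCA) adds extensions and policy checks not shown here, and all names below are illustrative.

```python
# Minimal sketch of CA-signed certificate issuance; illustrative only.
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

ca_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "UC Grid CA")])
user_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "jdoe")])

now = datetime.datetime.utcnow()
user_cert = (
    x509.CertificateBuilder()
    .subject_name(user_name)               # the user's identity
    .issuer_name(ca_name)                  # issued by the Grid CA
    .public_key(user_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365))
    .sign(ca_key, hashes.SHA256())         # the CA's digital signature
)
print(user_cert.subject.rfc4514_string(), "signed by",
      user_cert.issuer.rfc4514_string())
```

Verifying the CA’s signature over the certificate is what guarantees that the certificate has not been tampered with.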
To use a Grid Portal, a user must log in by providing his/her username and passphrase. This provides access to the user’s private key. Once the UC Grid has been set up, when a user logs into the UC Grid Portal, that portal will look up the user in its MyProxy Server; when a user logs into a Campus Grid Portal, that Grid Portal will look up the user in its own MyProxy Server. If for some reason its MyProxy Server is unavailable or the user is not found there, the Campus Grid Portal can look for the user in the MyProxy Server belonging to the UC Grid. Once the user has been validated, UGP will retrieve a proxy certificate for the user from the MyProxy Server. The proxy certificate has a limited lifespan, normally one day. The Grid Portal uses that proxy certificate on the user’s behalf every time it contacts one of the clusters, via its Grid Appliance, to perform a service for that user. The proxy certificate is destroyed once the user logs out.
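The lookup order just described, campus MyProxy Server first with fallback to the UC Grid’s server, can be summarized in a short sketch. The class and method names and the toy credential below are illustrative assumptions, not the MyProxy protocol or UGP code.

```python
# Toy sketch of login-time proxy retrieval with campus-first fallback.
PROXY_LIFETIME_HOURS = 24   # proxy certificates normally live about one day

class MyProxyServer:
    """Stand-in for a real MyProxy server: a username -> credential store."""
    def __init__(self, name, users):
        self.name, self.users = name, users

    def get_proxy(self, username, passphrase):
        if username not in self.users:
            raise KeyError(username)   # user not stored on this server
        # A real server verifies the passphrase and signs a short-lived
        # proxy with the stored private key; we return a toy token.
        return f"proxy({username}, {PROXY_LIFETIME_HOURS}h, from {self.name})"

def retrieve_proxy(username, passphrase, campus_server, uc_server):
    for server in (campus_server, uc_server):   # campus first, UC fallback
        try:
            return server.get_proxy(username, passphrase)
        except KeyError:
            continue
    raise PermissionError(f"{username} unknown to both MyProxy Servers")

campus = MyProxyServer("campus-myproxy", {"alice"})
uc = MyProxyServer("uc-myproxy", {"alice", "bob"})
print(retrieve_proxy("bob", "secret", campus, uc))   # found via UC fallback
```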
B. User Types on the UC Grid
The UC Grid will have two types of users:
• Cluster Users – A Cluster User is a user who has a login id on at least one cluster at one or more campuses.
• Pool-Only Users – There is no need to assign a storage area on the Storage Server connected to the UC Grid Portal; the UC Grid Portal can access the user’s files that are on the Storage Server at the user’s local Campus Grid Portal.
Use of the UC Grid Portal is the best choice for Cluster Users with access to clusters on different campuses, since all clusters UC-wide that the user can access will be accessible from the UC Grid Portal. Use of the UC Grid Portal will be advantageous for Pool-Only Users only to the extent that the UC Portal can solicit cluster administrators UC-wide to contribute to its resource pool. The UC Portal is the only Grid Portal from which users can submit jobs to the UC resource pool.
C. Workflow to add a User
The workflow required to add a user to the Grid always begins at the Campus Grid Portal, because it is at the campus level that the user has the strongest affiliation and is known. The workflow, depicted in Figures 4 and 5, always results in a user who has been added to both his/her Campus Grid Portal and the UC Grid Portal.
Figure 4. Workflow to add a User, Part 1