Building Grid computing portals:
the NPACI Grid portal toolkit
Mary P Thomas and John R Boisseau
The University of Texas at Austin, Austin, Texas, United States
28.1 INTRODUCTION
In this chapter, we discuss the development, architecture, and functionality of the National Partnership for Advanced Computational Infrastructure (NPACI) Grid Portals project. The emphasis of this paper is on the NPACI Grid Portal Toolkit (GridPort); we also discuss several Grid portals built using GridPort, including the NPACI HotPage. We discuss the lessons learned in developing this toolkit and the portals built from it, and finally we present our current and planned development activities for enhancing GridPort and thereby the capabilities, flexibility, and ease of development of portals built using GridPort.
28.1.1 What are Grid computing portals?
Web-based Grid computing portals, or Grid portals [1], have been established as effective tools for providing users of computational Grids with simple, intuitive interfaces for accessing Grid information and for using Grid resources [2]. Grid portals are now being
Grid Computing – Making the Global Infrastructure a Reality. Edited by F Berman, A Hey and G Fox
2003 John Wiley & Sons, Ltd ISBN: 0-470-85319-0
developed, deployed, and used on large Grids including the National Science Foundation (NSF) Partnership for Advanced Computational Infrastructure (PACI) TeraGrid, the NASA Information Power Grid, and the National Institutes of Health (NIH) Biomedical Informatics Research Network. Grid middleware such as the Globus Toolkit provides powerful capabilities for integrating a wide variety of computing and storage resources, instruments, and sensors, but Grid middleware packages generally have complex user interfaces (UIs) and Application Programming Interfaces (APIs). Grid portals make these distributed, heterogeneous compute and data Grid environments more accessible to users and scientists by utilizing common Web and UI conventions. Grid portals, and other Web-based portals, provide developers and users with the capabilities to customize the content and presentation (e.g. page layout, level of detail) for the set of tools and services presented. Grid portals can enable automated execution of specific applications, provide explicit links to discipline-specific data collections, integrate (and hide) data workflow between applications, and automate the creation of collections of application output files. Portals can also provide a window to the underlying execution environment, reporting the availability of resources, the status of executing jobs, and the current load on the Grid resources.
The software used to build Grid portals must interact with the middleware running on Grid resources, and in some cases it must provide missing functionality when the Grid middleware is not available for a specific resource or lacks capabilities needed by the Grid portal. The portal software must also be compatible with common Web servers and browsers/clients. Several generalized Grid portal toolkits have emerged that help simplify the portal developer’s task of utilizing the complex Grid technologies used for Grid services and making them available via a familiar Web interface. With the advent of Web services, interoperable protocols and standards are now being developed for Grid information and other Grid services. Web services will further simplify the use of Grids and Grid technologies and will encourage the use and deployment of more general Grid applications on the Grid. As Grid portal toolkits and the underlying Grid technologies mature and as Web services standards become more common, Grid portals will become easier to develop, deploy, maintain, and use.
Software for creating Grid portals must integrate a wide variety of other software and hardware systems. Thus, portals represent an integration environment. This is part of the unique role that portals play at the Grid middleware layer: portals drive the integration of ‘lower’ middleware packages and enforce the integration of other toolkits. Furthermore, projects such as GridPort [3], the Grid Portal Development Toolkit [4], and the Common Component Architecture project [5] have demonstrated that integrated toolkits can be developed that meet the generalized needs of Grid applications as well as Web-based Grid portals.
28.1.2 History and motivation
The NPACI, led by the San Diego Supercomputer Center (SDSC), was initiated in 1997 by the NSF PACI program [6]. NPACI is charged with developing, deploying, and supporting an advanced computational infrastructure – hardware, software, and support – to enable the next generation of computational science. NPACI resources include diverse high performance computing (HPC) architectures and storage systems at SDSC and at university partners. US academic researchers may apply for accounts on multiple resources at multiple sites, so NPACI must enable users to utilize this distributed collection of resources effectively.
The NPACI HotPage was developed to help facilitate and support usage of the NPACI resources. The HotPage initially served as an informational portal for HPC users, especially those with accounts on multiple NPACI systems [7, 8]. The World Wide Web had recently become established as a popular method for making information available over the Internet, so the HotPage was developed as a Web portal to provide information about the HPC and the archival storage resources operated by the NPACI partner sites (SDSC, the University of Texas at Austin, the University of Michigan, Caltech, and UC-Berkeley).
As an informational service, the HotPage provided users with centralized access to technical documentation for each system. However, the HotPage also presented dynamic informational data for each system, including current operational status, loads, and status of queued jobs. The integration of this information into a single portal presented NPACI users with data to make decisions on where to submit their jobs. However, the goals of the HotPage included not only the provision of information but also the capability to use all NPACI resources interactively via a single, integrated Web portal. Grid computing technologies, which were supported as part of the NPACI program, were utilized to provide these functionalities. In 1999, a second version of the HotPage was developed that used the Globus Toolkit [9]. Globus capabilities such as the Grid Security Infrastructure (GSI) and the Globus Resource Allocation Manager (GRAM) enabled the HotPage to provide users with real-time, secure access to NPACI resources. HotPage capabilities were added to allow users to manage their files and data and to submit and delete jobs. This version of the HotPage has been further enhanced and is in production for NPACI and for the entire PACI program. Versions of the HotPage are in use at many other universities and government laboratories around the world.
The mission of the HotPage project has always been to provide a Web portal that would present an integrated appearance and set of services to NPACI users: a user portal. This has been accomplished using many custom scripts and more recently by using Grid technologies such as Globus. The HotPage is still relatively ‘low level’, however, in that it enables NPACI users to manipulate files in each of their system accounts directly and to launch jobs on specific systems. It was apparent during the development of the HotPage that there was growing interest in developing higher-level application portals that launched specific applications on predetermined resources. These application portals trade low-level control of jobs, files, and data for an even simpler UI, making it possible for non-HPC users to take advantage of HPC systems as ‘virtual laboratories’. Much of the functionality required to build these higher-level application portals had already been developed for the HotPage. Therefore, the subset of software developed for HotPage account management and resource usage functions was abstracted and generalized into GridPort. GridPort was then enhanced to support multiple application portals on a single Grid with a single-login environment. The usefulness of this system has been successfully demonstrated with the implementation and development of several production application portals.
The driving philosophy behind the design of the HotPage and GridPort is the conviction that many potential Grid users and developers will benefit from portals and portal technologies that provide universal, easy access to resource information and usage while requiring minimal work by Grid developers. Users of GridPort-based portals are not required to perform any software downloads or configuration changes; they can use the Grid resources and services via common Web browsers. Developers of Grid portals can avoid many of the complexities of the APIs of Grid middleware by using GridPort and similar toolkits. Design decisions were thus guided by the desire to provide a generalized infrastructure that is accessible to and usable by the computational science community. If every Grid portal developer or user were required to install Web technologies, portal software, and Grid middleware in order to build and use portals, there would be a tremendous duplication of effort and unnecessary complexity in the resulting network of connected systems.
GridPort attempts to address these issues by meeting several key design goals:
• Universal access: enables Web-based portals that can run anywhere and at any time, that do not require software downloads, plug-ins, or helper applications, and that work with ‘old’ Web browsers that do not support recent technologies (e.g. client-side XML).
• Dependable information services: provides portals, and therefore users, with centralized access to comprehensive, accurate information about Grid resources.
• Common Grid technologies and standards: minimizes impact on already burdened resource administrators by not requiring a proprietary GridPort daemon on HPC resources.
• Scalable and flexible infrastructure: facilitates adding and removing application portals, Grid software systems, compute and archival resources, services, jobs, users, and so on.
• Security: uses GSI, supports HTTPS/SSL (Secure Sockets Layer) encryption at all layers, provides access control, and cleans all secure data off the system as soon as possible.
• Single login: requires only a single login for easy access to and navigation between Grid resources.
• Technology transfer: develops a toolkit that portal developers can easily download, install, and use to build portals.
• Global Grid Forum standards: adheres to accepted standards, conventions, and best practices.
• Support for distributed client applications and portal services: enables scientists to build their own application portals and use existing portals for common infrastructure services.
Adhering to these design goals and using the lessons learned from building several production Grid portals resulted in a Grid portal toolkit that is generalized and scalable. The GridPort project has met all of the goals listed above with the exception of the last one. A Web services–based architecture, in which clients host Grid application portals on local systems and access distributed Grid Web services, will address the last design goal. As these Grid portal toolkits continue to evolve, they will enable developers, and even users, to construct more general Grid applications that use the Grid resources and services.
28.1.3 Grid portal users and developers
The Grid is just beginning the transition to deployment in production environments, so there are relatively few Grid users at this time. As Grids move into production, users will require much assistance in trying to develop applications that utilize them.
In the NPACI HPC/science environment, there are three general classes of potential Grid users. First, there are end users who only run prepackaged applications, most commonly launched in Unix shell windows on the HPC resources. Adapting these applications to the Grid by adding a simple Web interface to supply application-specific parameters and execution configuration data is straightforward. Therefore, this group will be the easiest to transition to using Grids instead of individual resources, but most of the work falls on the Grid portal developers. These users generally know little about the HPC systems they currently use – just enough to load input data sets, start the codes, and collect output data files. This group includes users of community models and applications (e.g. GAMESS, NASTRAN, etc.). Many of the users in this group may never know (or care) anything about HPC or how parallel computers are running their code to accomplish their science. For them, the application code is a virtual laboratory. Additionally, there exists a large constituency that is absent from the HPC world because they find even this modest level of HPC knowledge to be intimidating and/or too time-consuming. However, with an intuitive, capable application portal, these researchers would not need to be exposed to the HPC systems or Grid services in order to run their application. Effective Grid portals can provide both novice and potential HPC users with simple, effective mechanisms to utilize HPC systems transparently to achieve their research goals.
The second group consists of researchers who are more experienced HPC users and who often have accounts on multiple HPC systems. Most current NPACI users fall into this category. For them, building HPC applications is challenging though tractable, but as scientists they prefer conducting simulations with production applications to developing new code. While this group will accept the challenges inherent in building parallel computing applications in order to solve their scientific problems, they are similar to the first group: their first interest is in solving scientific problems. For this group, a user portal like the HotPage is ideal: it provides information on all the individual systems on which they have accounts. It allows users to learn how to use each system, observe which systems are available, and make an assessment of which system is the best for running their next simulations. While they already have experience of using Unix commands and the commands native to each HPC system on which they have ported their codes, the HotPage allows them to submit jobs, archive and retrieve data, and manage files on any of these systems from a single location. For these users, a user portal like the HotPage cannot replace their use of the command line environment of each individual system during periods of code development and tuning, but it can augment their usage of the systems for production runs in support of their research.
The third group of HPC users in our science environment includes the researchers who are computational experts and invest heavily in evaluating and utilizing the latest computing technologies. This group is often at least as focused on computing technologies as on the applications science. This group has programmers who are ‘early adopters’, and so in some cases have already begun investigating Grid technologies. Users in this group may benefit from a user portal, but they are more likely to build a Grid application using their base HPC application and integrating Grid components and services directly. In addition, there are also Grid developers who develop portals, libraries, and applications. For these Grid users and for Grid developers, a Grid application toolkit is ideal: something like GridPort, but enhanced to provide greater flexibility for applications in general, not just portals.
28.2 THE GRID PORTAL TOOLKIT (GRIDPORT)
GridPort has been the portal software toolkit used for the PACI and NPACI HotPage user portals and for various application portals since 1999. It was developed to support NPACI scientific computing objectives by providing centralized services such as secure access to distributed Grid resources, account management, large file transfers, and job management via Web-based portals that operate on the NSF computational Grid.
Implementation of these services requires the integration and deployment of a large and diverse number of Web and Grid software programs and services, each with different client and server software and APIs. Furthermore, the Web and the Grid are continually evolving, which makes this task not only challenging but also one that requires constant adaptation to new standards and tools. GridPort evolved out of the need to simplify the integration of these services and technologies for portal developers. As additional application portals for the NPACI user community were constructed, the need emerged for an architecture that would provide a single, uniform API to these technologies and an environment that would support multiple application portals. GridPort was designed to provide a common shared instance of a toolkit and its associated services and data (such as user account information, session data, and other information), and to act as a mediator between client requests and Grid services.
28.2.1 GridPort architecture
The GridPort design is based on a multilayered architecture. On top there exists a client layer (e.g. Web browsers) and beneath it is a portal layer (the actual portals that format the content for the client). On the bottom is a backend services layer that connects to distributed resources via Grid technologies such as Globus. GridPort is a portal services layer that mediates between the backend and the portal layers (see Figure 28.1). GridPort modularizes each of the steps required to translate the portal requests into Grid service function calls, for example, a GRAM submission. In the context of the architecture of Foster et al. [10] that describes the ‘Grid anatomy’, GridPort represents an API that supports applications (portals) and interfaces to the Collective and Resources layers. In a sense, a producer/consumer model exists between each layer. Each layer represents a logical part of the system in which data and service requests flow back and forth and addresses some specific aspect or function of the GridPort portal system. For example, GridPort consumes Grid services from a variety of providers (e.g. via the Globus/GRAM Gatekeeper); as a service provider, GridPort has multiple clients such as the HotPage and other application portals that use the same instance of GridPort to submit a job. We describe each of the layers and their functions below.
Figure 28.1 GridPort multilayered architecture diagram (labels include the NBCR GAMESS and NPACI Telescience application portals; portal services for authentication, file/data management, job management, and information services; Grid services such as MyProxy, GSI, GSI-FTP, SRB, and Globus GRAM; and resources at NPACI, the Alliance, PSC, and NASA/IPG). On top is the client layer and beneath it is a portal layer. On the bottom is the backend services layer that connects to distributed resources via Grid technologies. GridPort is the portal services layer that mediates between the backend and portal layers. GridPort modularizes each of the steps required to translate the portal requests into Grid service function calls, for example, a GRAM submission.
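The mediation role of the portal services layer can be illustrated with a minimal Python sketch. This is not GridPort code: the class names `BackendService`, `PortalServices`, and `ApplicationPortal`, the resource name `sp2`, and the echoed result string are all hypothetical, chosen only to show two portals sharing one services instance that forwards requests to a backend.

```python
from dataclasses import dataclass, field


class BackendService:
    """Hypothetical stand-in for a Grid service such as a GRAM gatekeeper."""

    def __init__(self, name):
        self.name = name

    def submit(self, user, executable):
        # A real backend would contact the remote resource; this only echoes.
        return f"{self.name}:{user}:{executable}"


@dataclass
class PortalServices:
    """Portal services layer: one shared instance mediates all requests."""

    backends: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def register(self, resource, backend):
        self.backends[resource] = backend

    def submit_job(self, portal, user, resource, executable):
        if resource not in self.backends:
            raise KeyError(f"unknown resource: {resource}")
        self.log.append((portal, user, resource))
        return self.backends[resource].submit(user, executable)


class ApplicationPortal:
    """Portal layer: formats content for the client and delegates the rest."""

    def __init__(self, name, services):
        self.name = name
        self.services = services

    def handle(self, user, resource, executable):
        result = self.services.submit_job(self.name, user, resource, executable)
        return f"<p>Submitted: {result}</p>"


# Two portals consume the same services instance (producer/consumer model).
services = PortalServices()
services.register("sp2", BackendService("gram"))
hotpage = ApplicationPortal("hotpage", services)
gamess = ApplicationPortal("gamess", services)
page = hotpage.handle("alice", "sp2", "a.out")
gamess.handle("bob", "sp2", "gamess.x")
```

In this sketch the portals never talk to the backend directly, which is the property that lets the services layer hold shared state such as accounts and sessions in one place.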
Client layer: The client layer represents the consumers of Grid computing portals, typically Web browsers, Personal Digital Assistants (PDAs), or even applications capable of pulling data from a Web server. Typically, clients interact with the portal via HTML form elements and use HTTPS to submit requests. Owing to limitations in client-level security solutions, application portals running at institutions other than the one hosting the GridPort instance are not currently supported. This limitation can now be addressed, however, owing to the advent of Web services that are capable of proxy delegation and forwarding. The issue is discussed in further detail in Section 28.5.
Portals layer: The portals layer consists of the portal-specific code itself. Application portals run on standard Web servers and process the client requests and the responses to those requests. One instance of GridPort can support multiple concurrent application portals, but they must exist on the same Web server system, where they share the same instance of the GridPort libraries. This allows the application portals to share portal-related user and account data and thereby makes possible a single-login environment. These portals can also share libraries, file space, and other services. Application portals based on GridPort are discussed in more detail in Section 28.3.
Portal services layer: GridPort and other portal toolkits or libraries reside at the portal services layer. GridPort performs common services for application portals including management of session state, portal accounts, and file collections and monitoring of Grid information services (GIS) such as the Globus Metacomputing Directory Service (MDS). GridPort provides portals with tools to implement both informational and interactive services as described in Section 28.1.
Grid services (technologies) layer: The Grid services layer consists of those software components and services that are needed to handle requests being submitted by software to the portal services layer. Wherever possible, GridPort employs simple, reusable middleware technologies, such as Globus/GRAM Gatekeeper, used to run interactive jobs and tasks on remote resources [9]; Globus GSI and MyProxy, used for security and authentication [11]; Globus GridFTP and the SDSC Storage Resource Broker (SRB), used for distributed file collection and management [12]; and GIS based primarily on proprietary GridPort information provider scripts and the Globus MDS 2.1 – Grid Resource Information System (GRIS). Resources running any of the above can be added to the set of Grid resources supported by GridPort by incorporating the data about the system into GridPort’s configuration files.
Resources layer: GridPort-hosted portals can be connected to any system defined in the local configuration files, but interactive capabilities are only provided for GSI-enabled systems. For example, on the PACI HotPage [13], the following computational systems are supported: multiple IBM SPs and SGI Origins, a Compaq ES-45 cluster, an IBM Regatta cluster, a Sun HPC10000, a Cray T3E, a Cray SV1, an HP V2500, an Intel Linux Cluster, and a Condor Flock. Additionally, via the GSI infrastructure it is possible to access file archival systems running software compatible with the Public Key Infrastructure/Grid Security Infrastructure (PKI/GSI) certificate system, such as GridFTP or SRB. GridPort supports resources located across organizational and geographical locations, such as NASA/IPG, NPACI, and the Alliance.
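As an illustration of this configuration-driven approach, the sketch below models a resource registry in Python. The `Resource` fields, the host names, and both helper functions are hypothetical and do not reflect the actual format of GridPort's configuration files; the only behavior taken from the text is that any configured system can be listed, while interactive services are offered only on GSI-enabled systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    host: str
    gsi_enabled: bool


# Hypothetical entries standing in for GridPort's configuration files.
CONFIGURED = [
    Resource("sp-cluster", "sp.example.org", gsi_enabled=True),
    Resource("t3e", "t3e.example.org", gsi_enabled=True),
    Resource("legacy-box", "legacy.example.org", gsi_enabled=False),
]


def informational_resources(resources):
    """Every configured system can be listed for informational services."""
    return [r.name for r in resources]


def interactive_resources(resources):
    """Only GSI-enabled systems are offered for interactive services."""
    return [r.name for r in resources if r.gsi_enabled]
```

Adding a system then amounts to appending one entry to the configuration, with no change to the portal code that consumes these lists.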
GridPort is available for download from the project Website [8]. The latest version of GridPort has been rewritten as a set of Perl packages to improve modularity and is based on a model similar to the Globus Commodity Grid project that also provides CoG Kits in Java, Python, and CORBA [14, 15].
28.2.2 GridPort capabilities
GridPort provides the following capabilities for portals.
Portal accounts: All portal users must have a portal account and a valid PKI/GSI certificate. (Note that these accounts are not the same as the individual accounts a user needs on the resources.) The portal manages the user’s account and keeps track of sessions, user preferences, and portal file space.
Authentication: Users may authenticate to GridPort portals using either of two mechanisms: by authenticating against certificate data stored in the GridPort repository or by using a MyProxy server. GridPort portals can accept certificates from several sites; for example, NPACI GridPort portals accept certificates from the Alliance, NASA/IPG, Cactus, and Globus as well as NPACI. Once a user is logged in to a GridPort portal and has been authenticated, the user has access to any other GridPort-hosted portal that is part of the single-login environment. Thus, a portal user can use any Grid resource with the same permissions and privileges as if they had logged in to each directly, but need only authenticate once, through the portal.
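The two authentication mechanisms can be sketched as follows. This is a toy model under stated assumptions, not GridPort's implementation: `repository` and `myproxy` are plain dictionaries standing in for the certificate repository and a MyProxy server, trying the repository first is our assumption (the text only says either mechanism may be used), and the plain passphrase comparison is for illustration only.

```python
# Issuers named in the text as accepted by NPACI GridPort portals.
TRUSTED_ISSUERS = {"NPACI", "Alliance", "NASA/IPG", "Cactus", "Globus"}


class AuthError(Exception):
    """Raised when no acceptable credential can be produced."""


def authenticate(user, passphrase, repository, myproxy):
    """Return a toy proxy record from the repository or, failing that, MyProxy.

    Both sources map user -> (passphrase, issuer). The lookup order is an
    assumption made for this sketch.
    """
    for source_name, source in (("repository", repository), ("myproxy", myproxy)):
        entry = source.get(user)
        if entry is None:
            continue
        stored_passphrase, issuer = entry
        if stored_passphrase != passphrase:
            raise AuthError("bad passphrase")
        if issuer not in TRUSTED_ISSUERS:
            raise AuthError(f"untrusted issuer: {issuer}")
        return {"user": user, "issuer": issuer, "via": source_name}
    raise AuthError("no credential found")
```

The returned record plays the role of the proxy credential that the portal would subsequently present on the user's behalf.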
Jobs and command execution: Remote tasks are executed via the Globus/GRAM Gatekeeper, including compiling and running programs, performing process and job submission and deletion, and viewing job status and history. Additionally, users may execute simple Unix-type commands such as mkdir, ls, rmdir, cd, and pwd.
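One plausible way for a portal to restrict users to such simple commands is an allow-list check before anything is forwarded to a remote resource. The sketch below is an assumption on our part: the text lists example commands but does not say how GridPort validates them, and `parse_portal_command` is a hypothetical helper.

```python
import shlex

# The example commands named in the text; treating them as an allow-list
# is an assumption made for this sketch.
ALLOWED_COMMANDS = {"mkdir", "ls", "rmdir", "cd", "pwd"}


def parse_portal_command(line):
    """Split a user-supplied command line and reject anything not allowed.

    Returning an argument list (rather than a string) lets the caller run
    the command without a shell interpreting metacharacters.
    """
    argv = shlex.split(line)
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not permitted: {argv[0]}")
    return argv
```

For example, `parse_portal_command("ls -l 'my dir'")` yields a three-element argument list, while a line beginning with `rm` is rejected.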
Data, file, and collection management: GridPort provides file and directory access to compute and archival resources and to portal file space. Using GSI-FTP (file transfer protocol) and SRB commands, GridPort enables file transfer between the local workstation and the remote Grid resources as well as file transfers between remote resources. GridPort also supports file and directory access to a user’s file space on the Grid resource.
Information services: GridPort provides a set of information services that includes the status of all systems, node-level information, batch queue load, and batch queue job summary data. Data is acquired via the MDS 2.0 GIIS/GRIS or via custom information scripts for those systems without MDS 2.0 installed. Portals can use the GridPort GIS to gather information about resources on the Grid for display to the users (as in the HotPage portal) or use the data to influence decisions about jobs that the portal will execute.
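A minimal sketch of how a portal might use such information to influence job placement is given below. The function names, the status dictionary shape (`up`, `load`), and the selection rule (lowest load among operational systems) are hypothetical; the text states only that data comes from MDS where installed and from custom scripts otherwise.

```python
def gather_status(resources, mds_query, script_query):
    """Collect status per resource, using MDS where available, else a script.

    `resources` maps resource name -> whether MDS is installed there.
    """
    status = {}
    for name, has_mds in resources.items():
        query = mds_query if has_mds else script_query
        status[name] = query(name)
    return status


def least_loaded(status):
    """Pick the operational resource with the lowest load, or None."""
    up = {name: s for name, s in status.items() if s["up"]}
    if not up:
        return None
    return min(up, key=lambda name: up[name]["load"])
```

A portal could call `least_loaded` on cached status data to suggest a default system in a job submission form, leaving the final choice to the user.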
28.2.3 GridPort security
Secure access at all layers is essential and is one of the more complex issues to address. All connections require secure paths (SSL, HTTPS) between all layers and systems involved, including connections between the client’s local host and the Web server and between the Web server and the Grid resources that GridPort is accessing. Security between the client and the Web server is handled via SSL using an RC4-40 128-bit key, and all connections to backend resources are SSL-encrypted and use GSI-enabled software wherever required. GridPort does not use an authorization service such as Akenti [16], but there are plans to integrate such services in future versions.
The portal account creation process requires the user to supply the portal with a digital GSI certificate from a known Certificate Authority (CA). Once the user has presented this credential, the user will be allowed to use the portal with the digital identity contained within the certificate.
Logging into the portal is based on the Globus model and can be accomplished in one of two ways. The Web server may obtain a user ID and a passphrase and attempt to create a proxy certificate using globus-proxy-init based on certificate data stored in a local repository. Alternatively, the proxy may be generated from a proxy retrieved on behalf of the user from a MyProxy server.
If the proxy-init is successful, session state is maintained between the client and the Web server with a session cookie that is associated with a session file. The cookie's access restrictions are set so that it is readable by any portal hosted on the same domain. As a result, the user is now successfully authenticated for any portal that is hosted using the same domain, allowing GridPort to support a single-login environment. Session files and sensitive data including user proxies are stored in a restricted access repository on the Web server. The repository directory structure is located outside Web server file space and has user and group permissions set such that no user except the Web server daemon may access these files and directories. Furthermore, none of the files in these directories are allowed to be executable, and the Web server daemon may not access files outside these directories. The session file also contains a time stamp that GridPort uses to expire user login sessions that have been inactive for a set period of time.
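The time stamp mechanism can be sketched in a few lines of Python. The 30-minute limit, the dictionary-based session record, and the function names are hypothetical; the text says only that sessions inactive for a set period are expired and, elsewhere, that secure data should be cleaned off the system as soon as possible.

```python
import time

SESSION_TIMEOUT_SECONDS = 30 * 60  # hypothetical inactivity limit


def is_session_active(last_access, now=None, timeout=SESSION_TIMEOUT_SECONDS):
    """Return True if the session was used within the timeout window."""
    now = time.time() if now is None else now
    return (now - last_access) <= timeout


def touch_session(session, now=None):
    """Refresh the time stamp of an active session or expire an idle one."""
    now = time.time() if now is None else now
    if not is_session_active(session["last_access"], now):
        session["expired"] = True
        session.pop("proxy", None)  # drop the sensitive proxy data
        return False
    session["last_access"] = now
    return True
```

Each authenticated request would call `touch_session` before doing any work, so an idle session is invalidated on its next use rather than by a background task.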
Grid task execution is accomplished using the GSI model: when a portal uses GridPort to make a request on behalf of the user, GridPort presents the user’s credentials to the Grid resource, which decides, on the basis of the local security model, whether the request will be honored or denied. Thus, the portal acts as a proxy for executing requests on behalf of the user (on resources that the user is authorized to access) on the basis of the credentials presented by the user who created the portal account. Therefore, portal users have the same level of access to a particular resource through the portal as they would if they logged into the resource directly.
28.3 GRIDPORT PORTALS
As the number of portals using GridPort increased, the complexity of supporting them also increased. The redesign of GridPort as a shared set of components supporting centralized services (e.g. common portal accounts and common file space) benefited both developers and users: it became much simpler for developers to support multiple application portals because code, files, and user accounts were shared across all portals, while users only had to sign in once to a single account in order to gain access to all portals.
Figure 28.2 depicts the GridPort approach used for implementing portals. The diagram shows the relationships between multiple application portals residing on the same machine and accessing the same instance of GridPort. In this example, all the application portals are hosted on the *.npaci.edu domain. The Web servers for these URLs and the application portal software reside on the same physical machine and have access to the shared GridPort file space. Each portal also has its own file space containing specialized scripts and data. The portal developer incorporates the GridPort libraries directly into the portal code, making subroutine calls to GridPort software in order to access the functionality that GridPort provides. The application portal software is responsible for handling the HTTP/CGI (Common Gateway Interface) request, parsing the data, formatting the request, and invoking GridPort when appropriate. Furthermore, since GridPort is open source, the application portal developers at a given domain may modify GridPort to suit their needs. GridPort intentionally decouples the handling of HTTP/CGI data so that clients can make GridPort requests using other languages (e.g. Java, PHP, etc.).
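The division of labor between portal code and toolkit can be sketched as below. GridPort itself is a set of Perl packages; this Python version is only an analogy, and `gridport_list_files`, the `action` and `path` parameter names, and the returned file names are hypothetical. The portal parses the CGI-style request and formats the HTML, while the toolkit call knows nothing about HTTP.

```python
import html
from urllib.parse import parse_qs


def gridport_list_files(user, path):
    """Hypothetical stand-in for a toolkit library call (no HTTP knowledge)."""
    return [f"{path}/input.dat", f"{path}/output.log"]


def handle_request(query_string, session_user):
    """Parse CGI-style parameters, call the toolkit, and format HTML."""
    params = parse_qs(query_string)
    action = params.get("action", [""])[0]
    if action != "ls":
        return "<p>Unsupported action</p>"
    path = params.get("path", ["."])[0]
    files = gridport_list_files(session_user, path)
    # Escape values before embedding them in markup.
    items = "".join(f"<li>{html.escape(name)}</li>" for name in files)
    return f"<ul>{items}</ul>"
```

Because the toolkit function takes plain arguments and returns plain data, the same call could be made from a non-CGI client, which is the decoupling the text describes.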
Figure 28.2 GridPort multiportal architecture diagram showing the method in which multiple portals (the HotPage, GAMESS, LAPK, and Telescience portals, each with its own CGI scripts and data) can be installed on the same Web server machine (*.npaci.edu). In this design, each portal has its own file space and shares the same instance of the GridPort modules as well as common portal account and authorization information. All access to the Grid is done through functions provided by GridPort and each request is authenticated against the account data.
Each portal request is first checked by authentication mechanisms built into GridPort; if the user is logged in, then the request will be passed on to the correct Grid service. Results are passed back to the requesting application portal, which has the responsibility of formatting and presenting results to the user.
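This request flow amounts to an authentication check followed by a dispatch to the requested service. The sketch below uses hypothetical names throughout (`dispatch`, the request dictionary keys, and the `sessions` and `services` tables) and returns raw results, leaving formatting to the calling portal as the text describes.

```python
def dispatch(request, sessions, services):
    """Check the login state, then route the request to a service handler.

    `sessions` maps session id -> user; `services` maps service name ->
    callable(user, **args). Both tables are illustrative assumptions.
    """
    user = sessions.get(request["session_id"])
    if user is None:
        return {"ok": False, "error": "not logged in"}
    handler = services.get(request["service"])
    if handler is None:
        return {"ok": False, "error": "unknown service"}
    # Results go back unformatted; presentation is the portal's job.
    return {"ok": True, "result": handler(user, **request.get("args", {}))}
```

A portal would inspect the `ok` flag and either render the result or redirect the user to its login page.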
There are several types of portals based on GridPort [17], examples of which include user portals, such as the NPACI and PACI HotPages, in which the user interacts directly with the HPC resources; community model application portals, such as the Laboratory for Applied Pharmacokinetics (LAPK) portal [18], which hide the fact that a Grid or an HPC resource is being used; remote instrument application portals, such as the UCSD Telescience portal, in which users control remote equipment and migrate data across institutional boundaries [19]; and systems of portals such as those being constructed by the NSF-funded National Biomedical Computation Resource (NBCR) program [the Protein Data Bank (PDB) Combinatorial Extension (CE) portal, the GAMESS and Amber portals, and others] that share data to achieve a common research objective [20, 21]. In addition, there are a number of remote sites that have ported GridPort and installed local versions of the HotPage [22–25].
Each of these portal types demonstrates different motivations that computational scientists have for using computational Grid portals. Descriptions of examples of these portal types are given below. These examples illustrate some of the features and capabilities of GridPort. They also reveal some of the issues encountered in constructing Grid portals owing to limitations of GridPort and the Grid services employed by GridPort or to the fact that some necessary services have not yet been created. These limitations are summarized in Section 28.4 and plans to address them are discussed in Section 28.5.
Grid computing portals are popular and easy to use because Web interfaces are now common and well understood. Because they are Web-based, GridPort portals are accessible wherever there is a Web browser, regardless of the user's location. GridPort only requires that portals employ simple, commodity Web technologies, although more complex mechanisms for requesting and formatting data may be used if desired. For example, portals can be built using a minimal set of Web technologies including HTML 3.0, JavaScript 1.1, and the HTTPS protocol (Netscape Communicator or Microsoft Internet Explorer versions 4.0 or later), but GridPort portals may also use technologies such as DHTML, Java, Python, XML, and so on.
On the server side, common Web server software should be adequate to host GridPort-based portals. To date, GridPort has been demonstrated to work on Netscape Enterprise Server running on Solaris and Apache running on Linux, but it should run on any Web server system that supports CGI and HTTPS [26].
28.3.1 HotPage user portal
The HotPage user portal is designed to be a single point of access to all Grid resources that are represented by the portal for both informational and interactive services and has a unique identity within the Grid community. The layout of the HotPage allows a user to view information about these resources from either the Grid or the individual resource perspectives in order to quickly determine system-wide information such as operational status, computational load, and available computational resources (see Figure 28.3). The interactive portion of the HotPage includes a file navigation interface and Unix shell-like tools for editing remote files, submitting jobs, archiving and retrieving files, and selecting multiple files for various common Unix operations. With the introduction of personalization in a new beta version of the HotPage, users can now choose which systems they wish to have presented for both informational and interactive services.
The informational services provide a user-oriented interface to NPACI resources and services. These services provide on-line documentation, static informational pages, and dynamic information, including
• summary data of all resources on a single page,
• basic user information such as documentation, training, and news,
• real-time information for each machine, including operational status and utilization of all resources,
• summaries of machine status, load, and batch queues,
• displays of currently executing and queued jobs,
• graphical map of running applications mapped to nodes,
• batch script templates for each resource
Figure 28.3 Screenshot of the NPACI HotPage during an interactive session (user is logged in). In this example, the user is preparing to submit an MPI job to the Blue Horizon queue. Results of the job request will be displayed in a pop-up window.
Published as part of the NPACI Portal GIS, much of this data is generated via specialized custom scripts that are installed into the Globus Grid Resource Information Services (GRIS running on MDS 2.0 on each resource) and are also published via the HotPage. The GIS service model allows us to share data with or to access data from other portal or Grid systems such as the Alliance/NCSA or the Cactus Grids [27, 28].
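To make the idea of a custom information script concrete, the sketch below formats machine status as LDIF text, the plain-text interchange format used by LDAP directories. It assumes that a script of this kind reports its data by writing LDIF entries to standard output; the distinguished name and attribute names (`hotpage-status`, `hotpage-load`, `hotpage-queued-jobs`) are invented for illustration and are not the NPACI or Globus schema. The original scripts are not shown in this chapter, so this is a Python stand-in rather than a reproduction.

```python
def to_ldif(dn, attributes):
    """Render one directory entry as LDIF text.

    `attributes` is a list of (name, value) pairs; a value may be a single
    item or a list of items for multi-valued attributes. This sketch does not
    base64-encode values, so it only suits simple printable strings.
    """
    lines = [f"dn: {dn}"]
    for name, value in attributes:
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            lines.append(f"{name}: {item}")
    return "\n".join(lines) + "\n"


# Hypothetical status record for one machine (all values are made up).
entry = to_ldif(
    "cn=machine-status, hn=host.example.org, o=example",
    [
        ("objectclass", ["top", "hotpageMachineStatus"]),
        ("hotpage-status", "up"),
        ("hotpage-load", 0.73),
        ("hotpage-queued-jobs", 12),
    ],
)
print(entry)
```

Because the output is ordinary directory data, the same entry could in principle be read both by the portal and by other Grid systems that query the information service.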
The HotPage interactive interface enables users to
• view directory listings and contents on all computational and storage systems on which they have accounts, including hyperlinked navigation through directories and display of text files;
• build batch job scripts to be submitted to the queues on the computational resources;
• execute interactive processes on the computational resources;
• perform file management operations on remote files (e.g. tar/untar, gzip/gunzip) and move files to and from the local host, HPC resources, archival storage resources, and SRB collections;
• manage NPACI accounts
Each of the capabilities listed above is provided using Perl scripts that employ GridPort modules and can be extended by the portal application developer. Thus, the HotPage software consists of static HTML pages, server-side includes to simplify dynamic construction of other HTML pages, and Perl scripts to actually build the dynamic pages.
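As an illustration of the batch-script capability listed above, the following sketch fills a per-resource template with values a user might enter in a Web form. It is written in Python for brevity, whereas the HotPage scripts are Perl; the template text, the comment-style directives, and the field names are all hypothetical and do not reproduce the syntax of any particular batch scheduler.

```python
from string import Template

# Hypothetical per-resource template; a real one would use the directive
# syntax of that resource's batch system.
BATCH_TEMPLATE = Template("""#!/bin/sh
# job_name: $job_name
# queue: $queue
# nodes: $nodes
# wall_clock_limit: $wall_clock
cd $workdir
$command
""")


def build_batch_script(**fields):
    """Fill the template; raises KeyError if a required field is missing."""
    return BATCH_TEMPLATE.substitute(**fields)


script = build_batch_script(
    job_name="mpi_test",
    queue="normal",
    nodes=4,
    wall_clock="00:30:00",
    workdir="/work/alice",
    command="./a.out",
)
print(script)
```

Using `substitute` rather than `safe_substitute` is deliberate: a missing form field fails loudly instead of producing a script with an unfilled placeholder.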
28.3.2 Laboratory for Applied Pharmacokinetics modeling portal
GridPort is being used to create a portal for medical doctors who run a drug dosage modeling package developed as a result of a collaboration between the Laboratory for