
The Internet Encyclopedia volume, part 9


DOCUMENT INFORMATION

Basic information

Title: Web Search Technology
Authors: Meng, Yuwono, Lee, Callan, Yu
School: University of Information Technology
Major: Web Search Technology
Type: Thesis
Year of publication: 2002
City: Ho Chi Minh City
Pages: 98
Size: 1.95 MB


Yu WL040/Bidgolio-Vol I WL040-Sample.cls June 20, 2003 17:52 Char Count= 0

WEB SEARCH TECHNOLOGY

document selector is to utilize the fact that most search engines return retrieved results in groups. Usually, only the top 10 to 20 results are returned in the first result page, but the user can make additional requests for more result pages and more results. Hence, a document selector may ask each search engine to return the first few result pages. This method tends to return the same number of pages from each selected search engine. Since different search engines may contain different numbers of useful pages for a given query, retrieving the same number of pages from each search engine is likely to cause over-retrieval from less useful databases and under-retrieval from highly useful databases.

More elaborate document selection methods try to tie the number of pages to retrieve from a search engine to the ranking score (or the rank) of the search engine relative to the ranking scores (or ranks) of other search engines. This can lead to proportionally more pages to be retrieved from search engines that are ranked higher or have higher ranking scores. This type of approach is referred to as a weighted allocation approach in (Meng et al., 2002).

For each user query, the database selector of the metasearch engine computes a rank (i.e., 1st, 2nd, ...) and a ranking score for each local search engine. Both the rank information and the ranking score information can be used to determine the number of pages to retrieve from different local search engines. For example, in the D-WISE system (Yuwono & Lee, 1997), the ranking score information is used. Suppose for a given query q, $r_i$ denotes the ranking score of the local database $D_i$, $i = 1, \ldots, k$, where k is the number of selected local databases for the query, and $\alpha = \sum_{j=1}^{k} r_j$ denotes the total ranking score for all selected local databases. D-WISE uses the ratio $r_i/\alpha$ to determine how many pages should be retrieved from $D_i$. More precisely, if m pages across these k databases are to be retrieved, then D-WISE retrieves $m \cdot r_i/\alpha$ pages from database $D_i$. An example system that uses the rank information to select documents is CORI Net (Callan et al., 1995). Specifically, if m is the total number of pages to be retrieved from k selected local search engines, then for u < v, more pages will be retrieved from the uth ranked database than from the vth ranked database. Because exactly m pages will be retrieved from the k top-ranked databases. In practice, it may be wise to retrieve slightly more than m pages from local databases in order to reduce the likelihood of missing useful pages.

It is possible to combine document selection and database selection into a single integrated process. In Database Selection, we described a method for ranking databases in descending order of the estimated similarity of the most similar document in each database for a given query. A combined database selection and document selection method for finding the m most similar pages based on these ranked databases was proposed in Yu et al. (1999). This method is sketched below. First, for some small positive integer s (e.g., s can be 2), each of the s top-ranked databases is searched to obtain the actual global similarity of its most similar page. This may require some locally top-ranked pages to be retrieved from each of these databases. Let min_sim be the minimum of these s similarities. Next, from these s databases, retrieve all pages whose actual global similarities are greater than or equal to min_sim. If m or more pages have been retrieved, then sort them in descending order of similarities, return the top m pages to the user, and terminate this process. Otherwise, the next top ranked database (i.e., the (s + 1)th ranked database) is considered and its most similar page is retrieved. The actual global similarity of this page is then compared with the current min_sim and the minimum of these two similarities will be used as the new min_sim. Then retrieve from these s + 1 databases all pages whose actual global similarities are greater than or equal to the new min_sim. This process is repeated until m or more pages are retrieved and the m pages with the largest similarities are returned to the user. A seeming problem with this combined method is that the same database may be searched multiple times. In practice, this problem can be avoided by retrieving and caching an appropriate number of pages when a database is searched for the first time. In this way, all subsequent “interactions” with the database would be carried out using the cached results. This method has the following property (Yu et al., 1999). If the databases containing the m desired pages are ranked higher than other databases and the similarity (or desirability) of the mth most similar (desirable) page is distinct, then all of the m desired pages will be retrieved while searching at most one database that does not contain any of the m desired pages.
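The combined selection procedure sketched above, written out as an added example (it is not code from the chapter or from Yu et al.). The `LocalDatabase` class, the `cache_size` parameter, the fallback when the databases run out, and the sample similarities are all assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class LocalDatabase:
    """Simulated local search engine; `pages` maps URL -> actual global similarity."""
    name: str
    pages: dict
    searches: int = 0  # round trips made, to show the effect of caching

    def search(self, limit):
        """Return up to `limit` (url, similarity) pairs, most similar first."""
        self.searches += 1
        ranked = sorted(self.pages.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:limit]


def select_documents(ranked_dbs, m, s=2, cache_size=20):
    """Find the m most similar pages from databases given in ranked order.

    Returns (top_pages, searched_names); top_pages holds
    (url, similarity, database_name) in descending order of similarity.
    """
    cache = {}  # database name -> results of its single search

    def cached(db):
        if db.name not in cache:
            cache[db.name] = db.search(cache_size)
        return cache[db.name]

    opened = []
    min_sim = float("inf")
    candidates = []
    for position, db in enumerate(ranked_dbs, start=1):
        results = cached(db)
        opened.append(db)
        if results:
            # Similarity of this database's most similar page.
            min_sim = min(min_sim, results[0][1])
        if position < s:
            continue  # the s top-ranked databases are opened together
        candidates = [
            (url, sim, d.name)
            for d in opened
            for url, sim in cached(d)
            if sim >= min_sim
        ]
        if len(candidates) >= m:
            break
    else:
        # Assumption (not in the text): if the databases run out before
        # m pages clear the threshold, fall back to everything cached.
        candidates = [(u, sim, d.name) for d in opened for u, sim in cached(d)]

    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:m], [d.name for d in opened]


def build_sample_databases():
    """Constructed data: four databases already ranked by their best page."""
    return [
        LocalDatabase("D1", {"a": 0.9, "b": 0.7, "c": 0.3}),
        LocalDatabase("D2", {"d": 0.8, "e": 0.6}),
        LocalDatabase("D3", {"f": 0.5, "g": 0.2}),
        LocalDatabase("D4", {"h": 0.4}),
    ]


if __name__ == "__main__":
    dbs = build_sample_databases()
    top, searched = select_documents(dbs, m=4, s=2)
    for url, sim, name in top:
        print(f"{url}  {sim:.2f}  from {name}")
    print("searched:", searched)
    print("round trips:", {d.name: d.searches for d in dbs})
```

With m = 4 the four best pages all come from D1 and D2; D3 is the single database searched without contributing, and D4 is never contacted, which is the property stated above.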

Result Merging

Ideally, a metasearch engine should provide local system transparency to its users. From a user’s point of view, such a transparency means that a metasearch engine should behave like a regular search engine. That is, when a user submits a query, the user does not need to be aware that multiple search engines may be used to process this query, and when the user receives the search result from the metasearch engine, the fact that the results are retrieved from multiple search engines should be hidden from him/her. Result merging is a necessary task in providing the above transparency. When merging the results returned from multiple search engines into a single result, pages in the merged result should be ranked in descending order of global similarities (or global desirabilities). However, the heterogeneities that exist among local search engines and between the metasearch engine and local search engine make result merging a challenging problem. Usually, pages returned from a local search engine are ranked based on these pages’ local similarities. Some local search engines make the local similarities of returned pages available to the user (as a result, the metasearch engine can also obtain the local similarities) while other search engines do not make them available. For example, Google and AltaVista do not provide local similarities while Northern Light and FirstGov do. To make things worse, local similarities returned from different local search engines, even when made available, may be incomparable due to the use of different similarity functions and term-weighting schemes by different local search engines. Furthermore, the local similarities and the global similarity of the same page may be quite different still as the metasearch engine may use a similarity function different from those used in local systems. In fact, even when the same similarity function were used by all local systems and the metasearch engine, local and global similarities of the same page may still be very different. This is because some statistics used to compute term weights, for example the document frequency of a term, are likely to be different in different systems.

The challenge here is how to merge the pages returned from multiple local search engines into a single ranked list in a reasonable manner in the absence of local similarities and/or in the presence of incomparable similarities. An additional complication is that retrieved pages may be returned by different numbers of local search engines. For example, one page could be returned by one of the selected local search engines and another may be returned by all of them. The question is whether and how this should affect the ranking of these pages.

Note that when we say that a page is returned by a search engine, we really mean that the URL of the page is returned. One simple approach that can solve all of the above problems is to actually fetch/download all returned pages from their local servers and compute their global similarities in the metasearch engine. One metasearch engine that employs this approach for result merging is the Inquirus system (http://www.neci.nec.com/~lawrence/inquirus.html). Inquirus ranks pages returned from local search engines based on analyzing the contents of downloaded pages, and it employs a ranking formula that combines similarity and proximity matches (Lawrence & Lee Giles, 1998). In addition to being able to rank results based on desired global similarities, this approach also has some other advantages (Lawrence & Lee Giles, 1998). For example, when attempting to download pages, obsolete URLs can be discovered. This helps to remove pages with dead URLs from the final result list. In addition, downloading pages on the fly ensures that pages will be ranked based on their current contents. In contrast, similarities computed by local search engines may be based on obsolete versions of Web pages. The biggest drawback of this approach is its slow speed as fetching pages and analyzing them on the fly can be time consuming.

Most result merging methods utilize the local similarities or local ranks of returned pages to perform merging. The following cases can be identified:

Selected Databases for a Given Query Do Not Share Pages, and All Returned Pages Have Local Similarities Attached. In this case, each result page will be returned from just one search engine. Even though all returned pages have local similarities, these similarities may be normalized using different ranges by different local search engines. For example, one search engine may normalize its similarities between 0 and 1 and another between 0 and 1000. In this case, all local similarities should be renormalized based on a common range, say [0, 1], to improve the comparability of these local similarities (Dreilinger & Howe, 1997; Selberg & Etzioni, 1997).

Renormalized similarities can be further adjusted based on the usefulness of different databases for the query. Recall that when database selection is performed for a given query, the usefulness of each database is estimated and is represented as a score. The database scores can be used to adjust renormalized similarities. The idea is to give preference to pages retrieved from highly ranked databases. In CORI Net (Callan et al., 1995), the adjustment works as follows. Let $s$ be the ranking score of local database $D$ and $\bar{s}$ be the average of the scores of all searched databases for a given query. Then the following weight is assigned to $D$: $w = 1 + k \cdot (s - \bar{s})/\bar{s}$, where $k$ is the number of databases searched for the given query. It is easy to see from this formula that databases with higher scores will have higher weights. Let $x$ be the renormalized similarity of page $p$ retrieved from $D$. Then CORI Net computes the adjusted similarity of $p$ by $w \cdot x$. The result merger lists returned pages in descending order of adjusted similarities. A similar method is used in ProFusion (Gauch et al., 1996). For a given query, the adjusted similarity of a page $p$ from a database $D$ is the product of the renormalized similarity of $p$ and the ranking score of $D$.
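The renormalization and the two score-based adjustments described for this first case, as an added example (not code from CORI Net or ProFusion). It assumes each engine's similarity range is known and maps it linearly onto [0, 1]; the engine names, ranges, scores and pages are constructed.

```python
from statistics import mean


def renormalize(sim, lo, hi):
    """Map a local similarity from the engine's own range [lo, hi] onto [0, 1]."""
    if hi <= lo:
        raise ValueError("similarity range must have hi > lo")
    return (sim - lo) / (hi - lo)


def cori_weights(db_scores):
    """CORI Net weight per database: w = 1 + k * (s - s_bar) / s_bar."""
    k = len(db_scores)
    s_bar = mean(db_scores.values())
    return {db: 1 + k * (s - s_bar) / s_bar for db, s in db_scores.items()}


def merge_results(results, ranges, db_scores, method="cori"):
    """Merge per-database result lists into one list ranked by adjusted similarity.

    results:   database -> list of (url, local_similarity)
    ranges:    database -> (lo, hi) of that engine's similarity scale
    db_scores: database -> ranking score from database selection
    Returns a list of (url, database, adjusted_similarity), best first.
    """
    if method == "cori":
        factor = cori_weights(db_scores)   # adjusted = w * x
    elif method == "profusion":
        factor = dict(db_scores)           # adjusted = ranking score * x
    else:
        raise ValueError(f"unknown method: {method}")

    merged = []
    for db, pages in results.items():
        lo, hi = ranges[db]
        for url, local_sim in pages:
            x = renormalize(local_sim, lo, hi)
            merged.append((url, db, factor[db] * x))
    merged.sort(key=lambda row: row[2], reverse=True)
    return merged


# Constructed sample: three engines that report similarities on different scales.
SAMPLE_SCORES = {"A": 0.44, "B": 0.40, "C": 0.36}
SAMPLE_RANGES = {"A": (0, 1000), "B": (0, 1), "C": (0, 100)}
SAMPLE_RESULTS = {
    "A": [("p1", 800), ("p2", 500)],
    "B": [("p3", 0.9), ("p4", 0.4)],
    "C": [("p5", 95)],
}

if __name__ == "__main__":
    for method in ("cori", "profusion"):
        print(method)
        for url, db, adj in merge_results(
            SAMPLE_RESULTS, SAMPLE_RANGES, SAMPLE_SCORES, method
        ):
            print(f"  {url}  from {db}  adjusted={adj:.3f}")
```

On this sample the two adjustments disagree about the top page: the CORI Net weights (1.3, 1.0, 0.7) put p1 first, while the ProFusion product puts p3 first.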

Selected Databases for a Given Query Do Not Share Pages, but Some Returned Pages Do Not Have Local Similarities Attached. Again, each result page will be returned by one local search engine. In general, there are two types of approaches for tackling the result-merging problem in this case. The first type uses the local rank information of returned pages directly to perform the merge. Note that in this case, local similarities that may be available for some returned pages would be ignored. The second type first converts local ranks to local similarities and then applies techniques described for the first case to perform the merge.

One simple way to use rank information only for result merging is as follows (Meng et al., 2002). First, arrange the searched databases in descending order of usefulness scores. Next, a round-robin method based on the database order and the local page rank order is used to produce an overall rank for all returned pages. Specifically, in the first round, the top-ranked page from each searched database is taken and these pages are ordered based on the database order such that the page order and the database order are consistent; if not enough pages have been obtained, the second round starts, which takes the second highest-ranked page from each searched database, orders these pages again based on the database order, and places them behind those pages selected earlier. This process is repeated until the desired number of pages is obtained.

In the D-WISE system (Yuwono & Lee, 1997), the following method for converting ranks into similarities is employed. For a given query, let $r_i$ be the ranking score of database $D_i$, $r_{min}$ be the smallest database ranking score, $r$ be the local rank of a page from $D_i$, and $g$ be the converted similarity of the page. The conversion function is $g = 1 - (r - 1) \cdot F_i$, where $F_i = r_{min}/(m \cdot r_i)$ and $m$ is the number of documents desired across all searched databases. This conversion has the following properties. First, all locally top-ranked pages have the same converted similarity, i.e., 1. Second, $F_i$ is the difference between the converted similarities of the $j$th and the $(j + 1)$th ranked pages from database $D_i$, for any $j = 1, 2, \ldots$ Note that the distance is larger for databases with smaller ranking scores. Consequently, if the rank of a page $p$ in a higher rank database is the same as the rank of a page $p'$ in a lower rank database and neither $p$ nor $p'$ is top-ranked, then the converted similarity of $p$ will be higher than that of $p'$. This property can lead to the selection of more pages from databases with higher scores into the merged result. As an example, consider two databases $D_1$ and $D_2$. Suppose $r_1 = 0.2$, $r_2 = 0.5$, and $m = 4$. Then $r_{min} = 0.2$, $F_1 = 0.25$, and $F_2 = 0.1$. Thus, the three top-ranked pages from $D_1$ will have converted similarities 1, 0.75, and 0.5, respectively, and the three top-ranked pages from $D_2$ will have converted similarities 1, 0.9, and 0.8, respectively. As a result, the merged list will contain three pages from $D_2$ and one page from $D_1$.
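The two rank-only merges described for this case, the round-robin method and the D-WISE rank-to-similarity conversion, run on the two-database example above. This is an added example, not code from D-WISE or from Meng et al.; the page identifiers are constructed, and breaking ties between equal converted similarities by database score is an assumption.

```python
def round_robin_merge(db_scores, results, m):
    """Take one page per database per round, databases in descending score order.

    results maps database -> list of URLs in local rank order.
    Returns a list of (url, database) of length at most m.
    """
    if m <= 0:
        return []
    order = sorted(db_scores, key=db_scores.get, reverse=True)
    depth = max((len(results[db]) for db in order), default=0)
    merged = []
    for rank in range(depth):
        for db in order:
            if rank < len(results[db]):
                merged.append((results[db][rank], db))
                if len(merged) == m:
                    return merged
    return merged


def dwise_similarity(local_rank, r_i, r_min, m):
    """Converted similarity g = 1 - (r - 1) * F_i, with F_i = r_min / (m * r_i)."""
    f_i = r_min / (m * r_i)
    return 1 - (local_rank - 1) * f_i


def dwise_merge(db_scores, results, m):
    """Convert local ranks to similarities, then keep the m highest.

    Returns a list of (url, database, converted_similarity), best first.
    """
    r_min = min(db_scores.values())
    scored = []
    for db, pages in results.items():
        for local_rank, url in enumerate(pages, start=1):
            g = dwise_similarity(local_rank, db_scores[db], r_min, m)
            scored.append((url, db, g))
    # Every locally top-ranked page gets g = 1; ties are broken here by
    # database ranking score (an assumption, the text does not specify it).
    scored.sort(key=lambda row: (-row[2], -db_scores[row[1]]))
    return scored[:m]


# The example from the text: r1 = 0.2, r2 = 0.5, m = 4 (page names constructed).
EXAMPLE_SCORES = {"D1": 0.2, "D2": 0.5}
EXAMPLE_RESULTS = {
    "D1": ["d1-page1", "d1-page2", "d1-page3"],
    "D2": ["d2-page1", "d2-page2", "d2-page3"],
}


def pages_per_database(merged):
    """Count how many merged pages each database contributed."""
    counts = {}
    for row in merged:
        counts[row[1]] = counts.get(row[1], 0) + 1
    return counts


if __name__ == "__main__":
    rr = round_robin_merge(EXAMPLE_SCORES, EXAMPLE_RESULTS, 4)
    dw = dwise_merge(EXAMPLE_SCORES, EXAMPLE_RESULTS, 4)
    print("round robin:", rr, pages_per_database(rr))
    print("D-WISE:     ", dw, pages_per_database(dw))
```

Round robin takes two pages from each database, while the D-WISE conversion takes three from D2 and one from D1, as the worked example states.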

Selected Databases for a Given Query Share Pages. In this case, the same page may be returned by multiple local search engines. Result merging in this situation is usually carried out in two steps. In the first step, techniques discussed in the first two cases can be applied to all pages, regardless of whether they are returned by one or more search engines, to compute their similarities for merging. In the second step, for each page p returned by multiple search engines, the similarities of p due to multiple search engines are combined in a certain way to generate a final similarity for p. Many combination functions have been proposed and studied (Croft, 2000), and some of these functions have been used in metasearch engines. For example, the max function is used in ProFusion (Gauch et al., 1996), and the sum function is used in MetaCrawler (Selberg & Etzioni, 1997).

CONCLUSION

In the past decade, we have all witnessed the explosion of the Web. Up to now, the Web has become the largest digital library used by millions of people. Search engines and metasearch engines have become indispensable tools for Web users to find desired information.

While most Web users probably have used search engines and metasearch engines, few know the technologies behind these wonderful tools. This chapter has provided an overview of these technologies, from basic ideas to more advanced algorithms. As can be seen from this chapter, Web-based search technology has its roots from text retrieval techniques, but it also has many unique features. Some efforts to compare the quality of different search engines have been reported (for example, see (Hawking, Craswell, Bailey, & Griffiths, 2001)). An interesting issue is how to evaluate and compare the effectiveness of different techniques. Since most search engines employ multiple techniques, it is difficult to isolate the effect of a particular technique on effectiveness even when the effectiveness of search engines can be obtained.

Web-based search is still a pretty young discipline, and it still has a lot of room to grow. The upcoming transition of the Web from mostly HTML pages to XML pages will probably have a significant impact on Web-based search technology.

ACKNOWLEDGMENT

This work is supported in part by NSF Grants IIS-9902872, IIS-9902792, EIA-9911099, IIS-0208574, IIS-0208434 and ARO-2-5-30267.

GLOSSARY

Authority page: A Web page that is linked from hub pages in a group of pages related to the same topic.

Collection fusion: A technique that determines how to retrieve documents from multiple collections and merge them into a single ranked list.

Database selection: The process of selecting potentially useful data sources (databases, search engines, etc.) for each user query.

Hub page: A Web page with links to important (authority) Web pages all related to the same topic.

Metasearch engine: A Web-based search tool that utilizes other search engines to retrieve information for its user.

PageRank: A measure of Web page importance based on how Web pages are linked to each other on the Web.

Search engine: A Web-based tool that retrieves potentially useful results (Web pages, products, etc.) for each user query.

Result merging: The process of merging documents retrieved from multiple sources into a single ranked list.

Text retrieval: A discipline that studies techniques to retrieve relevant text documents from a document collection for each query.

Web (World Wide Web): Hyperlinked documents hiding on networked computers, allowing users to navigate from one document to any linked document.

Bruce Croft (Ed.), Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval (pp. 127–150). Dordrecht, The Netherlands: Kluwer Academic.

Callan, J., Connell, M., & Du, A. (1999). Automatic discovery of language models for text databases. In ACM SIGMOD Conference (pp. 479–490). New York: ACM Press.

Callan, J., Croft, W., & Harding, S. (1992). The INQUERY retrieval system. In Third DEXA Conference, Valencia, Spain (pp. 78–83). Wien, Austria: Springer-Verlag.

Callan, J., Lu, Z., & Croft, W. (1995). Searching distributed collections with inference networks. In ACM SIGIR Conference, Seattle (pp. 21–28). New York: ACM Press.

Chakrabarti, S., Dom, B., Raghavan, P., Rajagopalan, S., Gibson, D., Kleinberg, J. (1998). Automatic resource compilation by analyzing hyperlink structure and associated text. In 7th International World Wide Web Conference, Brisbane, Australia (pp. 65–74). Amsterdam, The Netherlands: Elsevier.

Chakrabarti, S., Dom, B., Kumar, R., Raghavan, P., Rajagopalan, S., et al. (1999). Mining the Web’s link structure. IEEE Computer, 32, 60–67.

Croft, W. (2000). Combining approaches to information retrieval. In W. Bruce Croft (Ed.), Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval (pp. 1–36). Dordrecht: Kluwer Academic.

Cutler, M., Deng, H., Manicaan, S., & Meng, W. (1999). A new study on using HTML structures to improve retrieval. In Eleventh IEEE Conference on Tools with Artificial Intelligence, Chicago (pp. 406–409). Washington, DC: IEEE Computer Society.

Dreilinger, D., & Howe, A. (1997). Experiences with selecting search engines using metasearch. ACM Transactions on Information Systems, 15, 195–222.

Fan, Y., & Gauch, S. (1999). Adaptive agents for information gathering from multiple, distributed information sources. In AAAI Symposium on Intelligent Agents in Cyberspace, Stanford University (pp. 40–46). Menlo Park, CA: AAAI Press.

Gauch, S., Wang, G., & Gomez, M. (1996). ProFusion: Intelligent fusion from multiple, distributed search engines. Journal of Universal Computer Science, 2, 637–649.

Gravano, L., Chang, C., Garcia-Molina, H., & Paepcke, A. (1997). Starts: Stanford proposal for Internet meta-searching. In ACM SIGMOD Conference, Tucson, AZ (pp. 207–218). New York: ACM Press.

Hawking, D., Craswell, N., Bailey, P., & Griffiths, K. (2001). Measuring search engine quality. Journal of Information Retrieval, 4, 33–59.

Hearst, M., & Pedersen, J. (1996). Reexamining the cluster hypothesis: Scatter/gather on retrieval results. In ACM SIGIR Conference (pp. 76–84). New York: ACM Press.

Kahle, B., & Medlar, A. (1991). An information system for corporate users: Wide area information servers (Tech. Rep. TMC199). Thinking Machine Corporation.

Kirsch, S. (1998). The future of Internet search: Infoseek’s experiences searching the Internet. ACM SIGIR Forum, 32, 3–7. New York: ACM Press.

Kleinberg, J. (1998). Authoritative sources in a hyperlinked environment. In Ninth ACM-SIAM Symposium on Discrete Algorithms (pp. 668–677). Washington, DC: ACM–SIAM.

Koster, M. (1994). ALIWEB: Archie-like indexing in the Web. Computer Networks and ISDN Systems, 27, 175–182.

Lawrence, S., & Lee Giles, C. (1998). Inquirus, the NECi meta search engine. In Seventh International World Wide Web Conference (pp. 95–105). Amsterdam, The Netherlands: Elsevier.

Manber, U., & Bigot, P. (1997). The search broker. In USENIX Symposium on Internet Technologies and Systems, Monterey, CA (pp. 231–239). Berkeley, CA: USENIX.

Meng, W., Yu, C., & Liu, K. (2002). Building efficient and effective metasearch engines. ACM Computing Surveys, 34, 48–84.

Page, L., Brin, S., Motwani, R., & Winograd, T. (1998). The PageRank citation ranking: Bring order to the Web (Technical Report). Stanford, CA: Stanford University.

Pratt, W., Hearst, H., & Fagan, L. (1999). A knowledge-based approach to organizing retrieved documents. In Sixteenth National Conference on Artificial Intelligence (pp. 80–85). Menlo Park, CA: AAAI Press and Cambridge, MA: MIT Press.

Salton, G., & McGill, M. (1983). Introduction to modern information retrieval. New York: McGraw-Hill.

Selberg, E., & Etzioni, O. (1997). The MetaCrawler architecture for resource aggregation on the Web. IEEE Expert, 12, 8–14.

Wu, Z., Meng, W., Yu, C., & Li, Z. (2001). Towards a highly scalable and effective metasearch engine. In Tenth World Wide Web Conference (pp. 386–395). New York: ACM Press.

Yu, C., Meng, W., Liu, L., Wu, W., & Rishe, N. (1999). Efficient and effective metasearch for a large number of text databases. In Eighth ACM International Conference on Information and Knowledge Management (pp. 217–214). New York: ACM Press.

Yuwono, B., & Lee, D. (1997). Server ranking for distributed text resource systems on the Internet. In Fifth International Conference on Database Systems for Advanced Applications (pp. 391–400). Singapore: World Scientific.


Sahai WL040/Bidgolio-Vol I WL040-Sample.cls July 16, 2003 18:35 Char Count= 0

Web Services

Akhil Sahai, Hewlett-Packard Laboratories
Sven Graupner, Hewlett-Packard Laboratories
Wooyoung Kim, University of Illinois at Urbana-Champaign

The Genesis of Web Services
Tightly Coupled Distributed Software
Convergence of the Two Independent Trends
Single Sign-On and Digital Passports
Payment Systems for Web Services
The Future of Web Services
Dynamic Web Services Composition and
End-to-End Web Service Interactions
Future Web Services Infrastructures

There were two predominant trends in computing over the past decade—(i) a movement from monolithic software to distributed objects and components and (ii) an increasing focus on software for the Internet. Web services (or e-services) are a result of these two trends. Web services are defined as distributed services that are identified by Uniform Resource Identifiers (URIs), whose interfaces and binding can be defined, described, and discovered by eXtensible Markup Language (XML) artifacts, and that support direct XML message-based interactions with other software applications over the Internet. Web services that perform useful tasks would often exhibit the following properties:

Discoverable—The foremost requirement for a Web service to be useful in commercial scenarios is that it be discovered by clients (humans or other Web services).

Communicable—Web services adopt a message-driven operational model where they interact with each other and perform specified operations by exchanging XML messages. The operational model is thus referred to as the Document Object Model (DOM). Some of the preeminent communication patterns that are being used between Web services are synchronous, asynchronous, and transactional communication.

Conversational—Sending a document or invoking a method, and getting a reply are the basic communication primitives in Web services. A sequence of the primitives that are related to each other (thus, conversation) forms a complex interaction between Web services.

Secure and Manageable—Properties such as security, reliability, availability, and fault tolerance are critical for commercial Web services as well as manageability and quality of service.

As Web services gain critical mass in the information technology (IT) industry as well as academia, the dominant computing paradigm of software as a monolithic object-oriented application is gradually giving way to software as a service accessible via the Internet.

THE GENESIS OF WEB SERVICES

Contrary to general public perception, the development of Web services followed a rather modest evolutionary path. The underpinning technologies of Web services borrow heavily from object-based distributed computing and development of the World Wide Web (Berners-Lee, 1996). In the chapter, we review related technologies that help shape the notion of Web services.

Tightly Coupled Distributed Software Architectures

The study of various aspects of distributed computing can be dated back as early as the invention of time-shared multiprocessing. Despite the early start, distributed computing remained impractical until the introduction of Object Management Group’s (OMG) Common Object Request Broker Architecture (CORBA) and Microsoft’s Distributed Component Object Model (DCOM), a distributed extension to the Component Object Model (COM). Both CORBA and DCOM create an illusion of a single machine over a network of (heterogeneous) computers and allow objects to invoke remote objects as if they were on the same machine, thereby vastly simplifying object sharing among applications. They do so by building their abstractions on more or less OS- and platform-independent middleware layers. In these software architectures, objects define a number of interfaces and advertise their services by registering the interfaces. Objects are assigned identifiers at the time of creation. The identifiers are used for discovering their interfaces and their implementations. In addition, CORBA supports discovery of objects using descriptions of the services they provide. Sun Microsystems’ Java Remote Method Invocation (Java RMI) provides a similar functionality, where a network of platform-neutral Java virtual machines provides the illusion of a single machine. Java RMI is a language-dependent solution, though the Java Native Interface (JNI) provides language independence to some extent.

The software architectures supported by CORBA and DCOM are said to be tightly coupled because they define their own binary message encoding, and thus objects are interoperable only with objects defined in the same software architecture; for example, CORBA objects cannot invoke methods on DCOM objects. Also, it is worth noting that security was a secondary concern in these software architectures—although some form of access control is highly desirable—partly because method-level/object-level access control is too fine-grained and incurs too much overhead, and partly because these software architectures were developed for use within the boundary of a single administrative domain, typically a local area network.

Loosely Coupled Distributed Software Architectures

Proliferation and increased accessibility of diverse intelligent devices in today’s IT market have transformed the World Wide Web to a more dynamic, pervasive environment. The fundamental changes in the computing landscape from a static client-server model to a dynamic peer-to-peer model encourage reasoning about interaction with these devices in terms of a more abstract notion of service rather than a traditional notion of object. For example, printing can be viewed as a service that a printer provides; printing a document is to invoke the print service on a printer rather than to invoke a method on a proxy object for a printer.

Such services tend to be dispersed over a wide area, often crossing administrative boundaries, for better resource utilization. This physical distribution calls for more loosely coupled software architectures where scalable advertising and discovery are a must and low-latency, high-bandwidth interprocessor communication is highly desirable. As a direct consequence, a number of service-centric middleware developments have come to light. We note three distinctive systems from the computer industry’s research laboratories, namely, HP’s client utility (e-Speak), Sun Microsystems’ Jini, and IBM’s TSpaces (here listed in alphabetic order). These have been implemented in Java for platform independence.

Client Utility

HP’s client utility is a somewhat underpublicized system that became the launching pad for HP’s e-Speak (Karp, 2001). Its architecture represents one of the earlier forms of peer-to-peer system, which is suitable for Web service registration, discovery, and invocation (Kim, Graupner, & Sahai, 2002). The fundamental idea is to abstractly represent every element in computing as a uniform entity called “service (or resource).” Using the abstraction as a building block, it provides facilities for advertising and discovery, dynamic service composition, mediation and management, and capability-based fine-grain security. What distinguishes client utility most from the other systems is the fact that it makes advertisement and discovery visible to clients. Clients can describe their services using vocabularies and can specifically state what services they want to discover.

Jini

The Jini technology at Sun Microsystems is a set of protocol specifications that allows services to announce their presence and discover other services in their vicinity. It advocates a network-centric view of computing. However, it relies on the availability of multicast capability, practically limiting its applicability to services/devices connected with a local area network (such as home network). Jini exploits Java’s code mobility and allows a service to export stub code which implements a communication protocol using Java RMI. Joining, advertisement, and discovery are done transparently from other services. It has been developed mainly for collaboration within a small, trusted workgroup and offers limited security and scalability supports.

TSpaces

IBM’s TSpaces (TSpaces, 1999) is network middleware that aims to enable communication between applications and devices in a network of heterogeneous computers and operating systems. It is a network communication buffer with database capabilities, which extends Linda’s Tuple space communication model with asynchrony. TSpaces supports hierarchical access control on the Tuple space level. Advertisement and discovery are implicit in TSpaces and provided indirectly through shared Tuple spaces.

Convergence of the Two Independent Trends

Web services are defined at the cross point of the evolution paths of service-centric computing and the World Wide Web. The idea is to provide service-centric computing by using the Internet as platform; services are delivered over the Internet (or intranet). Since its inception, the World Wide Web has strived to become a distributed, decentralized, all pervasive infrastructure where information is put out for other users to retrieve. It is this decentralized, distributed paradigm of information dissemination that upon meeting the concept of service-centric computing has led to the germination of the concept of Web services.

The Web services paradigm has caught the fancy of the research and development community. Many computer scientists and researchers from IT companies as well as universities are working together to define concepts, platforms, and standards that will determine how Web services are created, deployed, registered, discovered, and composed as well as how Web services will interact with each other.

WEB SERVICES TODAY

Web services are appearing on the Internet in the form of e-business sites and portal sites. For example, priceline.com (http://www.priceline.com) and Expedia.com (http://www.expedia.com) act as a broker for airlines, hotels, and car rental companies. They offer through their portal sites statically composed Web services that have prenegotiated an understanding with certain airlines and hotels. These are mostly a business-to-consumer (B2C) kind of Web services. A large number of technologies and platforms have appeared and been standardized so as to enable the paradigm of Web services to support business-to-business (B2B) and B2C scenarios alike in a uniform manner. These standards enable creation and deployment, description, and discovery of Web services, as well as communication amongst them. We describe some preeminent standards below.

The Web Services Description Language (WSDL) is a standard to describe service interfaces and publish them together with services’ access points (i.e., bindings) and supported interfaces. Once described in WSDL, Web services can be registered and discovered using the Universal Description, Discovery, and Integration (UDDI). After having discovered their partners, Web services use the Simple Object Access Protocol (SOAP), which is in fact an incarnation of the Remote Procedure Call (RPC) in XML, over the HyperText Transfer Protocol (HTTP) to exchange XML messages and invoke the partners’ services.

Though most services are implemented using platform-independent languages such as Java and C#, development and deployment platforms are also being standardized; J2EE and .NET are two well known ones. Web services and their users often expect different levels of security depending on their security requirements and assumptions. The primary means for enforcing security are digital signature and strong encryption using the Public Key Infrastructure (PKI). SAML, XKMS, and XACML are some of the recently proposed security standards. Also, many secure payment mechanisms have been defined. (See Figure 1.)

Web Services Description

In traditional distributed software architectures, developers use an interface definition language (IDL) to define component interfaces. A component interface typically describes the operations the component supports by specifying their inputs and expected outputs. This enables developers to decouple interfaces from actual implementations. As Web services are envisaged as software accessible through the Web by other Web services and users, Web services need to be described so that their interfaces are decoupled from their implementations. WSDL serves as an IDL for Web services.

Figure 1: Web services.

WSDL enables description of Web services independently of the message formats and network protocols used. For example, in WSDL a service is described as a set of endpoints. An endpoint is in turn a set of operations. An operation is defined in terms of messages received or sent out by the Web service:

Message—An abstract definition of data being communicated consisting of message parts.

Operation—An abstract definition of an action supported by the service. Operations are of the following types: one-way, request–response, solicit–response, and notification.

Port type—An abstract set of operations supported by one

Service—A collection of related endpoints.

As the implementation of the service changes or evolves over time, the WSDL definitions must be continuously updated and the descriptions versioned.
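The following sketch is an added example, not part of the original text: it reads a WSDL document and lists every operation a service publishes, with its access point and its operation type, which is the inventory a reviewer needs before deciding what a published description exposes. It assumes WSDL 1.1 element names (portType, binding, port, soap:address), where the operation type follows from the order of the input and output children; the sample document, its names and its URL are constructed.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.xmlsoap.org/wsdl/}"       # WSDL 1.1 namespace
S = "{http://schemas.xmlsoap.org/wsdl/soap/}"  # WSDL 1.1 SOAP binding namespace

# The four operation types, keyed by the order of <input>/<output> children.
KINDS = {
    ("input",): "one-way",
    ("input", "output"): "request-response",
    ("output", "input"): "solicit-response",
    ("output",): "notification",
}

# Constructed sample: one service, one port, four operations.
SAMPLE_WSDL = """<definitions name="Quote" targetNamespace="urn:example:quote"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="urn:example:quote">
  <message name="QuoteRequest"><part name="symbol" type="xsd:string"/></message>
  <message name="QuoteResponse"><part name="price" type="xsd:decimal"/></message>
  <message name="Event"><part name="text" type="xsd:string"/></message>
  <message name="Alert"><part name="symbol" type="xsd:string"/></message>
  <portType name="QuotePortType">
    <operation name="GetQuote">
      <input message="tns:QuoteRequest"/><output message="tns:QuoteResponse"/>
    </operation>
    <operation name="LogEvent"><input message="tns:Event"/></operation>
    <operation name="ConfirmAlert">
      <output message="tns:Alert"/><input message="tns:Event"/>
    </operation>
    <operation name="PriceAlert"><output message="tns:Alert"/></operation>
  </portType>
  <binding name="QuoteSoap" type="tns:QuotePortType">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="QuoteService">
    <port name="QuotePort" binding="tns:QuoteSoap">
      <soap:address location="http://quotes.example.com/soap"/>
    </port>
  </service>
</definitions>"""


def local(qname):
    """Drop the prefix of a QName such as 'tns:QuoteSoap' (simplification)."""
    return qname.split(":")[-1]


def operation_kind(op):
    """Classify an operation by the order of its input/output children."""
    order = tuple(c.tag[len(W):] for c in op if c.tag in (W + "input", W + "output"))
    return KINDS.get(order, "unknown")


def inventory(wsdl_text):
    """Return (service, port, location, operation, kind) for each operation."""
    if "<!DOCTYPE" in wsdl_text:
        # A service description has no need for a DTD; refuse before parsing.
        raise ValueError("DTDs are not expected in a WSDL document")
    root = ET.fromstring(wsdl_text)
    port_types = {
        pt.get("name"): [(op.get("name"), operation_kind(op))
                         for op in pt.findall(W + "operation")]
        for pt in root.findall(W + "portType")
    }
    bindings = {b.get("name"): local(b.get("type", ""))
                for b in root.findall(W + "binding")}
    rows = []
    for svc in root.findall(W + "service"):
        for port in svc.findall(W + "port"):
            addr = port.find(S + "address")
            location = addr.get("location") if addr is not None else None
            port_type = bindings.get(local(port.get("binding", "")))
            for name, kind in port_types.get(port_type, []):
                rows.append((svc.get("name"), port.get("name"), location, name, kind))
    return rows


def plaintext_operations(rows):
    """Operations whose access point is plain HTTP rather than HTTPS."""
    return [r for r in rows if r[2] and r[2].lower().startswith("http://")]


if __name__ == "__main__":
    for row in inventory(SAMPLE_WSDL):
        print(row)
```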

Web Services Discovery

When navigating the Web for information, we use keywords to find Web sites of interest through search engines. Oftentimes, useful links in search results are mixed with a lot of unnecessary ones that need to be sifted through. Similarly, Web services need to discover compatible Web services before they undertake business with them. The need for efficient service discovery necessitates some sort of Web services clearing house with which Web services register themselves. UDDI (http://www.uddi.org), supported by Ariba, IBM, Microsoft, and HP, is an initiative to build such a Web service repository; it is now under the auspices of OASIS (http://www.oasis-open.org). These companies maintain public Web-based registries (operator sites) consistent with each other that make available information about businesses and their technical interfaces and application program interfaces (APIs).

A core component of the UDDI technology is registration, an XML document defining a business and the Web services it provides. There are three parts to the registration, namely a white page for name, address, contact information, and other identifiers; a yellow page for classification of a business under standard taxonomies; and a green page that contains technical information about the Web services being described. UDDI also lists a set of APIs for publication and inquiry. The inquiry APIs are for browsing information in a repository (e.g., find_business, get_businessDetail). The publication APIs are for business entities to put their information on a repository.

E-marketplaces have been an important development in the business transaction arena on the Internet. They are a virtual meeting place for market participants (i.e., Web services). In addition to the basic registration and discovery, e-marketplaces offer their participants a number of value-added services, including the following:

Enabling inter-Web service interaction after the discovery (the actual interaction may happen with or without the direct participation of the e-marketplace);

Enabling supply and demand mechanisms through traditional catalogue purchasing and request for purchase (RFP), or through more dynamic auctions and exchanges;

Enabling supply-chain management through collaborative planning and inventory handling; and

Other value-added services, such as rating, secured payment, financial handling, certification services, and notification services.

Thus, e-marketplaces can be developed as an entity that uses public UDDI registries. The e-marketplaces are categorized as vertical and horizontal depending on their target market. The vertical e-marketplaces, such as VerticalNet, GlobalNetXChange, and Retailer Market Exchange, target a specific industry sector where participants perform B2B transactions. In particular, Chemdex, E-Steel, DirectAg.com, and many more have been successful in their respective markets. By contrast, horizontal exchanges, such as eBay, are directed at a broad range of clients and businesses.

Web Services Orchestration

By specifying a set of operations in their WSDL document, Web services make visible to the external world a certain subset of internal business processes and activities. Therefore, the internal business processes must be defined and some of their activities linked to the operations before publication of the document. This in turn requires modeling a Web service’s back-end business processes as well as interactions between them. On the other hand, Web services are developed to serve and utilize other Web services. This kind of interaction usually takes a form of a sequence of message exchanges and operation executions, termed conversation. Although conversations are described independently of the internal flows of the Web services, they result in executions of a set of backend processes. A Web service and its ensuing internal processes together form what is called a global process.

Intra-Web Service Modeling and Interaction

The Web Services Flow Language (WSFL) (Leymann, 2001), the Web Services Conversation Language (WSCL) (W3C, 2002), the Web Service Choreography Interface (WSCI) (BEA, 2002) and XLANG (Thatte, 2001) are some of many business process specification languages for Web services.

WSFL introduces the notion of activities and flows which are useful for describing both local business process flows and global message flows between multiple Web services. WSFL models business processes as a set of activities and links. An activity is a unit of useful work while a link connects two activities. A link can be a control link where a decision of what activity to follow is made, or a data link specifying that a certain datum flows from an activity to another. These activities may be made visible through one or more operations grouped as endpoints. As in WSDL, a set of endpoints defines a service. WSFL defines global message flows in a similar way. A global flow consists of plug links that link up operations of two service providers. Complex services involving more than two service providers are created by recursively defining plug links.

XLANG developed by Microsoft extends the XML Schema Definition Language (XSDL) to provide a mechanism for process definition and global flow coordination. The extension elements describe the behavioral aspects of a service. A behavior may span multiple operations. Action is an atomic component of a behavior definition. An action element can be an operation, a delay element, or a raise element. A delay element can be of type delayFor or delayUntil. delayFor and delayUntil introduce delays in execution for a process to wait for something to happen (for example, a timeout) and to wait till an absolute date-time has been reached, respectively. Raise elements are used to specify exception handling. Exceptions are handled by invoking the corresponding handler registered with a raise definition. Finally, processes combine actions in different ways: some of them are sequence, switch, while, all, pick, and empty.

Inter-Web Service Modeling and Interaction

Web services must negotiate and agree on a protocol in order to engage in a business transaction on the Web. X-EDI, ebXML, BTP, TPA-ML, cXML, and CBL have been proposed as an inter-Web service interaction protocol. We focus on ebXML as it is by far the most successful one. (See Figure 2.)

In ebXML (http://www.ebxml.org/), parties engaging in a transaction have Collaboration Protocol Profiles (CPPs) that they register at ebXML registries. A CPP contains the following:

Process Specification Layer—Details the business transactions that form the collaboration. It also specifies the order of business transactions.

Delivery Channels—Describes a party’s message receiving and sending characteristics. A specification can contain more than one delivery channel.


Document Exchange Layer—Deals with processing of the business documents like digital signatures, encryption, and reliable delivery.

Transport Layer—Identifies the transport protocols to be used with the endpoint addresses, along with other properties of the transport layer. The transport protocols could be SMTP, HTTP, and FTP.

When a party discovers another party’s CPP they negotiate certain agreements and form a Collaboration Protocol Agreement (CPA). The intent of the CPA is not to expose the business process internals of the parties but to make visible only the processes that are involved in interactions between the parties. Message exchange between the parties can be facilitated with the ebXML Messaging Service (ebMS). A CPA and the business process specification document it references define a conversation between parties. A typical conversation consists of multiple business transactions which in turn may involve a sequence of message exchanges for requests and replies. Although a CPA may refer to multiple business process specification documents, any conversation is allowed to involve only a single process specification document. Conceptually, the B2B servers of parties involved are responsible for managing CPAs and for keeping track of the conversations. They also interface the operations defined in a CPA with the corresponding internal business processes.

Web Services Platforms

Web services platforms are the technologies, means, and methods available to build and operate Web services. Platforms have been developed and changed over the course of time. A classification into four generations of platform technology should help to structure the space:

First Generation: HTML and CGI—Characterized by Web servers, static HTML pages, HTML FORMS for simple dialogs, and the Common Gateway Interface (CGI) to connect Web servers to application programs, mostly Perl or Shell scripts. (See Figure 3.)

Second Generation: Java—Server-side dynamic generation of HTML pages and user session support; the Java servlet interface became popular for connecting to application programs.

Third Generation: Application server as richer development and run-time environments—J2EE as foundation for application servers that later evolved towards the fourth generation.

Fourth Generation: Web services—Characterized by the introduction of XML and WSDL interfaces for Web services with SOAP-based messaging. A global service infrastructure for service registration and discovery emerged: UDDI. Dynamic Web services aggregation—Characterized by flow systems, business negotiations, agent technology, etc.

Technically, Web services have been built according to a pattern of an n-tier architecture that consists of a front-end tier, firewall (FW), load balancer (LB), a Web-server tier (WS), an application (server) (AS) tier, and a back-end tier for persistent data, or the database tier (DB). (See Figure 4.)

Figure 3: ebXML service-to-service interaction.

Figure 4: Basic four-tier architecture for Web services.

First Generation: HTML and CGI

The emergence of the World Wide Web facilitated the easy access and decent appearance of linked HTML markup pages in a user’s browser. In the early days, it was mostly static HTML content. Passive information services that provided users with the only capability of navigating through static pages could be built. However, HTML supported from the very beginning FORMS that allowed users to enter text or select from multiple-choice menus. FORMS were treated specially by Web servers. They were passed onto CGI, behind which small applications, mostly Perl or Shell scripts, could read the user’s input, perform respective actions, and return an HTML page that could then be displayed in the user’s browser. This primitive mechanism enabled a first generation of services on the Web beyond pure navigation through static contents.

Second Generation: Java

With the growth of the Web and the desire for richer services such as online shopping and booking, the initial means to build Web services quickly became too primitive. Java applets also brought graphical interactiveness to the browser side. Java appeared as the language of choice for Web services. Servlets provided a better interface between the Web server and the application. Technology to support dynamic generation of HTML pages at the server side was introduced: JSP (Java Server Pages) by Sun Microsystems, ASP (Active Server Pages) by Microsoft, or PHP pages in the Linux world enabled separation of presentation, the appearance of pages in browsers, from content data. Templates and content were then merged on the fly at the server in order to generate the final page returned to the browser. Since user identification was critical for business services, user log-in and user sessions were introduced. Applications were becoming more complex, and it turned out that there was a significant overlap in common functions needed for many services such as session support, connectivity to persistent databases, and security functions.


Figure 5: The J2EE platform.

Third Generation: Application Server

The observation that many functions were shared and common among Web services drove the development toward richer development environments based on the Java language and Java libraries. A cornerstone of these environments became J2EE (Java 2 Platform, Enterprise Edition), which is a Java platform designed for enterprise-scale computing. Sun Microsystems (together with industry partners such as IBM) designed J2EE (Figure 5) to simplify application development for Web services by decreasing the need for programming through reusable modular components and by providing standard functions such as session support and database connectivity.

J2EE primarily manifests in a set of libraries used by application programs performing the various functions. Web service developers still had to assemble all the pieces, link them together, connect them to the Web server, and manage the various configurations. This led to the emergence of software packages that could be deployed more easily on a variety of machines. These packages later became application servers. They significantly reduced the amount of configuration work during service deployment such that service developers could spend more time on business logic and the actual function of the service. Most application servers are based on J2EE technology. Examples are IBM’s WebSphere suite, BEA’s WebLogic environment, the Sun ONE Application Framework, and Oracle’s 9i application server. (See Figure 5.)

Fourth Generation: Web Services

Prior generations of Web services mostly focused on end-users, people accessing services from Web browsers. However, accessing services from services other than browsers turned out to be difficult. This circumstance has prevented the occurrence of Web service aggregation for a long time. Web service aggregation meant that users would only have to contact one Web service, and this service then would resolve the user’s requests with further requests to other Web services.

HTML is a language defined for rendering and presenting content in Web browsers. It does not allow per se separating content from presentation information. With the advent of XML, XML became the language of choice for Web services for providing interfaces that could not only be accessed by users through Web browsers but also by other services. XML is now pervasively being used in Web services messaging (mainly using SOAP) and for Web service interface descriptions (WSDL). In regard to platforms, XML enhancements were added to J2EE and application servers. The introduction of XML is the major differentiator between Web services platforms of the third and the fourth generation in this classification.

A major step toward the service-to-service integration was the introduction of the UDDI service (see the above section Web Services Discovery).

Three major platforms for further Web services interaction and integration are: Sun Microsystems’ Sun ONE (Open Net Environment), IBM WebSphere, and Microsoft’s .NET.

Sun ONE—Sun’s standards-based software architecture and platform for building and deploying services on demand. Sun ONE’s architecture is built around existing business assets: Data, applications, reports, and transactions, referred to as the DART model. Major standards are supported: XML, SOAP, J2EE, UDDI, LDAP, and ebXML. The architecture is composed of several product lines: the iPlanet Application Framework (JATO), Sun’s J2EE application framework for enterprise Web services development, application server, portal server, integration server, directory server, e-commerce components, the Solaris Operating Environment, and development tools.

IBM WebSphere—IBM’s platform to build, deploy, and integrate your e-business, including components such as foundation and tools, reach and user experience, business integration, and transaction servers and tools.

Microsoft .NET—Microsoft’s .NET platform for providing lead technology for future distributed applications inherently seen as Web services. With Microsoft .NET, Web services’ application code is built in discrete units, XML Web services, which handle a specified set of tasks. Because standard interfaces based on XML simplify communication among software, XML Web services can be linked together into highly specific applications and experiences. The vision is that the best XML Web services from any provider around the globe can be used to create a needed solution quickly and easily. Microsoft will provide a core set of XML Web services, called Microsoft .NET My Services, to provide functions such as user identification and calendar access.

Security and Web Services

Due to their public nature, security is vital for Web services. Security attacks can be classified as threats of information disclosure, unauthorized alteration of data, denial of use, misuse or abuse of services, and, more rarely considered, repudiation of access. Since Web services link networks together with businesses, further attacks, such as masquerading, stealing or duplicating identity and conducting business under false identity, or accessing or transferring funds from or to unauthorized accounts, need to be considered.

Security is vital for establishing the legal basis for businesses done over the Web. Identification and authentication of business partners are the basic security requirements. Others include integrity and authenticity of electronic documents. Electronic contracts must have the same binding legal status as conventional contracts. Refusal and repudiation of electronic contracts must be provable in order to be legally valid. Finally, payment and transferring funds between accounts must be safe and secure.

Security architectures in networks are typically composed of several layers:

Secure data communication—IPsec (Internet Protocol Security), SSL (Secure Sockets Layer), TLS (Transport Layer Security);

Secured networks—VPNs (Virtual Private Networks);

Authenticity of electronic documents and issuing individuals—digital signatures;

Secure and authenticated access—digital certificates;

Secure authentication and certification—PKI (Public Key Infrastructure); and

Single sign-on and digital passports.

Single Sign-On and Digital Passports

The digital passport emerged from the desire to provide an individual’s identity information from a trusted and secure centralized place rather than repeatedly establishing this information with each collaborating partner and maintaining separate access credentials for each pair of collaborations. Individuals only need one such credential, the passport, in order to provide collaborating partners with certain parts of an individual’s identity information. This consolidates the need for maintaining separate identities with different partners into a single identification mechanism. Digital passports provide an authenticated access to a centralized place where individuals have registered their identity information such as phone numbers, social security numbers, addresses, credit records, and payment information. Participating individuals, both people and businesses, will access the same authenticated information assuming trust to the authority providing the passport service. Two initiatives have emerged: Microsoft’s .NET Passport and the Liberty Alliance Project, initiated by Sun Microsystems.

Microsoft .NET Passport (Microsoft .NET, 2002) is a single sign-on mechanism for users on the Internet. Instead of creating separate accounts and passwords with every e-commerce site, users only need to authenticate with a single Passport server. Then, through a series of authentications and encrypted cookie certificates, the user is able to purchase items at any participating e-commerce site without verifying the user’s identity again. .NET Passport is an online service that enables use of an e-mail address and a single (Passport server) password to securely sign in to any .NET Passport participating Web site or service. It allows users to easily move among participating sites without the need to verify their identity again.

The Microsoft .NET Passport had initially been planned for signing into Microsoft’s own services. Expanding it toward broader use in the Web has been seen as critical. This concern gave reason for the Liberty Alliance Project initiative that is now widely supported in industry and public.

The Liberty Alliance Project (Liberty Alliance Project, 2002) is an organization being formed to create an open, federated, single sign-on identity solution for the digital economy via any device connected to the Internet. Membership is open to all commercial and noncommercial organizations. The Alliance has three main objectives:

1. To enable consumers and businesses to maintain personal information securely.

2. To provide a universal, open standard for single sign-on with decentralized authentication and open authorization from multiple providers.

3. To provide an open standard for network identity spanning all network-connected devices.

With the emergence of Web services, specific security technology is emerging. Two major security technology classes are Java-based security technology and XML-based security technology.

Both classes basically provide mappings of security technologies, such as authentication and authorization, encryption, and signatures, into respective environments.

Java-Based Security Technology for Web Services

Java-based security technology is primarily available through the Java 2 SDK and J2EE environments in the form of sets of libraries:

Encryption—JSSE (Java Secure Socket Extension); the JCE (Java Cryptography Extension) provides a framework and implementations for encryption, key generation and key agreement, and Message Authentication Code (MAC) algorithms. Support for encryption includes symmetric, asymmetric, block, and stream ciphers. The software also supports secure streams and sealed objects.

Secure messaging—Java GSS-API is used for securely exchanging messages between communicating applications. The Java GSS-API contains the Java bindings for the Generic Security Services Application Program Interface (GSS-API) defined in RFC 2853. GSS-API offers application programmers uniform access to security services atop a variety of underlying security mechanisms, including Kerberos.

Authentication and Authorization—JAAS (Java Authentication and Authorization Service) for authentication of users, to reliably and securely determine who is currently executing Java code, and for authorization of users to ensure they have the access rights (permissions) required to do security-sensitive operations.

Certification—Java Certification Path API.

X.509 Certificates and Certificate Revocation Lists (CRLs) and Security Managers.

These libraries are available for use when Web services are built using Java. They are usually used when building individual Web services with application servers.

For Web services interaction, XML technology eliminates the tied binding to Java. Consequently, a similar set of XML-based security technologies enabling cross-service interactions is emerging.

XML-Based Security Technology for Web Services

The Organization for the Advancement of Structured Information Standards (OASIS) merges security into Web services at a higher level than the common Internet security mechanisms and practices described above. Proposals are primarily directed toward providing XML specifications for documents and protocols suitable for cross-organizational Web services interactions. XML-based security technology can be classified into the following:

XML Document-Level Security—encryption and digitally signing XML documents;

Protocol-Level Security for XML Document Exchanges—exchanging XML documents for authentication and authorization of peers; and

XML-Based Security Frameworks—infrastructures for establishing secure relationships among parties.

XML Document-Level Security: Encryption and Signature. The (preliminary) XML encryption specification (Reagle, 2000) details requirements on how to digitally encrypt a Web resource in general, and an XML document in particular. XML encryption can be applied to a part of or complete XML document. The granularity of encryption can be reduced to an element, attributes, or text content. Encryption can be recursive. The specification does not address confidence or trust relationships and key establishment. The specification addresses both key-encrypting-keys and data keys. The specification will not address the expression of access control policies associated with portions of the XML document. This will be addressed by XACML.

XML signature defines the XML schema and processing rules for creating and representing digital signatures in any digital content (data object), including XML. An XML signature may be applied to the content of one or more documents. Enveloped or enveloping signatures are over data within the same XML document as the signature; detached signatures are over data external to the signature element. More specifically, this specification defines an XML signature element type and an XML signature application; conformance requirements for each are specified by way of schema definitions and prose respectively. This specification also includes other useful types that identify methods for referencing collections of resources, algorithms, and keying and management information.

The XML Signature (Bartel, Boyer, Fox, LaMacchia, & Simon, 2002) is a method of associating a key with referenced data (octets); it does not normatively specify how keys are associated with persons or institutions, nor the meaning of the data being referenced and signed. Consequently, while this specification is an important component of secure XML applications, it itself is not sufficient to address all application security/trust concerns, particularly with respect to using signed XML (or other data formats) as a basis of human-to-human communication and agreement. Such an application must specify additional key, algorithm, processing, and rendering requirements. The SOAP Digital Signature Extensions defines how specifically SOAP messages can be digitally signed.
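The sketch below is an added example, not code from the original text or from the XML Signature specification: it shows the two-step structure described above, a digest over the referenced octets placed in a Reference and a keyed value computed over the canonical SignedInfo, for both an enveloped and a detached signature. It is a simplification and will not interoperate with a conforming implementation: the algorithms are fixed (SHA-256, HMAC-SHA-256 with a shared key) rather than declared in the signature, canonicalization is the C14N 2.0 provided by ElementTree, and the key, order document and report octets are constructed.

```python
import base64
import copy
import hashlib
import hmac
import xml.etree.ElementTree as ET

DS = "http://www.w3.org/2000/09/xmldsig#"
ET.register_namespace("ds", DS)
# Constructed sample data; KEY stands in for a secret both parties already share.
KEY = b"constructed-demo-key"
ORDER = "<order id='42'><item sku='A1' qty='3'/><total currency='USD'>30.00</total></order>"
REPORT = b"constructed external document octets"


def q(*names):
    """ElementTree path of namespace-qualified XML Signature element names."""
    return "/".join(f"{{{DS}}}{name}" for name in names)


def c14n(elem):
    """Canonical octets of one element (ElementTree implements C14N 2.0)."""
    clean = copy.deepcopy(elem)
    clean.tail = None  # text following the element is not part of it
    return ET.canonicalize(ET.tostring(clean, encoding="unicode")).encode("utf-8")


def b64_sha256(octets):
    return base64.b64encode(hashlib.sha256(octets).digest()).decode("ascii")


def enveloped_digest(root):
    """Digest of the document with its ds:Signature children left out."""
    body = copy.deepcopy(root)
    for sig in body.findall(q("Signature")):
        body.remove(sig)
    return b64_sha256(c14n(body))


def make_signature(uri, digest_b64, key):
    """Build ds:Signature: the Reference digest, then a MAC over SignedInfo."""
    sig = ET.Element(q("Signature"))
    info = ET.SubElement(sig, q("SignedInfo"))
    ref = ET.SubElement(info, q("Reference"), URI=uri)
    ET.SubElement(ref, q("DigestValue")).text = digest_b64
    mac = hmac.new(key, c14n(info), hashlib.sha256).digest()
    ET.SubElement(sig, q("SignatureValue")).text = base64.b64encode(mac).decode("ascii")
    return sig


def check_signature(sig, digest_b64, key):
    """Both steps must pass: the Reference digest and the MAC over SignedInfo."""
    info = sig.find(q("SignedInfo"))
    stated = sig.findtext(q("SignedInfo", "Reference", "DigestValue"))
    value = sig.findtext(q("SignatureValue"))
    if info is None or stated is None or value is None:
        return False
    if not hmac.compare_digest(stated.encode(), digest_b64.encode()):
        return False
    expected = base64.b64encode(hmac.new(key, c14n(info), hashlib.sha256).digest())
    return hmac.compare_digest(value.encode(), expected)


def sign_enveloped(root, key):
    """Enveloped: the signature sits inside the document it covers (URI="")."""
    root.append(make_signature("", enveloped_digest(root), key))
    return root


def verify_enveloped(root, key):
    sigs = root.findall(q("Signature"))
    return len(sigs) == 1 and check_signature(sigs[0], enveloped_digest(root), key)


def sign_detached(octets, uri, key):
    """Detached: the signature covers octets that live outside the element."""
    return make_signature(uri, b64_sha256(octets), key)


def verify_detached(sig, octets, key):
    return check_signature(sig, b64_sha256(octets), key)


def demo():
    """Run the constructed scenarios and return each verification result."""
    wire = ET.tostring(sign_enveloped(ET.fromstring(ORDER), KEY), encoding="unicode")
    received = ET.fromstring(wire)
    results = {"intact": verify_enveloped(received, KEY),
               "wrong_key": verify_enveloped(received, b"some-other-key")}
    received.find("total").text = "3.00"  # content altered after signing
    results["altered"] = verify_enveloped(received, KEY)
    # Recomputing DigestValue does not help: SignedInfo is covered by the MAC.
    path = q("Signature", "SignedInfo", "Reference", "DigestValue")
    received.find(path).text = enveloped_digest(received)
    results["altered_digest_fixed"] = verify_enveloped(received, KEY)
    sig = sign_detached(REPORT, "urn:example:report", KEY)
    results["detached"] = verify_detached(sig, REPORT, KEY)
    results["detached_other_octets"] = verify_detached(sig, REPORT + b"!", KEY)
    return results


if __name__ == "__main__":
    print(demo())
```

A successful check only shows that a holder of the key produced the value over these octets; whose key it is has to be settled elsewhere, which is the gap the paragraph above points out.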

Protocol-Level Security for XML Document Exchanges. Protocol-level security defines document exchanges with the purpose of establishing secure relationships among parties, typically providing well-defined interfaces and XML bindings to an existing public key infrastructure. Protocol-level security can be built upon the document-level security.

The XML Key Management Specification (Ford et al., 2001) defines protocols for validating and registering public keys, suitable for use in conjunction with the proposed standard for XML signature developed by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) and an anticipated companion standard for XML encryption. The XML Key Management Specification (XKMS) comprises two parts: the XML Key Information Service Specification (X-KISS) and the XML Key Registration Service Specification (X-KRSS).

The X-KISS specification defines a protocol for a trust service that resolves public key information contained in XML-SIG document elements. The X-KISS protocol allows a client of such a service to delegate part or all of the tasks required to process <ds:KeyInfo> elements embedded in a document. A key objective of the protocol design is to minimize the complexity of application implementations by allowing them to become clients and thereby shielded from the complexity and syntax of the underlying Public Key Infrastructure (OASIS PKI Member Section, 2002) used to establish trust relationships-based specifications such as X.509/PKIX, or SPKI (Simple Public Key Infrastructure, 1999).

The X-KRSS specification defines a protocol for a web service that accepts registration of public key information. Once registered, the public key may be used in conjunction with other web services including X-KISS.

XML-Based Security Frameworks. XML-based security frameworks go one step further than the above

Sahai WL040/Bidgolio-Vol I WL040-Sample.cls July 16, 2003 18:35 Char Count= 0

The Security Assertion Markup Language (SAML), developed under the guidance of OASIS (OASIS, 2002), is an XML-based framework for exchanging security information with established, SAML-compliant security services. This security information is expressed in the form of assertions about subjects, where a subject is an entity (either human or program) that has an identity in some security domain. A typical example of a subject is a person, identified by his or her e-mail address in a particular Internet DNS domain.

Assertions can convey information about authentication acts performed by subjects, attributes of subjects, and authorization decisions about whether subjects are allowed to access certain resources. Assertions are represented as XML constructs and have a nested structure, whereby a single assertion might contain several different internal statements about authentication, authorization, and attributes. Assertions containing authentication statements merely describe acts of authentication that happened previously.

Assertions are issued by SAML authorities, namely, authentication authorities, attribute authorities, and policy decision points. SAML defines a protocol by which relying parties can request assertions from SAML authorities and get a response from them. This protocol, consisting of XML-based request-and-response message formats, can be bound to many different underlying communications and transport protocols. Currently it defines only one binding, namely SOAP over HTTP.

SAML authorities can use various sources of information, such as external policy stores and assertions that were received as input in requests, in creating their responses. Thus, while clients always consume assertions, SAML authorities can be both producers and consumers of assertions.

Payment Systems for Web Services

Effective payment systems are a prerequisite for business with Web services. This section introduces and classifies different approaches for payment systems that have been developed over the past years. However, payments in the Internet are mostly conducted through the existing payment infrastructure that was developed before the Internet became pervasive. End-consumer retail business on the Internet primarily relies on credit card transactions. Other traditional payment methods are offered as well: personal checks, money orders, or invoice billing. In the business-to-business segment, traditional invoice billing is still the major payment method. An overview is given in (Weber, 1998). W3C has adopted payment standards (Micropayment Overview, 2002).

Payments by Credit Cards

The reason why credit card payments are well accepted is that credit card providers act as intermediaries between payers and recipients of payments (payees). They also guarantee payments up to a limit (important to the payee), and they carry the risk of misuse. All parties must register accounts before transfers can be conducted. Another important service is the verification of creditability of a person or a business before opening an account.

SET—The Secure Electronic Transaction Standard

SET (Secure Electronic Transaction, 2002) is an open technical standard for the commerce industry initially developed by two major credit card providers, Visa and MasterCard, as a way to facilitate secure payment card transactions over the Internet. Digital certificates (Digital Certificates, 1988) create a trust chain throughout the transaction, verifying cardholders’ and merchants’ identity. SET is a system for ensuring the security of financial transactions of credit card providers or bank accounts. Its main objective is to provide a higher security standard for credit card payments on the Internet. A major enhancement compared to traditional credit card payments is that neither credit card credentials nor payers’ identity are revealed to merchants. With SET, a user is given an electronic wallet (digital certificate). A transaction is conducted and verified using a combination of digital certificates and digital signatures among the purchaser, a merchant, and the purchaser’s bank in a way that ensures privacy and confidentiality.

Not all payments required by Web services can be conducted through credit card transactions. First, credit card transactions are typically directed from an end-customer, a person, to a business that can receive such payments. Second, the amounts transferred through a credit card transaction are limited to a range between currency equivalents of > $0.10 up to several thousand dollars depending on an individual’s credit limits. Micropayments < $0.10, as well as macropayments > $10,000, are typically not provided. The lower payment bound is also caused by the cost per transaction model credit card providers use. Third, payments among persons, as for instance required for auctions among people or for buying and selling used goods, cannot be conducted through credit card accounts. Traditional payment methods are used here: personal checks, money orders, or cash settlement. Fourth, only individuals with registered accounts can participate in credit card payments. Individuals that do not qualify are excluded. This restriction is also a major barrier for Web service business in developing countries.

Micropayments

The purpose of micropayments is primarily for “pay-per-use” models where the usage is measured and immediately charged to customers in very small amounts. Transaction costs for micropayment systems need to be significantly lower, and the number of transactions may be significantly higher than that of credit card payments. Accurate, fine-grained charging is enabled. These are the two major differentiators of micropayment systems. W3C proposes the Common Markup for Micropayment “per-fee-links.”

Micropayments involve a buyer or customer, a vendor or merchant, and potentially one or more additional parties that keep accounts in order to aggregate micropayments for final charge. These mediators are called brokers (in Millicent), billing servers (in IBM MicroPayments), or intermediaries (in France Telecom Micropayments), to name a few.

Millicent. One micropayment system is Millicent (Glassman, 2000). The MilliCent Microcommerce Network provides new pay-per-click/earn-per-click functionality for Internet users. It allows buying and selling digital products costing from 1/10th of a cent to up to $10.00 or more. MilliCent can be used by Web services to build any number of parallel revenue streams through the simultaneous use of pay-per-click purchases, subscriptions, and advertising. It can also be used to make direct monetary payments to users. MilliCent is optimized for buying and selling digital products or services over the Internet such as articles, newsletters, real-time data, streaming audio, electronic postage, video streams, maps, financial data, multimedia objects, interactive games, software, and hyperlinks to other sites.

NetBill. NetBill is a Carnegie Mellon University Internet billing server project, which is used as a payment method for buying information goods and services via the Internet. It aims at secure payment for and delivery of information goods, e.g., library services, journal articles, and CPU cycles. The NetBill system charges for transactions and requires customers to have a prepaid NetBill account from which all payments are deducted. The NetBill payment system uses both symmetric key and public key cryptography. It relies on Kerberos for authentication. An account server, called NetBill server, maintains accounts for both customers and merchants. NetBill acts as an aggregator to combine many small transactions into larger conventional transactions, thus amortizing conventional overhead fees. Customers and merchants have to trust the NetBill server.

Digital Money and Digital Coins

In contrast to account-based payment systems, such as credit card-based systems, where amounts are transferred between accounts inside or between credit card or bank providers, digital money represents a value amount flowing from a payer to a payee across the network. Establishing accounts with providers before services can actually be used is unnecessary. Advantages are the same as for cash money: no mutual accounts need to be established before a payment can be conducted. No mutual authentication is needed, which improves convenience for both parties. In addition, as with cash money, the payer does not need to reveal any identity credentials to the payee or someone else. Payments are anonymous and nontraceable. A major hurdle for this approach is the prevention of duplication and forging of digital money since no physical security marks such as watermarks can be applied to digitized bit strings.

The basic idea behind digital money is that a consumer purchases “digital coins” from an issuer using a regular payment method such as a credit card. The issuer generates an account for that customer and deposits the amount into it. It then hands out a set of digital coins to the customer that he or she can use for payments. For a payment, the customer transfers coins to the merchant or service provider. The provider then transfers coins to the issuer and deposits them into his account. The merchant, however, may also use these coins to pay its suppliers. Digital coins will thus flow among participants much as cash money flows among people.

The following requirements need to be met by digital money systems:

digital money must be protected from duplication or forging; and

digital money should neither contain nor reveal identity credentials of any involved party in order to be anonymous.

The first requirement is achieved by not actually representing an amount by a digital coin, but rather a reference to an allocated amount in the possessor’s account with the issuer. When digital coins are copied, the reference is copied, not the amount itself. However, the first individual redeeming a coin with the issuer will receive the amount. Identity at redemption cannot be verified since digital coins do not carry identifying credentials of the possessor. The only thing the issuer can verify is whether or not a coin has already been redeemed. Thus, theft of digital money is possible, and parties have an interest in keeping their coins protected.

Achieving complete anonymity between an issuer and subsequent receivers of digital money is a key characteristic of digital money. It is basically achieved by blinded signatures (Chaum, 1985) that guarantee to uniquely assign coins with allocated amounts within the issuer’s account system and without revealing any identification information of the holder of that account.
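The withdrawal and redemption flow described above, as an added example that is not part of the original chapter: a toy RSA blind signature in the style of Chaum, where the issuer signs a coin without seeing its serial number and, at redemption, can check only its own signature and whether the serial was already redeemed. The key size, the `Issuer` class, and the function names are assumptions of this example.

```python
import hashlib
import secrets
from math import gcd

# Constructed toy RSA key for the issuer, built from two Mersenne primes.
# It is far too small for real use; it only keeps the arithmetic readable.
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))      # issuer's private signing exponent


def digest(serial: bytes) -> int:
    """Hash a coin serial number into the RSA group."""
    return int.from_bytes(hashlib.sha256(serial).digest(), "big") % N


class Issuer:
    """Holds accounts, signs blinded coins, and tracks redeemed serials."""

    def __init__(self):
        self.accounts = {}       # account name -> balance, in coins
        self.redeemed = set()    # serial numbers already paid out

    def sign_blinded(self, account: str, blinded: int) -> int:
        # The issuer debits a known account but sees only the blinded value.
        if self.accounts.get(account, 0) < 1:
            raise ValueError("insufficient balance")
        self.accounts[account] -= 1
        return pow(blinded, D, N)

    def redeem(self, serial: bytes, signature: int, payee: str) -> bool:
        # Check 1: the signature is the issuer's own (no forging).
        if pow(signature, E, N) != digest(serial):
            return False
        # Check 2: the serial has not been redeemed before (no duplication).
        if serial in self.redeemed:
            return False
        self.redeemed.add(serial)
        # Whoever presents the coin first is credited; the coin names no owner.
        self.accounts[payee] = self.accounts.get(payee, 0) + 1
        return True


def withdraw(issuer: Issuer, account: str):
    """Customer side: obtain one signed coin without revealing its serial."""
    serial = secrets.token_bytes(16)
    m = digest(serial)
    while True:                                   # pick a blinding factor r
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            break
    blinded = (m * pow(r, E, N)) % N              # what the issuer sees
    blind_sig = issuer.sign_blinded(account, blinded)
    signature = (blind_sig * pow(r, -1, N)) % N   # unblind: m^D mod N
    return serial, signature, blinded


if __name__ == "__main__":
    bank = Issuer()
    bank.accounts["alice"] = 1
    serial, sig, seen_by_bank = withdraw(bank, "alice")
    print("issuer saw the coin's digest:", seen_by_bank == digest(serial))
    print("merchant redeems:", bank.redeem(serial, sig, "merchant"))
    print("copy redeemed again:", bank.redeem(serial, sig, "copycat"))
```

Because redemption credits whoever presents a valid, unspent coin first, a copied coin is worthless only to the slower party, which is why holders have to protect their coins.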

E-cash. E-cash (CryptoLogic Ecash FAQ, 2002) stands for “electronic cash,” a system developed by DigiCash that underwent field tests in the late 1990s. E-cash is a legal form of computer-generated currency. This currency can be securely purchased with conventional means: credit cards, checks, money orders, or wire transfers.

MicroMint. MicroMint is a proposal by Rivest and Shamir about coins that can only efficiently be produced in very large quantities and are hard to produce in small quantities. The validity of a coin is easily checked. MicroMint is optimized for unrelated low-value payments. It uses no public key operations. However, the scheme is very complex and would require a lot of initial and operational efforts. Therefore, it is unlikely that it ever will gain any practical importance.

A broker will issue new coins at the beginning of a period and will revoke those of the prior period. Coins consist of multiple hash collisions, i.e., different values that all hash to the same value. The broker mints coins by computing such hash collisions. For that process many computations are required, but more and more hash collisions are detected with continued computation. The broker sells these MicroMint coins in batches to customers. Unused coins can be returned to the broker at the end of a period, e.g., a month. Customers render MicroMint coins as payment to merchants.
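The minting and checking steps described above, as an added example that is not part of the original chapter or of the MicroMint proposal's own code: a coin is a set of K distinct values whose truncated hashes collide, the broker finds such sets by hashing many inputs into bins, and a merchant verifies a coin with K hash evaluations. The 12-bit hash length, K = 4, and binding the hash to a period label are assumptions chosen so the example runs instantly.

```python
import hashlib
from collections import defaultdict

K = 4          # values per coin (a K-way hash collision)
N_BITS = 12    # constructed toy hash length; real parameters would be far larger


def h(period: str, x: int) -> int:
    """SHA-256 truncated to N_BITS and bound to a validity period.

    Tying the hash to the period is an assumption of this example; it makes
    coins of a prior period fail verification once a new period starts.
    """
    data = period.encode() + x.to_bytes(8, "big")
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - N_BITS)


def mint(period: str, trials: int) -> list:
    """Broker side: hash `trials` inputs and keep one coin per full bin."""
    bins = defaultdict(list)
    for x in range(trials):
        bucket = bins[h(period, x)]
        if len(bucket) < K:          # extra hits on a full bin are discarded
            bucket.append(x)
    return [tuple(b) for b in bins.values() if len(b) == K]


def verify(period: str, coin) -> bool:
    """Merchant side: K distinct values that all hash to the same value."""
    if len(coin) != K or len(set(coin)) != K:
        return False
    return len({h(period, x) for x in coin}) == 1


def yield_curve(period: str, efforts) -> list:
    """Coins obtained for each amount of hashing work, as (trials, coins)."""
    return [(t, len(mint(period, t))) for t in efforts]


if __name__ == "__main__":
    # Few hashes yield almost nothing; the yield grows much faster than the work.
    for trials, coins in yield_curve("2002-11", [1024, 2048, 4096, 8192]):
        print(f"{trials:5d} hashes -> {coins:4d} coins")
    coin = mint("2002-11", 8192)[0]
    print("valid this period:", verify("2002-11", coin))
    print("valid next period:", verify("2002-12", coin))
```

The yield curve shows the economics the text describes: the first coins are expensive because bins rarely fill, while continued hashing fills many bins at once.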

THE FUTURE OF WEB SERVICES

In the future we will see the unleashing of a Web services phenomenon. This will involve the fulfillment of the dynamic Web service composition and orchestration vision, the appearance of personalized Web services, concepts of Web service management, and the development of Web service infrastructure as a reusable, reconfigurable, self-healing, self-managing, large-scale system.

Dynamic Web Services Composition and Orchestration

The vision of Web services intelligently interacting with one another and performing useful tasks automatically and seamlessly has yet to become reality. Major milestones have been achieved: XML as a syntactic framework and data representation language for Web services interaction; the Web infrastructure itself providing ubiquitous access to Web services; the emergence of global registration and discovery services; and the technology to support the creation and maintenance of Web services, just to name a few. However, major pieces such as the formalization and description of service semantics are yet to be developed. The effort of creating a semantic Web (Semantic Web, 2001) is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Ontologies define the structure, relationships, and meaning of terms appearing in service descriptions. The semantic Web vision is that these ontologies can be registered, discovered, and used for reasoning about Web service selection before undertaking business. Languages like DAML+OIL (DAML, 2001) have been developed in this context.

In addition, sending a document or invoking a method and getting a reply are the basic communication primitives. However, complex interactions between Web services will involve multiple steps of communication that are related to each other. A conversation definition is a sequencing of document exchanges (method invocations in the network object model) that together accomplish some business functionality. In addition to agreeing upon vocabularies and document formats, conversational Web services also agree upon conversation definitions before communicating with each other. A conversation definition consists of descriptions of interactions and transitions. Interactions define the atomic units of information interchange between Web services. Essentially, each service describes each interaction in terms of the documents that it will accept as input or will produce as output. The interactions are the building blocks of the conversation definition. Transitions specify the ordering amongst the interactions. Web services need to introspect other Web services and obtain each other’s descriptions before they start communicating and collaborating (Banerji et al., 2002).

RosettaNet (RosettaNet, 2002) is a nonprofit consortium of major information technology, electronic components, and semiconductor manufacturing companies working to create and implement industry-wide, open e-business process standards, particularly targeting business-to-business market places, workflow, and supply-chain management solutions. These standards form a common e-business language, aligning processes between supply-chain partners on a global basis. Several examples exist. The centerpiece of the RosettaNet model is the partner interface process (PIP). The PIP defines the activities, decisions, and interactions that each e-business trading participant is responsible for. Although the RosettaNet model has been in development, it will be a while until Web services start using them to undertake business on the Web.

Once these hurdles are overcome, the basis and platform for true Web services that will enable agent technologies merging into Web services to provide the envisioned dynamic Web service aggregation on demand according to users’ specifications will emerge.

Personalized Web Services

As Web service technology evolves, we anticipate that they will become increasingly sophisticated, and that the challenges the Web service community will face will also evolve to meet their new capabilities. One of the most important of these challenges is the question of what it means to personalize Web services. Personalization can be achieved by using user profiles, i.e., monitoring user behavior, devices, and context to customize Web services (Kuno & Sahai, 2002) for achieving metrics like quality of experience (QoE) (van Moorsel, 2001). This would involve providing and meeting guarantees of service performance on the user’s side. Personalization could also result in the creation of third-party rating agencies that will register user experiences, which could be informative for other first-time users. These rating mechanisms already exist in an ad hoc manner, e.g., eBay and Amazon allow users to rate sellers and commodities (books), respectively. Salcentral.com and bizrate.com are third-party rating agencies that rate businesses. These services could be also developed as extended UDDI services. These mechanisms will also render Web services more “customer-friendly.”

End-to-End Web Service Interactions

Web services are federated in nature as they interact across management domains and enterprise networks. Their implementations can be vastly different in nature. When two Web services connect to each other, they must agree on a document exchange protocol and the appropriate document formats (Austin, Barbir, & Garg 2002). From then on they can interoperate with each other, exchanging documents. SOAP defines a common layer for document exchange. Services can define their own service-specific protocol on top of SOAP. Often, these Web service transactions will span multiple Web services. A request originating at a particular Web service can lead to transactions on a set of Web services. For example, a purchase order transaction that begins when an employee orders supplies and ends when he or she receives a confirmation could result in 10 messages being exchanged between various services as shown in Figure 6.

Figure 6: SOAP messages exchanged between Web services. (Surviving message labels: 1 purchase order; 2 part of purchase order; 3 the other part of the purchase order; 10 purchase order confirmation.)

The exchange of messages between Web services could be asynchronous. Services sending a request message need not be blocked waiting for a response message. In some cases, all the participating services are like peers, in which case there is no notion of a request or a response. Some of the message flow patterns that result from this asynchrony are shown in Figure 7. The first example in Figure 7 shows a single request resulting in multiple responses. The second example shows a broker-scenario, in which a request is sent to a broker but responses are received directly from a set of suppliers.

These Web services also interact with a complex web of business processes at their back-ends. Some of these business processes are exposed as Web service operations. A business process comprises a sequence of activities and links as defined by WSFL and XLANG. These business processes must be managed so as to manage Web service interactions. Management of Web services thus is a challenging task because of their heterogeneity, asynchrony, and federation. Managing Web services involves managing business transactions by correlation of messages across enterprises (Sahai, Machiraju, & Wurster, 2001) and managing the business processes.

Also, in order to manage business on the Web, users will need to specify, agree on, and monitor service level agreements (SLAs) with each other. Thus, Web services will invariably have a large number of SLAs. As less human intervention is more desirable, the large number of SLAs would necessitate automating the process as much as possible (Sahai, Machiraju, Sayal, Jin, & Casati, 2002).

Web service to Web service interaction management can also be done through mediation (Machiraju, Sahai, & van Moorsel, 2002). Web service networks’ vision is to mediate Web service interactions, so as to make them secure, manageable, and reliable. Such networks enable versioning management, reliable messaging, and monitoring of message flows (e.g., Flamenco Networks, GrandCentral, Transact Plus, Talking Blocks).

Future Web Services Infrastructures

Deployment and operational costs are determinants in the balance sheets for Web service providers. Web

of Web services (installation and configuration of software and content data), the virtual wiring of machines into application environments independently of the physical wiring in a data center. They allow rearrangements of Web services’ applications among machines, the dynamic sizing of service capacities according to fluctuations in demands, and the isolation of service environments hosted in the same data center.

HP’s Utility Data Center (HP Utility Data Center, 2001) is such a platform. The HP Utility Data Center with its Utility Controller Software creates and runs virtual IT environments as a highly automated service optimizing asset utilization and reducing staffing loads. Resource virtualization is invisible to applications, sitting underneath the abstractions of operating systems.

Two types of resources are virtualized:

Virtualized network resources, permitting the rewiring of servers and related assets to create entire virtual IT environments; and

Virtualized storage resources, for secure, effective storage partitioning, and with disk images containing persistent states of application environments such as file systems, bootable operating system images, and application software.

Figure 8 shows the basic building blocks of such a utility data center with two fabrics for network virtualization and storage virtualization.

The storage virtualization fabric with the storage area network attaches storage elements (disks) to processing elements (machines). The network virtualization fabric then allows linking processing elements together in a virtual LAN.

Two major benefits for Web services management can be achieved on top of the infrastructure:

Automated Web services deployment—By entirely maintaining persistent Web services’ states in the storage system and conducting programmatic control over

Network Virtualization

Dynamic capacity sizing of Web services—By the ability to automatically launch additional service instances absorbing additional load occurring to the service. Service instances are launched by first allocating spare machines from the pool maintained in the data center, wiring them into the specific environment of the Web service, attaching appropriate storage to those machines, and launching the applications obtained from that storage. Web server farms are a good example for such a “breathing” (meaning dynamically adjustable) configuration (Andrzejak, Graupner, Kotov, & Trinks, 2002; Graupner, Kotov, & Trinks, 2002).

IBM’s Autonomic Computing vision is to provide for self-managing systems. The intent is to create systems that respond to capacity demands and system glitches without human intervention. These systems intend to be self-configuring, self-healing, self-protecting, and self-optimizing (IBM Autonomic Computing, 2002).

CONCLUSION

The Web services paradigm has evolved substantially because of concerted efforts by the software community. The genesis of Web services can be traced back to projects like e-speak, Jini, and TSpaces. Although progress has been made in Web service standardization, the full potential of Web services remains unrealized. The future will see the realization of Web services as a means of doing business on the Web, the vision of dynamic composition of Web services, personalized Web services, end-to-end management of Web service interactions, and a dynamically reusable service infrastructure that will adapt to variations in resource consumption.

GLOSSARY

Business process execution language for Web services (BPEL4WS): A standard business process description language that combines features from WSFL and XLANG.

Composition: Creating composite Web services when Web services outsource their functionalities to other Web services.

Conversation: A set of message exchanges that can be logically grouped together.

Description: Describing Web services in terms of the operations and messages they support, so that they can be registered and discovered at UDDI operator sites or by using WS-Inspection.

End-to-end management: Protocol required to track and manage Web service composition leading to a transaction being subdivided amongst multiple Web services.

Orchestration: Web service to Web service interaction that leads to the coupling of internal business processes.

Personalization: Personalizing or customizing Web services to user/client profiles and requirements.

Platform: One or more execution engines over which a Web service implementation is executed.

Service level agreement (SLA): An agreement that specifies quality-of-service guarantees between parties.

Simple object access protocol (SOAP): A standard for messaging between Web services.

Web service conversation language (WSCL): A language to describe Web service conversations.

Web services flow language (WSFL): A language to describe business processes.

CROSS REFERENCES

See Client/Server Computing; Common Gateway Interface (CGI) Scripts; Electronic Payment; Java; Perl; Personalization and Customization Technologies; Secure Electronic Transmissions (SET).

REFERENCES

Andrzejak, A., Graupner, S., Kotov, V., & Trinks, H. (2002). Self-organizing control in planetary-scale computing. In IEEE International Symposium on Cluster Computing and the Grid (CCGrid), 2nd Workshop on Agent-based Cluster and Grid Computing (ACGC). New York: IEEE.

Austin, D., Barbir, A., & Garg, S. (2002, 29 April). Web services architecture requirements. Retrieved November 2002 from http://www.w3.org/TR/2002/WD-wsa-reqs-20020429

Banerji, A., Bartolini, C., Beringer, D., Chopella, V., Govindarajan, K., Karp, A., et al. (2002, March 14). WSCL Web services conversation language. Retrieved November 2002 from http://www.w3.org/TR/wscl10

Bartel, M., Boyer, J., Fox, B., LaMacchia, B., & Simon, E. (2002, February 12). XML signature syntax and processing. Retrieved November 2002 from http://www.w3.org/TR/2002/REC-xmldsig-core-20020212

BEA Systems, Intalio, SAP AG, and Sun Microsystems (2002). Web Service Choreography Interface (WSCI) 1.0 Specification. Retrieved November 2002 from http://wwws.sun.com/software/xml/developers/wsci

Berners-Lee, T. (1996, August). The World Wide Web: Past, present and future. Retrieved November 2002 from http://www.w3.org/People/Berners-Lee/1996/ppf.html

Chaum, D. (1985). Security without identification: Transaction systems to make Big Brother obsolete. Communications of the ACM, 28.

CryptoLogic Ecash FAQ (2002). Retrieved November 2002 from http://www.cryptologic.com/faq/faq-ecash.html

DAML: The DARPA Agent Markup Language Homepage (2001). Retrieved November 2002 from http://www.daml.org

Digital Certificates, CCITT (1988). Recommendation X.509: The Directory—Authentication Framework.

ebXML: Enabling a global electronic market (2001). Retrieved November 2002 from http://www.ebxml.org

Ford, W., Hallam-Baker, P., Fox, B., Dillaway, B., LaMacchia, B., Epstein, J., & Lapp, J. (2001, March 30). XML key management specification (XKMS). Retrieved November 2002 from http://www.w3.org/TR/xkms

Glassman S., Manasse, M., Abadi, M., Gauthier P., Sobalvaro, P. (2000). The Millicent Protocol for Inexpensive Electronic Commerce. Retrieved November 2002 from http://www.w3.org/Conferences/WWW4/Papers/246/

Graupner, S., Kotov, V., & Trinks, H. (2002). Resource-sharing and service deployment in virtual data centers. In IEEE Workshop on Resource Sharing in Massively Distributed Systems (RESH’02). New York: IEEE.

Hallam-Baker, P., & Maler, E. (Eds.) (2002, March 29). Assertions and protocol for the OASIS Security Assertion Markup Language. Retrieved November 2002 from http://www.oasis-open.org/committees/security/docs/draft-sstc-core-29.pdf

Karp, A., Gupta, R., Rozas, G., Banerji, A. (2001). The Client Utility Architecture: The Precursor to E-speak, HP Technical Report. Retrieved November 2002 from http://lib.hpl.hp.com/techpubs/2001/HPL-2001-136.html

HP Utility Data Center: Enabling the adaptive infrastructure (2002, November). Retrieved November 2002 from http://www.hp.com/go/hpudc

Kim, W., Graupner, S., & Sahai, A. (2002, January 7–10). A secure platform for peer-to-peer computing in the Internet. Paper presented at 35th Hawaii International Conference on System Science (HICSS-35), Hawaii.

Kuno, H., & Sahai, A. (2002). My agent wants to talk to your service: Personalizing Web services through agents. Retrieved November 2002 from http://www.hpl.hp.com/techreports/2002/HPL-2002-114

IBM Autonomic Computing (n.d.). Retrieved from http://www.research.ibm.com/autonomic/

Leymann, F. (Ed.) (2001). WSFL Web services flow language (WSFL 1.0). Retrieved July 2003 from http://www.ibm.com/software/solutions/webservices/pdf/WSFL.pdf

Liberty Alliance Project (2002). Retrieved November 2002 from http://www.projectliberty.org/

Machiraju, V., Sahai, A., & van Moorsel, A. (2002). Web service management network: An overlay network for federated service management. Retrieved November 2002 from http://www.hpl.hp.com/techreports/2002/HPL-2002-234.html

Micropayments overview (2002). Retrieved November 2002 from http://www.w3.org/ECommerce/Micropayments/

Microsoft NET Passport (2002) Retrieved November

Reagle, J. (Ed.) (2000, October 6). XML encryption requirements. Retrieved November 2002 from http://lists.w3.org/Archives/Public/xml-encryption/2000Oct/att-0003/01-06-xml-encryption-req.html

RosettaNet (2002). Retrieved November 2002 from http://www.rosettanet.org

Sahai, A., Machiraju, V., Sayal, M., Jin, L. J., & Casati, F. (2002). Automated SLA monitoring for Web services. Retrieved November 2002 from http://www.hpl.hp.com/techreports/2002/HPL-2002-191.html

Sahai, A., Machiraju, V., & Wurster, K. (2001, July). Monitoring and controlling Internet based services. In Second IEEE Workshop on Internet Applications (WIAPP’01). New York: IEEE. [Also as HP Tech Rep.

TSpaces: Intelligent Connectionware (1999). Retrieved November 2002 from http://www.almaden.ibm.com/cs/TSpaces/

Van Moorsel, A. (2001). Metrics for the Internet Age—Quality of experience and quality of business. Retrieved November 2002 from http://www.hpl.hp.com/techreports/2001/HPL-2001-179.html

Weber, R. (1998). Chablis—Market analysis of digital payment systems. Retrieved November 2002 from University of Munich Web site: http://chablis.informatik.tu-muenchen.de/MStudy/x-a-marketpay.html

Irie WL040/Bidgoli-Vol III-Ch-62 June 23, 2003 16:43 Char Count= 0

Web Site Design

Robert E. Irie, SPAWAR Systems Center San Diego


Designing and implementing a Web site is increasingly becoming a complex task that requires knowledge not only of software programming principles but of graphical and user interface design techniques as well. While good design is important in regular software engineering and application development, nowhere is it more essential than in Web site development, due to the diverse and dynamic nature of Web content and the larger intended audience. This chapter will cover some of the issues involved with the two major components of a Web site, its design and implementation. The scope of this chapter is necessarily limited, as Web development is a rich and heterogeneous field. A broad overview of techniques and technology is given, with references to other chapters. The reader is directed to consult other chapters in this encyclopedia for more detailed information about the relevant technologies and concepts mentioned below. Occasionally links to Web sites will be given. They are either representative examples or suggestions for further reference, and should not be construed as an endorsement.

WEB SITE COMPONENTS

A Web site is an integration of three components, the content to be published on the Web, its presentation to the user, and the underlying programming logic. Each component has its own particular representation and role in shaping the overall user experience.

Content

The content consists of all relevant data that are to be published, or shown to the user. It usually constitutes the bulk of a Web site’s storage requirements and can be in the form of text, images, binary and multimedia data, etc. Static textual and graphic content can be stored as HTML pages, whereas multimedia files like videos and sound recordings are usually stored in large databases and served, in whole or in parts, by dedicated servers. Most of the discussion in this chapter will focus on the former type.

Presentation

The presentation component involves the user interface to the Web site and the manner in which content is displayed. Typical elements include the graphical and structural layout of a Web document or page, text and graphic styles to highlight particular content portions, and a mechanism for the user to navigate the Web site. Originally, files with HTML markups were used to store both content and information regarding its presentation. It is now common practice to store neither exclusively in HTML. HTML is primarily used to describe the structure of a Web document, by breaking down the page into distinct elements like paragraphs, headings, tables, etc. The actual textual content of the document can be stored separately in a database, to be dynamically inserted into the HTML page using programming logic. A separate file, called the style sheet, can be associated with the HTML page, and consequently the content, to affect the presentation. A style sheet file can describe how each structural element in an HTML file is displayed; sizes, colors, positions of fonts, blocks, backgrounds, etc., are all specified in a hierarchical organization, using the standard Web-based style sheet language, cascading style sheets (CSS) (Lie & Bos, 1999).

Logic

The programming logic determines which content to display, processes information entered by the user, and generates new data. It drives the interaction between the Web site and the user and is the glue that binds the content and its presentation. To be useful, it needs to access the content as well as its presentation information and handle user input accordingly. Logic is usually implemented as small programs, or scripts, that are executed on the Web server or the user’s browser. These scripts can be stored within the HTML page, along with the presentation and content, or separately as distinct program files that are associated with the content. There are several standard programming languages that can be used in writing scripts.

Separation of Components

With the existence of a variety of technologies, protocols, and standards, Web development is remarkably flexible, and there are often multiple ways of accomplishing the same task. This is both an asset and a liability, as while developers are free to choose their own techniques, it is very easy to create sloppy or undisciplined documents and code. In regular application development, it is important to adhere to sound software engineering techniques to manage a code base for future enhancements and simultaneous development efforts. The flexibility of Web development makes such good techniques even more critical.

Until very recently, there was a great deal of overlap, in terms of storage and implementation, of the three Web site components mentioned above. This led, for example, to Web pages that contained all three components in a single, often unmanageable, HTML file. As Web site development has matured, the principle of Web site component separation has become widely encouraged, if not accepted, and it is the central theme of this chapter.

IMPLEMENTATION ISSUES

The World Wide Web (WWW) is a series of client/server interactions, where the client is the user’s Web browser, and the server is a particular Web site. The WWW Consortium (W3C) defines the hypertext markup language (HTML) and the hypertext transfer protocol (HTTP) as the standard mechanisms by which content is published and delivered on the Web, respectively.

In essence, the local Web browser initiates HTTP requests to the remote Web server, based on user input. The Web server retrieves the particular content specified by the requests and transmits it back to the browser as an HTTP response. The Web browser then interprets the response and renders the received HTML content into a user-viewable Web page.

Web site implementations can be classified by the level of interactivity and the way content is stored, retrieved, and displayed.

Static Sites

Static sites are the simplest type of Web sites, with the content statically stored in HTML files, which are simple text files. Updating the Web site requires manually changing individual HTML text files. While this type of site was prevalent in the beginning, most sites, especially commercial ones, have come to incorporate at least some degree of dynamic behavior, and users have come to expect some interactivity.

Figure 1: Block diagram of a client/server architecture with a static Web site

Figure 1 shows the basic client–server interaction for a static Web site. The client browser makes an HTTP request to a Web server. The URL specifies the particular Web server and page. The Web server retrieves the requested Web page, which is an HTML file, from the file system and sends it back to the client through the HTTP response. This very basic interaction between browser and server is the basis for more complex, dynamic interactions. This type is static because the Web page content is straight HTML, statically stored on disk, with no mechanism to change the contents. The Web server here serves solely as a file transfer program.

Developing static sites requires very few tools. All that is required, besides the Web server and browser, is a text editor application. The simplest text editor can be used to manually create HTML files. Complex, graphical HTML editors can make the task almost trivial by automatically generating HTML files and behaving similarly to word processors, with WYSIWYG (what you see is what you get) interfaces. Creating graphics and images for static Web sites is also straightforward, requiring any typical paint or drawing program.

DYNAMIC SITES

Dynamic sites share the same basic architecture as static ones, but with the addition of programming logic. The two major types of dynamic sites reflect the place of execution of the scripts. Client-side scripting involves embedding actual source code in HTML, which is executed by the client browser in the context of the user’s computer. Server-side scripts, on the other hand, are executed on the Web server. While the following discussion examines both types separately, in an actual Web site both types can and often do exist simultaneously.

Client Side

Figure 2 shows the basic architecture for a dynamic site with client-side scripting. Scripts are embedded within HTML documents with the <script> </script> tags or stored in separate documents on the server’s file system. Scripts are transmitted, without execution, to the client browser along with the rest of the HTML page. When the client browser renders the HTML page, it also interprets and executes the client script. An example of a client-side script is the functionality that causes a user interface element, such as a menu, to provide visual feedback when the user moves the mouse pointer over a menu option.


Figure 2: Block diagram of a Web site interaction with client-side scripting

There are several client-side scripting languages, the most common one being JavaScript, an object-oriented language originally developed by Netscape. It is now a standardized language, defined by the international industry group European Computer Manufacturers Association, and called ECMAScript (European Computer Manufacturers Association, 1999). Netscape continues to use the term JavaScript, however, and Microsoft calls its implementation of ECMAScript for Windows browsers JScript. The other major scripting language is Microsoft’s VBScript, short for Visual Basic Scripting Edition, which is available only for Windows platforms (Champeon, 2001).

Regardless of the language, client-side scripts rely on a standard programming interface, defined by the W3C and called the Document Object Model (DOM), to dynamically access and update the content, structure, and style of Web documents (World Wide Web Consortium, 1998).

Cascading style sheets (CSS) is another W3C language standard that allows styles (e.g., fonts, colors, and spacing) to be associated with HTML documents. Any specific HTML tag or group of HTML tags can be modified. It is a language, separate from HTML, that expresses style in common desktop publishing terminology. The combination of HTML, CSS, and DOM client-side scripts is often referred to as dynamic HTML (Lie & Bos, 1999).

Client-side scripting is used primarily for dynamic user interface elements, such as pull-down menus and animated buttons. The advantage of using client-side scripts instead of server-side scripts for such elements is that the execution is more immediate. Since the script, once loaded from the server, is being executed by the browser directly on the user’s computer, there are no delays associated with the network or the server load. This makes the user interface responsive and similar to standard platform applications.

One of the disadvantages is that client-side scripting languages are usually more limited in functionality than server-side languages, so that complex processing is not possible. Such limitations are by design, for security reasons, and are not usually apparent for simple user interface programming.

Users may also specifically choose not to allow client-side scripts to execute on their computers, resulting in a partial or complete reduction in functionality and usability of a Web site. In general, it is recommended that a site incorporate user interface scripting only sparingly, and always with clear and functional alternatives.

Finally, because client-side programs, whether embedded or stored separately, must necessarily be accessible and viewable by the Web browser, they are also ultimately viewable by the user. This may not be desirable for

Figure 3: Block diagram of a Web site interaction with

Server Side

Figure 3 shows the basic architecture for a server-side dynamic site. Scripts are still stored in HTML documents on the server’s file system, but are now executed on the server, with only the program results and output being sent to the client browser, along with the rest of the HTML page. To the client browser, the HTTP response is a normal static HTML Web page. Scripts are embedded in HTML documents using special HTML-like tags, or templates, whose syntax depends on the particular server-side scripting language (Weissinger, 2000).

There are several common server-side scripting languages, including PHP, Active Server Pages (ASP), and Java Server Pages (JSP). The common gateway interface (CGI) is also a server-side scripting mechanism, whereby neither the Web content nor the programming logic is stored in an HTML file. A separate program, stored in the file system, dynamically generates the content. The Web server forwards HTTP request information from the client browser to the program using the CGI interface. The program processes any relevant user input, generates an HTML Web document and returns the dynamic content to the browser via the Web server and the CGI interface. This process is illustrated in Figure 4.
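The CGI exchange described above can be sketched as follows. This is an added example, not code from the chapter: it keeps content, presentation, and logic in separate pieces and HTML-encodes user input before it reaches the page. The `CONTENT` table, the `/site.css` path, and the `page` and `name` query parameters are assumptions made for illustration; `REQUEST_METHOD`, `QUERY_STRING`, and `GATEWAY_INTERFACE` are standard CGI variables.

```javascript
// Added example: a CGI-style program. The Web server passes request data in
// environment variables; the program answers with headers, a blank line, and
// an HTML document written to standard output.

// Content, kept apart from presentation and logic.
// Stands in for a database table (constructed sample data).
const CONTENT = {
  home: { title: 'Welcome', body: 'Static text stored outside the page template.' },
  news: { title: 'News', body: 'Latest announcements.' },
};

// Encode text for insertion into HTML so user input cannot become markup or script.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;',
  }[ch]));
}

// Presentation: the HTML structure. The style sheet is referenced, not embedded
// (/site.css is a hypothetical path).
function renderPage(title, body, visitor) {
  return [
    '<!DOCTYPE html>',
    '<html><head>',
    `<title>${escapeHtml(title)}</title>`,
    '<link rel="stylesheet" href="/site.css">',
    '</head><body>',
    `<h1>${escapeHtml(title)}</h1>`,
    `<p>${escapeHtml(body)}</p>`,
    visitor ? `<p class="greeting">Hello, ${escapeHtml(visitor)}.</p>` : '',
    '</body></html>',
  ].filter(Boolean).join('\n');
}

// Logic: choose content from the request and build the complete CGI response.
function cgiResponse(env) {
  const method = env.REQUEST_METHOD || 'GET';
  if (method !== 'GET') {
    return 'Status: 405 Method Not Allowed\r\nAllow: GET\r\n' +
           'Content-Type: text/plain\r\n\r\nGET only\n';
  }
  const params = new URLSearchParams(env.QUERY_STRING || '');
  const key = params.get('page') || 'home';
  // Accept only known keys; hasOwnProperty rejects inherited names such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(CONTENT, key)) {
    return 'Status: 404 Not Found\r\nContent-Type: text/plain\r\n\r\nNo such page\n';
  }
  const visitor = (params.get('name') || '').slice(0, 64); // bound the reflected input
  const { title, body } = CONTENT[key];
  return 'Content-Type: text/html; charset=utf-8\r\n\r\n' +
         renderPage(title, body, visitor) + '\n';
}

// When a Web server starts this program through CGI it sets GATEWAY_INTERFACE;
// only then is the response written to standard output.
if (typeof process !== 'undefined' && process.env.GATEWAY_INTERFACE) {
  process.stdout.write(cgiResponse(process.env));
}
```

The browser receives only the generated HTML; the `CONTENT` table and the selection logic never leave the server.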

Server-side scripting is used primarily for complex and time-consuming programming logic tasks, where immediacy of response is not as critical as with user interface elements. The advantage of using server-side scripts is the freedom and computational power that is available on the server; server-side scripts do not have the same security constraints as client-side scripts, and often have full access to the server machine’s file system and resources. The user may not disable execution of such scripts, so that the Web developer can reasonably expect that the Web site will behave exactly the same regardless of user configuration. Finally, any proprietary server-side source code is safely hidden from user view, as the client browser receives only the output of the script.

Figure 4: Block diagram of a Web site interaction with common gateway interface scripting

Server-side scripts have the disadvantage of requiring a request–response round trip between the client browser and the server, which leads to slower response times. Server-side scripting languages normally interact closely with the Web server, which imposes some compatibility constraints. The choice of a Web server, particularly a proprietary system, usually limits the selection of server-side scripting languages, and vice versa.

WEB APPLICATIONS

As a Web site becomes more complex, a robust and efficient mechanism for the separation of content, presentation, and logic is necessary. Web application servers are Web sites that are more interactive, access large amounts of data, and provide a rich functionality similar to that of desktop applications. Unlike desktop applications, where all components are stored and executed on the same computer, Web applications usually follow a three-tier client/server architecture (see Figure 5) consisting of the Web browser, the Web server, and a database. All content and logic are stored in the database and are retrieved and processed as necessary on the Web server. The presentation information can be embedded with the content or stored as a separate style sheet on the database or the server.

Usually a Web application server interfaces with a relational database, which stores data in rows of tables

Table 1: URLs of Various Web Resources

World Wide Web Consortium http://www.w3.org

Web Application Servers

BEA WebLogic http://www.beasys.com/products/weblogic

IBM WebSphere http://www.ibm.com/software/webservers/appserv

Macromedia ColdFusion http://www.macromedia.com/software/coldfusion

Apache Jakarta http://jakarta.apache.org

The major disadvantage of developing with Web application servers, besides the inherent complexity, is the necessity of learning a nonstandard or proprietary server-side programming interface or language. There are several major Web application servers that support standard programming languages such as Java and C++, but each has its own application programming interface (API). Table 1 lists some of the popular commercial and open source application servers (see Web Resources).

DESIGN ISSUES

Unlike implementation issues, which usually are straightforward to specify and quantify, design issues are much more subjective and are dependent on several factors, including the particular type of Web site and its purpose. Web site development efforts are often driven by conflicting objectives and considerations, and a balance must be maintained between business and financial concerns, which often stress the commercial viability and revenue-generating aspects of a Web site, and more user-centric design concerns, which usually deal with usability issues (Murray & Costanzo, 1999). Since the former are very domain-specific, only the latter will be discussed in this chapter. In the discussion that follows, references to sample Web sites will be given.

USABILITY ISSUES

The goal of designing a Web site with usability issues in mind is to ensure that the users of the site find it usable and useful. Specifically, a Web site should be accessible, appealing, consistent, clear, simple, navigable, and forgiving of user errors (Murray & Costanzo, 1999).

The first step in designing any Web site should be the determination of the purpose of the site. Too often the rush to incorporate the latest Web technology or standard prevents a thorough examination and determination of the most important factor of the Web site, its intention or purpose. Most Web sites in essence are information dissemination mechanisms; their purpose is to publish useful content to as wide an audience as possible. Others also have a commercial component, with the buying and selling of goods or services. Still others foster a community or group activity and are used as collaboration devices.

The Web site’s purpose should drive the design and implementation efforts. A Web site advertising or describing a company’s products will most likely need eye-catching graphical designs and images. A commerce site will need to consider inventory mechanisms and secure transactions. A community site will need to solve problems involving simultaneous collaboration of a distributed group of users.

It is also important to consider the intended audience of a Web site. There is a wide range in browser capabilities and user technical competencies that must be taken into account. A Web site geared toward a younger, more technically inclined audience may contain highly interactive and colorful designs, whereas a corporate site might want to have a more professional, businesslike appearance. It is generally a good practice, if not essential, to consider accessibility issues for all users, including those who do not have access to high-end graphics-capable browsers.

BASIC WEB SITE TYPES

Just as there are several implementation classifications for Web sites, we can also classify them based on their purpose. Each type will lead to different choices in the content, presentation, and logic components and require emphasis on different usability issues. A single Web site may incorporate features of more than one basic type.

News/Information Dissemination

This type of Web site design is geared toward providing informational content to the Web user. The content is usually textual in form, with some graphics or images. The presentation of the content and its navigation are kept as clear and consistent as possible, so that the user will be able to quickly access the desired information. Not surprisingly, newspaper companies usually have Web sites with online news content (e.g., http://www.nytimes.com).

to incorporate more community-like features to prompt users to return to their sites (e.g., http://www.yahoo.com).

Community

Community sites foster interaction among their users and provide basic collaboration or discussion capabilities. Message boards, online chats, and file sharing are all typical functionalities of community sites. The open source software movement has promoted numerous Web sites based on this type (e.g., http://www.sourceforge.net).

Search

There is a lot of overlap between this type of Web sites and portals. Like portals, search sites provide a mechanism by which users discover other Web sites to explore. Some sophisticated programming logic, the search engine, forms the foundation of this type of Web site. Search sites often emphasize simple, almost minimalist interfaces (e.g., http://www.google.com).

E-commerce

This type of site is often a component of other Web site types and allows users to purchase or sell goods and services in a secure manner. Since potentially large amounts of currency are involved, security is an important consideration, as well as an interface that is tolerant of potential user errors. An example of a successful commerce site with elements of a community is http://www.ebay.com.

Company/Product Information

With widespread Web use, having an official Web presence is almost a requirement for corporations. Such sites usually serve purposes similar to those of informational and e-commerce sites, but with a more focused interface, reflecting the corporate image or logo (e.g., http://www.microsoft.com).

Entertainment

This type of site is usually highly interactive and stresses appealing, eye-catching interfaces and designs. Typical applications include online gaming sites, where users may play games with each other through the site, and sporting event sites, where users may view streaming content in the form of video or audio broadcasts of live events (e.g., http://play.games.com).

BASIC DESIGN ELEMENTS

There is obviously no single best design for a Web site, even if one works within a single type. There are, however, some guidelines that have gained common acceptance. Like any creative process, Web site design is a matter of tradeoffs. A typical usability tradeoff is between making an interface appealing and interactive and making it clear and simple. The former usually involves graphical designs with animations and client-side scripting, whereas the latter favors minimal text-based interfaces. Where a particular Web site belongs on the continuous spectrum between the two extremes depends on its intended purpose and audience, and should be a subjective, yet conscious decision.

The safest design decision is to always offer alternatives, usually divided into high- and low-bandwidth versions of the same Web pages, so that the user experience can be tailored to suit different preferences. The major disadvantage of this is the increase in development time and management requirements.

Accessibility/Connectivity

The two major factors affecting accessibility and connectivity issues are the bandwidth of the user’s network connection, and the particular graphical capabilities of the user browser. Low-bandwidth connections to the Internet are still very common in homes. By some measures, dialup modems are still used in 90% of all homes that regularly access the Internet (Marcus, 2001). This requires Web site designers either to incorporate only small, compressed images on their sites, or to provide alternative versions of pages, for both high- and low-bandwidth users.

Some user browsers do not have any graphics capability at all, for accessibility reasons or user preference. For example, visually impaired users and PDA (personal digital assistant) users most often require accessibility consideration. Estimates of the number of disabled users range from 4 to 17% of the total online population (Solomon, 2000). PDA and mobile phone Internet usage is relatively new in the United States, but is already approaching 10 million users (comScore Networks, 2002). For such users, designing a separate text-only version of the Web site is a possibility. What would be better is to design a Web site that contains automatic browser-specific functionality degradation. An example is to associate relevant textual content to graphical images; graphical browsers may display the images, while text browsers may display the descriptions.

Consistent Page Layout

One of the most important design elements for a Web site is a consistent page layout. While every single page does not need to have the same layout, the more consistent each page looks, the more straightforward it is for the user to navigate through the site and the more distinctive the Web site appears. A typical Web page layout utilizes parts or all of an artificially defined border around the content (see Figure 6).

Originally, HTML frames or tables were the standard way of laying out a page, and they are still the preferred method for most developers. However, the W3C clearly is favoring the use of cascading style sheets (CSS) for page layout (World Wide Web Consortium, 2002). CSS also provides a mechanism for associating styles, such as color, font type and size, and positioning, with Web content, without actually embedding them in it. This is in keeping with the principle of separating content from its presentation.

Figure 6: A typical layout scheme for a Web page

Consistent Navigation Mechanism

Web site navigation is an important component of the design, and a consistent navigation mechanism supplements a page layout and makes the user experience much simpler and more enjoyable.

One of the best ways of providing a consistent navigation mechanism is to have a menu bar or panel that is consistent across all pages of the site. Such a menu can be a static collection of links, or a dynamic, interactive component similar to that of a desktop application. Figure 7 is an example of a simple and consistent navigation scheme that utilizes two menu panels. The top panel (with menu items A, B, C) is similar to a desktop application’s menu bar and is a global menu that is consistent throughout the site and refers to all top-level pages of a site. The left side panel is a context-dependent menu that provides further options for each top-level page. This type of navigation scheme can be seen on several public Web sites (e.g., http://www.python.org).

While there are no absolute rules or guidelines for good navigation elements, they usually provide visual feedback (e.g., mouse rollover effects), have alternate text displays (for nongraphical or reduced capability browsers), and are designed to be easy to use as well as learn.

MISCELLANEOUS INTERFACE ISSUES

The following are miscellaneous interface design issues. Again, only suggestions for possible design choices are offered, and they will not be applicable in all circumstances.

Figure 7: An example of a consistent navigation scheme for a Web site


Graphics

The two major interface issues concerning graphics are size and color. As the majority of Web users still access Web sites over slow modem links, it is important to use graphic images that are of reasonable size, to prevent excessive download delays. For photograph images, the JPEG format offers a good compromise between lossy compression size and image quality, with an adjustable tradeoff point. For line art and solid color images, lossless compression is preferred, and the proprietary GIF format is common, although the open standard PNG format is gaining in acceptance (Roelofs, 2000).

The issue of colors is a complex one and depends on many factors. In general, Web graphic designers work with 24-bit colors, with 256 possible values for each of three color channels, red, green, and blue (RGB). Until recently, the majority of users’ computers and Web browsers could only support a palette, or set, of 256 colors simultaneously. To ensure that colors appear uniformly across platforms and browsers, a “Web-safe palette” of 216 colors was established, consisting of combinations of six possible values, or points, for each of three color channels (6 possible reds × 6 possible greens × 6 possible blues = 216 possible colors) (Niederst, 2001).

Recently, browsers and systems with 24-bit and 16-bit support have drastically increased and now account for about 94% of all users (Lehn & Stern, 2000). Twenty-four-bit color support results in the full display of the designer’s color scheme. Sixteen-bit color systems are sometimes problematic, as they nonuniformly sample the three color channels (5 bits for red, 6 bits for green, and 5 bits for blue) and provide a nonpalettized approximation of 24-bit color.
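The two color schemes described above can be worked through in code. This is an added example, not part of the original chapter: it enumerates the 216-color Web-safe palette, snaps an arbitrary 24-bit color to it, and shows what a 5-6-5 display does to a color. The function names are hypothetical, and the truncate-then-replicate-bits conversion is an assumption about one common way 16-bit hardware approximates 24-bit color.

```javascript
// Added example: Web-safe palette and 16-bit (5-6-5) color approximation.

// The six Web-safe values per channel: 0, 51, 102, 153, 204, 255.
const WEB_SAFE_LEVELS = [0x00, 0x33, 0x66, 0x99, 0xcc, 0xff];

// Parse "#rrggbb" (or "rrggbb") into [r, g, b], each 0..255.
function parseHex(hex) {
  const m = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (!m) throw new RangeError(`not a 24-bit hex color: ${hex}`);
  const n = parseInt(m[1], 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// Format [r, g, b] as lowercase "#rrggbb".
function toHex(rgb) {
  return '#' + rgb.map((v) => v.toString(16).padStart(2, '0')).join('');
}

// All 6 x 6 x 6 = 216 combinations of the six levels.
function webSafePalette() {
  const palette = [];
  for (const r of WEB_SAFE_LEVELS) {
    for (const g of WEB_SAFE_LEVELS) {
      for (const b of WEB_SAFE_LEVELS) palette.push(toHex([r, g, b]));
    }
  }
  return palette;
}

// Snap each channel to the nearest Web-safe level (the levels are multiples of 51).
function toWebSafe(hex) {
  return toHex(parseHex(hex).map((v) => Math.round(v / 51) * 51));
}

// Simulate a 16-bit display: keep the top 5 bits of red, 6 of green, 5 of blue,
// then expand back to 8 bits per channel by replicating the high bits
// (assumed conversion; real hardware may round differently).
function through565(hex) {
  const [r, g, b] = parseHex(hex);
  const r5 = r >> 3;
  const g6 = g >> 2;
  const b5 = b >> 3;
  const packed = (r5 << 11) | (g6 << 5) | b5; // the 16-bit pixel value
  const shown = toHex([
    (r5 << 3) | (r5 >> 2),
    (g6 << 2) | (g6 >> 4),
    (b5 << 3) | (b5 >> 2),
  ]);
  return { packed, shown };
}
```

Because green keeps one more bit than red and blue, a neutral gray such as `#808080` comes back with unequal channels, and even a Web-safe color such as `#336699` is not reproduced exactly.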

Layout Styles

A comprehensive guide to layout styles is beyond the scope

of this chapter The major design decision is between

hav-ing page layouts of fixed or variable size (Niederst, 2001)

By default, HTML documents are variable-sized, in that

text and graphics positioning and line breaks are not

deter-mined by the user’s monitor resolution and browser

win-dow size Since a wide variety of sizes and resolutions is

almost a given, having a variable-sized page layout allows

flexible designs that scale to the capabilities and

prefer-ences of each user The disadvantage is that because each

user experience is different, and elements can be resized

or repositioned at will, it is difficult to design a consistent

and coherent interface; there is the possibility that some

configurations lead to poor or unusable interfaces

The alternative to the default variable-sized page layout is to explicitly design the size and position of some or all of the elements of a Web document. An example of this would be to limit the width of all content in a page to fit within a certain horizontal screen resolution, such as 640 pixels. All text and graphics will remain stationary even if the user resizes the browser window to greater than 640 horizontal pixels. The advantage of this method is that designing an interface is much more deterministic, so the Web designer will have some degree of control over the overall presentation and is reasonably certain that all users will have the same experience accessing the site. The disadvantage is that the designer must pick constants that may not be pleasant or valid for all users. For example, a Web page designed for a 640 × 480 resolution screen will look small and limited on a 1280 × 1024 screen, whereas a Web page designed for an 800 × 600 screen would be clipped or unusable for users with only a 640 × 480 screen.

Actually implementing either type of page layout can be done with HTML frames, tables, or CSS style sheets, or some combination of the three. Although using style sheets is the currently preferred method for page layout, browser support is still poor, and many sites still use frames or tables (Niederst, 2001).

Search Engines

A search engine is a useful tool to help users quickly find particular content or page as the content of a Web site increases, or the navigation scheme becomes complicated. The search engine is a server-side software program, often integrated with the Web server, that indexes a site’s Web content for efficient and quick retrieval based on a keyword or phrase. Search engines are available with a variety of configurations, interfaces, and capabilities. A good resource that summarizes the major commercial and open source engines is the Search Tools Web site (http://www.searchtools.com).

Cross-Browser Support

Designing a Web site that is consistent across multiple browsers and platforms is one of the most challenging tasks a developer faces. Even different versions of the same browser are sometimes incompatible. At the minimum, the three major browsers to consider are Internet Explorer (IE), Netscape Navigator (NN), and text-based browsers such as Lynx.

For the most part, browser development and capabilities have preceded the establishment of formal standards by the W3C, leading to numerous incompatibilities and nonuniform feature support. The latest versions of the two common browsers (IE 6, NN 6.2) offer complete support for the current W3C standard HTML 4.01. However, the more common, earlier versions of the browsers (versions 4+ and 5+) had only incomplete support.

Even more troublesome was their support of the W3C standard Document Object Model (DOM) Level 1, as each has historically taken a different track and implemented its own incompatible DOM features (Ginsburg, 1999). In general, NN’s DOM support is much closer to the “official” W3C DOM Level 1 specification, whereas IE has several extensions that are more powerful, but are available only on Windows platforms. The latest versions of the two browsers have alleviated some of this problem by supporting, as a baseline, the complete Level 1 specification.

WEB RESOURCES

Table 1 summarizes some useful online resources for Web site development. They are only suggestions and should not be considered comprehensive or authoritative.

CONCLUSION

This chapter has given an overview of Web site development, including the design and implementation aspects. This field is very dynamic, and technologies and practices are constantly changing. More complex object-oriented programming paradigms and generalized markup languages are gaining widespread acceptance and use. XML (extensible markup language), XHTML (extensible HTML), XML-RPC (XML remote procedure call), SOAP (simple object access protocol), and SVG (scalable vector graphics) are examples of such new standards. However, the basic principles of clarity, consistency, and conciseness are still applicable to the design of all sites regardless of type or technology.

The Web development profession is also a rapidly changing field. No longer is it feasible to have one person perform all design and implementation duties. A team of Web developers, graphic designers, and database administrators is usually required, with each member responsible for the three components of Web site development: content management, content presentation, and programming logic. However, it is still important to be aware of all issues in order to work effectively in a Web development team.

GLOSSARY

Client/server architecture A process by which multiple computers communicate. The client initiates all communication with the server in the form of a request and receives the results in the form of a response. For Web sites, the user’s browser is the client requesting content or services from the Web server.

Database A repository of information. The data are stored in a structured way to be easily and efficiently retrieved. Two popular types of databases are the relational database and the object-oriented database. Each has advantages and disadvantages with respect to efficiency, rich associations between information, etc.

Hypertext A mechanism by which related content (text, graphic, multimedia, etc.) is associated using links. A hypertext document allows the user to easily access relevant content in a seamless, integrated context, as opposed to traditional, sequentially viewed documents.

Hypertext markup language (HTML) A standard language for publishing content on the World Wide Web. HTML defines a set of markups, or tags, that are embedded in a Web document and provide structural, stylistic, and content information.

Uniform resource locator (URL) The explicit format for a reference to a hypertext document. It is in the form protocol://server:port/path. The protocol can be any of several standard Internet communications protocols, with HTTP being the most common for Web pages. By default, Web servers communicate using a standard port number, 80. In such cases the URL can be shortened to protocol://server/path.

User Anyone accessing the Web site, using a Web browser. A related term, user interface, refers to the entire environment (text, graphics, and user input and response) that builds the experience for the user interacting with the site.

Web site The integration of hypertext content, presentation information, and controlling logic, that forms the user experience. Implemented on a Web server, its purpose is usually to disseminate information, foster collaboration, or obtain user input. It is the basic unit of discussion in this chapter and will refer to both the user experience and the actual implementation.

World Wide Web (WWW) A network of hypertext documents, existing on Web servers and accessible via the Internet using computer programs called Web browsers.

CROSS REFERENCES

See Client/Server Computing; Databases on the Web; HTML/XHTML (HyperText Markup Language/Extensible HyperText Markup Language); Usability Testing: An Evaluation Process for Internet Communications.

REFERENCES

Champeon, S. (2001). JavaScript: How did we get here? Retrieved April 16, 2002, from http://www.oreillynet.com/pub/a/javascript/2001/04/06/js history.html

comScore Networks (2002). Ten million Internet users go online via a cellphone or PDA, reports Comscore Media Metrix. Press Release. Retrieved August 30, 2002, from http://www.comscore.com/news/cell pda 082802.htm

European Computer Manufacturers Association (1999). Standard ECMA-262: ECMAScript language specification. Retrieved April 2, 2002, from ftp://ftp.ecma.ch/ecma-st/Ecma-262.pdf

Ginsburg, P. E. (1999). Building for 5.0 browsers. Retrieved May 10, 2002, from http://builder.cnet.com/webbuilding/pages/Authoring/Browsers50

Lehn, D., & Stern, H. (2000). Death of the websafe color palette? Retrieved April 20, 2002, from http://hotwired.lycos.com/webmonkey/00/37/index2a.html?tw=design

Lie, H. W., & Bos, B. (1999). Cascading style sheets, level 1. W3C recommendation. Retrieved April 2, 2002, from http://www.w3.org/TR/REC-CSS1

Marcus, B. (2001). Wireless, broadband penetration continues. Retrieved May 1, 2002, from http://www.digitrends.net/nwna/index 15935.html

Murray, G., & Costanzo, T. (1999). Usability and the Web: An overview. Retrieved April 16, 2002, from http://www.nlc-bnc.ca/9/1/p1-260-e.html

Niederst, J. (2001). Web design in a nutshell (2nd ed.). Sebastopol, CA: O’Reilly & Associates.

Roelofs, G. (2000). PNG, MNG, JNG, and Mozilla M17. Retrieved August 30, 2002, from http://www.libpng.org/pub/png/slashpng-2000.html

Solomon, K. (2000). Smart biz: Enabling the disabled. Retrieved May 1, 2002, from http://www.wired.com/news/print/0,1294,39563,00.html

Weissinger, A. K. (2000). ASP in a nutshell (2nd ed.). Sebastopol, CA: O’Reilly & Associates.

World Wide Web Consortium (1998). Document object model (DOM) level 1 specification. Retrieved April 2, 2002, from http://www.w3.org/TR/REC-DOM-Level-1

World Wide Web Consortium (2002). Hypertext markup language (HTML) home page. Retrieved August 28, 2002, from http://www.w3.org/MarkUp


DeNoia WL040/Bidgolio-Vol I WL040-Sample.cls June 20, 2003 17:57 Char Count= 0

Wide Area and Metropolitan Area Networks

Lynn A DeNoia, Rensselaer Polytechnic Institute

Facilities and Infrastructure

Differences around the World

Switching, Routing, and Signaling

Carriers and Service Providers

Class of Service, Quality of Service

In today’s social, political, and economic environment, individuals and organizations communicate and operate over ever-increasing geographic distances. This means that access to and sharing of information and resources must extend beyond the “local” office, building, or campus out across cities, states, regions, nations, continents, and even beyond the planet. Bridging this diversity of distances in ways that satisfy application requirements for speed, capacity, quality, timeliness, etc. at reasonable cost is no simple challenge, from either a technical or a business perspective. In this chapter we concentrate on the main elements required to meet such a challenge in wide area and metropolitan area networks.

HISTORY AND CONTEXT

Definitions

The public networking arena has typically been divided into two segments with the following characteristics:

Metropolitan area networks (MANs) are built and operated by service providers (SPs) who offer network services to subscribers for a fee, covering distances up to tens of miles, often within or surrounding a major city. MANs are often built by telecommunication companies such as local exchange carriers (LECs) or by utility companies. A recent alternative using Ethernet for the MAN has spawned a new category of companies called Ethernet LECs or ELECs.

Wide area networks (WANs) are built and operated by SPs who offer network services to subscribers for a fee, covering distances up to hundreds or thousands of miles, such as between cities, across or between countries, across oceans, etc. WANs designed for voice are usually built by telecommunication companies known in the United States as interexchange carriers (IXCs). WANs for data are also called public data networks (PDNs).

By contrast, local area networks (LANs) are typically built and operated as private networks, by individuals or enterprises, for their own use. In addition, landlords operating as building LECs (BLECs) may offer LAN services to tenants. In either case, the geographic scope of a LAN is usually limited to a building or campus environment where all rights of way for cabling purposes belong to the individual/enterprise/landlord. The boundaries between LANs and MANs and WANs began to blur as geographic limitations of networking technologies were extended with increasingly capable implementations over fiber-optic cabling. Even the distinctions between private and public networks became more difficult to draw with the advent of “virtual private network” equipment and services.

Challenges

The number of options and choices available to network designers in both the subscriber and provider communities continues to grow for both MANs and WANs. Multiple technologies and standards, increasing numbers and types of applications, higher expectations for MAN and WAN performance comparable to (or at least approaching) that found in a LAN environment, and pressure to keep unit costs low, all combine to create enormous challenges for MAN and WAN builders. Infrastructure choices must last long enough, not just for cost recovery, but to achieve return on investment. Service providers must marry new technologies to their existing installed base, create smooth transitions (e.g., for network upgrades, new service roll-outs) with minimal disruption to customer services, and add or enhance services to meet advancing customer expectations, all in an environment of increasing economic and competitive pressure. Many providers have begun to recognize that their long-term survival depends on a strategy of simplification—reducing the complexity (to have fewer technologies, fewer equipment vendors, fewer equipment types, fewer management systems, etc.) of their infrastructure while maintaining the flexibility to adapt to changing application, user, and competitive requirements.


The pressure to simplify is constantly at odds with the difficulties of predicting the future:

Which technology will provide the best flexibility and scalability at an acceptable investment cost?

How fast and in what ways will application needs and user expectations develop?

Which services or enhancements will provide competitive advantage?

How can value be added to those elements moving downward into the commodity market?

The ability to develop shrewd answers to such questions is likely to determine which companies will thrive in the networking services business.

Functional Requirements

The basic function that subscribers seek from MAN and WAN service providers is the ability to deliver traffic from one place to another (point-to-point) or to multiple others (multipoint). This begins with connectivity. For the network in Figure 1, traffic can flow from A to C and/or D, but not to B. Once connectivity is established, the network must have sufficient capacity in bandwidth and switching to get the traffic from the source to its intended destination. Subscribers want services that are reliable, as measured by the percentage of time network resources are available when needed and by the amount of traffic (preferably none) that gets lost. Subscribers also want services that perform well enough so that their traffic gets delivered in a timely fashion, with minimal delay (low latency is particularly important for delay-sensitive traffic such as voice or video). Providers, on the other hand, want an infrastructure that is cost-effective, manageable, and capable of supporting revenue generation and profits.

Evolution and Coexistence

The first WANs were built from circuit-switched connections in the telephone system because that’s what was available to cover the distances involved. Circuit switching continues to be useful, particularly when the computer devices being connected need to exchange messages in real time or with guaranteed delivery. For occasional traffic, dial-up connections similar to an individual telephone call are used. For continuous traffic or when applications cannot tolerate the delay involved in call setup, circuits are leased from a telephone company and “nailed up” into permanent connections. For two connected locations the leased line is called a point-to-point connection (Figure 2a). More than two locations can be connected with a multipoint link (Figure 2b) if a sharing discipline is imposed to prevent traffic from one source interfering with traffic sent from another at the same time. In either case, the resources required to carry traffic across the leased line are dedicated to the particular subscriber, creating an effectively private connection through the service provider’s public network resources.

Figure 2: Connections, a) point-to-point and b) multipoint-to-point

Two devices connected by a leased line may or may not send traffic continuously, wasting capacity when the line is idle. If there are multiple devices in one location to be connected to one or more devices in a destination location, a single leased line may be shared using a device at each end of the line called a multiplexer. Statistical multiplexing allows more devices to be connected than the capacity of the line could support in real time if all were to transmit simultaneously. This is called oversubscription. On the average, it is quite likely that only some devices will be active, and the line is shared effectively with little traffic delay and less wasted capacity. However, when many devices are active, performance can be degraded. The sending multiplexer adds a label to each unit of traffic transmitted; the receiver reads (and removes) the label to figure out which device is the intended recipient and switches the traffic onto the appropriate output link. Packet switching is a form of statistical multiplexing.

Originally circuit switching was designed to carry analog voice traffic and packet switching was designed for digital data. Today, however, public networks convert all types of traffic into digital form for cost-effective transport. We could say that “bits are bits,” whether they belong to voice, data, video, or some other application. The same network might well be used to deliver multiple types of bits, instead of having distinct networks dedicated for voice, data, etc. This is the concept of convergence, where a single network carries various types of traffic. In the context of convergence, the important question shifts from whether circuit or packet switching is better, to what support a network must provide so that traffic delivery meets user expectations and application requirements. Convergence is certainly not new, because in early WANs, digital data were transformed into analog signals and carried over public networks that had been designed for voice. Today convergence is available through many more options for what traffic to combine and how to do it.
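The statistical multiplexing and oversubscription described earlier in this section can be made concrete with a small slotted simulation. This Python sketch is an added example, not part of the original chapter; the tick-based line model, the function name `simulate`, and the payload strings are assumptions chosen for illustration.

```python
"""Statistical multiplexing on one shared leased line (constructed illustration)."""
import random
from collections import deque


def simulate(n_devices, line_capacity, p_active, ticks, seed=1):
    """Slotted model of a labelled statistical multiplexer.

    Each tick, every device offers one unit of traffic with probability
    p_active. The line carries at most line_capacity units per tick, so
    n_devices > line_capacity means the line is oversubscribed.
    """
    rng = random.Random(seed)
    queue = deque()                               # units waiting for the line
    outputs = {d: [] for d in range(n_devices)}   # receiver's output links
    delays = []
    offered = 0
    for t in range(ticks):
        # Sending multiplexer: add a label naming the destination link
        # (device d talks to peer d in this model) and queue the unit.
        for d in range(n_devices):
            if rng.random() < p_active:
                queue.append((d, t, f"dev{d}-t{t}"))
                offered += 1
        # Shared line: first-in, first-out, limited capacity per tick.
        for _ in range(min(line_capacity, len(queue))):
            label, t_in, payload = queue.popleft()
            # Receiving multiplexer: read and remove the label, then switch
            # the bare payload onto the matching output link.
            outputs[label].append(payload)
            delays.append(t - t_in)
    return {
        "offered": offered,
        "delivered": len(delays),
        "backlog": len(queue),
        "mean_delay": sum(delays) / len(delays) if delays else 0.0,
        "max_delay": max(delays, default=0),
        "outputs": outputs,
    }


if __name__ == "__main__":
    # 8 devices share a line that carries 4 units per tick (2:1 oversubscribed).
    for p in (0.25, 0.5, 0.75, 1.0):
        r = simulate(n_devices=8, line_capacity=4, p_active=p, ticks=1000)
        print(f"p_active={p:4.2f} offered={r['offered']:5d} "
              f"delivered={r['delivered']:5d} backlog={r['backlog']:5d} "
              f"mean_delay={r['mean_delay']:8.2f} max_delay={r['max_delay']}")
```

At low activity the queue stays nearly empty and the shared line behaves like a dedicated one; once the average offered load exceeds the line capacity, the backlog and delay grow with every tick, which is the degradation the text describes.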

FACILITIES AND INFRASTRUCTURE

Digital Transmission

The heritage of digital WANs dates from the early 1960s, when the Bell System first introduced the T-carrier system of physical components to support transport of digital signals in the United States. The accompanying time-division multiplexed (TDM) digital signal scheme, called a digital hierarchy, was based on a standard 64-kilobits per second (Kbps) signal designed to carry one analog voice signal transformed by pulse-code modulation (PCM) into digital form. This basic unit is known as DS0. The International Telecommunication Union (ITU) now supports an entire set of digital signaling standards (Table 1), incorporating elements from the North American (United States/Canada), European, and Japanese standard hierarchies.

The traditional U.S. multiplexing hierarchy began with combining 24 DS0-level signals into one DS1. It is commonly called a T1 stream, and consists of a sequence of 24 channels combined to create one frame. Each channel is filled with 8 bits (an octet or byte) representing one PCM sample. A particular challenge of the time was to ensure synchronization between transmitter and receiver, which can be accomplished in several ways. For example, each frame could be introduced by a unique starting sequence of 12 bits to allow receiver synchronization to be renewed on a frame by frame basis. The U.S. designers decided instead to distribute the 12 bits over 12 frames, reducing transmission overhead at the expense of receiver complexity. The 12-frame sequence was called a superframe. With improved hardware, synchronization is more easily maintained over longer periods, and an extended superframe (ESF) has replaced the superframe. ESF comprises 24 frames but only needs 6 bits for synchronization, free-

Table 1 Digital Signal Hierarchy

Designation   Capacity (Mbps)   Number of DS0s

manage-

In the European scheme (also used by other countries such as Mexico), the basic E1 stream aggregates 32 PCM channels. Rather than adding synchronization bits, E1 dedicates the first PCM channel for synchronization and the 17th for management and control signaling.
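The T1 superframe trade-off described above (one synchronization bit per frame, at the cost of a receiver that must watch 12 frames to lock on) is shown in the following Python sketch. It is an added example, not part of the original chapter. It assumes a 193-bit frame with the framing bit placed first, 8000 frames per second (one 8-bit sample per 64-Kbps channel), and the commonly documented D4 framing pattern 100011011100; the sample values are constructed.

```python
"""T1 (DS1) superframe framing: constructed illustration."""

CHANNELS = 24
FRAME_BITS = CHANNELS * 8 + 1          # 24 octets plus one framing bit
FRAMES_PER_SECOND = 8000               # assumed: 64 Kbps / 8 bits per sample
# Assumed 12-bit superframe pattern (commonly documented D4 sequence),
# sent one bit per frame over 12 consecutive frames.
SF_PATTERN = [1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0]


def line_rate_bps():
    """Aggregate T1 rate implied by the frame layout."""
    return FRAME_BITS * FRAMES_PER_SECOND


def build_stream(frames):
    """Transmitter: serialise frames, each a list of 24 PCM octets."""
    bits = []
    for k, samples in enumerate(frames):
        if len(samples) != CHANNELS:
            raise ValueError("a T1 frame carries exactly 24 channels")
        bits.append(SF_PATTERN[k % 12])            # one framing bit per frame
        for octet in samples:
            bits.extend((octet >> s) & 1 for s in range(7, -1, -1))
    return bits


def find_alignment(bits, min_frames=12):
    """Receiver: find (bit offset, superframe phase) candidates.

    For each of the 193 possible offsets, sample every 193rd bit and test
    whether the samples follow the superframe pattern at some phase. This
    is the extra receiver work the distributed pattern requires.
    """
    candidates = []
    for offset in range(FRAME_BITS):
        n = (len(bits) - offset) // FRAME_BITS     # whole frames available
        if n < min_frames:
            continue
        fbits = [bits[offset + FRAME_BITS * k] for k in range(n)]
        for phase in range(12):
            if all(fbits[k] == SF_PATTERN[(phase + k) % 12] for k in range(n)):
                candidates.append((offset, phase))
    return candidates


def extract_channel(bits, offset, channel):
    """Demultiplex one channel's octets once frame alignment is known."""
    out = []
    for f in range((len(bits) - offset) // FRAME_BITS):
        start = offset + FRAME_BITS * f + 1 + 8 * channel
        value = 0
        for b in bits[start:start + 8]:
            value = (value << 1) | b
        out.append(value)
    return out


def demo(skew=500, ramp_channel=5):
    """Constructed data: two superframes, receiver joins mid-stream."""
    frames = []
    for k in range(24):
        samples = [c + 1 for c in range(CHANNELS)]   # constant per channel
        samples[ramp_channel] = k                    # one changing channel
        frames.append(samples)
    stream = build_stream(frames)[skew:]             # lose the first bits
    candidates = find_alignment(stream)
    if len(candidates) != 1:
        raise RuntimeError(f"ambiguous alignment: {candidates}")
    offset, phase = candidates[0]
    return {
        "offset": offset,
        "phase": phase,
        "ramp": extract_channel(stream, offset, ramp_channel),
        "channel0": extract_channel(stream, offset, 0),
    }


if __name__ == "__main__":
    print("T1 line rate:", line_rate_bps(), "bps")
    print(demo())
```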

Optical Fiber Systems

Service providers first used digital multiplexing within their own networks (e.g., trunking between Central Offices), to improve the return on and extend the life of their copper cable infrastructure investments. By the 1980s, however, interest had shifted to fiber optics for longer distance, higher speed communications. Standards were defined for the Synchronous Optical Network (SONET in the United States, equivalent to the Synchronous Digital Hierarchy, SDH, in Europe and elsewhere) to carry TDM traffic cost-effectively and reliably over metropolitan and wide area distances. Today SONET specifies both a standard optical interface signal and a digital signaling hierarchy tailored to the fiber transmission environment. The hierarchy is based on an 810-octet frame transmitted every 125 microseconds (µs) to create synchronous transport signal-level 1 (STS-1) for electrical signals. Each octet is equivalent to a 64-Kbps PCM channel. For fiber transmission, the STS-1 equivalent is optical carrier-level 1 (OC-1). Higher level signals are formed from specific multiples of OC-1 (Table 2). Each SONET frame is structured into transport overhead and a synchronous payload envelope (SPE), which consists of both path overhead and payload. It is only the payload portion that carries subscriber traffic to be routed and delivered through the SONET network.

The major building blocks for SONET networks are the point-to-point multiplexer, and for point-to-multipoint configurations, the add-drop multiplexer (ADM). In particular, the ADM allows traffic to be dropped off and the resultant free capacity to be reused to carry traffic entering the network at that point. SONET ADMs can also be employed to create highly survivable networks that maximize availability using diverse routing and self-healing, survivable ring structures. Figure 3a shows a dual-ring structure where the network accommodates loss of a link by looping traffic back on each side of the break, and Figure 3b shows how loss of a network node can be handled similarly. SONET has been deployed extensively by service providers in metropolitan areas to create highly reliable and scalable transport capabilities. Once the fiber and switching equipment are in place, transport capacity can be increased by installing higher-speed signaling interfaces.

Table 2 Basic SONET Levels

Designation   Line rate   SDH equivalent

Figure 3: SONET ring survivability, a) lost link and b) lost node

Another approach to increasing the capacity of fiber systems has become available with advances in optical component technology. Rather than using the entire range of wavelengths that can be carried over fiber as a single transmission channel, newer equipment allows us to divide the range into multiple channels for simultaneous transmission using wavelength-division multiplexing (WDM). This is quite similar to sending multiple television channels over a coaxial cable. Channels must be spaced far enough apart to limit the interference between adjacent signals that would degrade signal quality. In coarse WDM (CWDM) the channels are widely spaced; for dense WDM (DWDM), they are very close together (spacing ≤ 25–50 GHz). By combining WDM and high-speed signaling, transmission capacities of OC-192, OC-768, and greater become possible, limited primarily by the quality of existing fiber installations.

Access Technologies

In order to get traffic in and out of a MAN or WAN, subscribers must have physical connections, or access, to the appropriate service provider’s network resources. In the regulated telecommunications environment of the United States, this typically means acquiring connectivity from a LEC to tie the subscriber’s physical premises to a WAN service provider’s (i.e., IXC’s) equipment as physically located in a point of presence (POP). In a metropolitan area, a single company may be allowed to provide both local exchange connections and MAN services. The primary means of accessing MAN and WAN service provider networks are described below.

Dial-up

Dial-up access is appropriate for occasional connections of limited duration, as for making a telephone call. Where the physical facilities used for dial-up were designed and installed to support analog voice traffic, two characteristics are particularly important for data networking:

Digital data must be converted to analog using a modem at the subscriber end and reconverted to digital by a modem at the provider end of the connection.

Data rates are limited by the analog frequency range accepted at provider receiving equipment and by the signal modulation techniques of the modem devices. The most widely accepted standards today support maximum data rates of 56 Kbps.

Leased Line

Leased-line access is more appropriate for connections that need to be continuous and/or of better quality for higher-speed transmission. Such facilities are dedicated to the use of a specific subscriber. For example, a business may lease a T1 access line as its basic unit of connection capacity (1.544 Mbps), particularly for access to Internet service providers. Fractional-T1 and multiple-T1 lines are also available in some areas. A newer technology designed to be digital from end to end over copper cabling, called digital subscriber line (DSL), is beginning to be offered as a lower-cost alternative to the traditional T-carrier. Leased-line access requires matching equipment at each end of the line (subscriber and service provider) to ensure transmission quality suitable to the desired data rates.

Wireless

Wireless access is growing in popularity among mobile individuals who do not work from a fixed desktop in a single building location (e.g., salespeople, customer service representatives, and travelers). Rather than having to find a suitable “land-line” telephone connection with an analog data port to connect the modem, wireless users have either wireless network interface cards or data interface cables that connect their modems to cellular telephones. Both approaches require proximity to a wireless receiving station of matching technology that is then connected to the wired resources making up the remainder of the MAN or WAN.

Cable Modem

Cable modem access is provided by cable television companies who have expanded their business into data networking. A modem designed to transmit data signals over coaxial, broadband television cable is connected, usually via Ethernet technology, to the subscriber’s internal network or computer equipment. In residential applications, subscribers in a neighborhood typically share data networking capacity on the aggregate cable that carries traffic back to the provider’s central service location. This is different from a leased-line approach, where access capacity is dedicated from each subscriber location all the way to the provider’s POP. In the United States, cable providers are regulated differently from other public telecommunications providers, and may not suffer the same consequences for unavailable service.

Management

Management for MANs and WANs typically began with proprietary systems sold to service providers by each manufacturer of telecommunications switching equipment. Networks composed of equipment from multiple vendors thus contained multiple management systems. Equipment management and service management functions are often tied together by an Operations Support System (OSS) in order to automate operations (e.g., performance monitoring), administration (e.g., ordering and billing), maintenance (e.g., diagnostics, fault detection and isolation), and provisioning (OAM&P) functions. Many service providers tailored what they could acquire as a basic OSS in order to accommodate their own specific sets of equipment and services, making it difficult to share information, provide consistent management data in a multiprovider environment, and keep up to date with new functional requirements. This often leaves customers who need services from multiple providers without a single, coherent view of their enterprise WAN resources.

Beginning in 1988, the Telecommunication Standardization sector of the International Telecommunication Union (ITU-T, formerly the Consultative Committee on International Telephony and Telegraphy, CCITT) set about establishing the precepts for a standard Telecommunications Management Network (TMN). While the concept of a TMN encompasses the entire set of OAM&P applications in the network, what they do, and how they communicate, ITU-T standards focus on the information required and how it should be communicated rather than how it is processed (M.3000 recommendation series). Two types of telecommunications resources are encompassed: managed systems (such as a switch), which are called network elements (NE), and management systems, usually implemented as operations systems (OS). TMN standards are organized into interface specifications that define the interconnection relationships possible between resources. Figure 4 shows the relationship between the TMN and the telecommunication network for which it is responsible.

TMN is based on the Open Systems Interconnection (OSI) management framework, using object-oriented principles and standard interfaces to define communication for purposes of managing the network. The primary interface specification, Q3, allows direct communication with an OS. Any network component that does not implement Q3 may not access an OS directly, but must go through a mediation device (MD) instead. Legacy equipment and systems that rely on proprietary ASCII messages for communication are accommodated by means of a Q-adapter (QA) that can translate between messages representing the legacy information model and the object-oriented representation expected in today’s TMN.

Figure 4: TMN and the network it manages.

TMN defines a layered architecture (ITU-T standard M.3010) as a logical model for the functions involved in managing a telecommunication network effectively (Table 3). The object is to create a framework for interoperability across heterogeneous operation systems and telecommunication networks that is flexible, scalable, reliable, easy to enhance, and ultimately, inexpensive to operate. Standard management services have been defined for alarm surveillance (Q.821), performance management (Q.822), traffic management (Q.823), ISDN service profile management (Q.824), call detail recording (Q.825), and routing management (Q.826).

Differences around the World

Creating and operating WANs or MANs in different tries may present challenges well beyond identifying a ser-vice provider and getting connections established A par-ticular type of service may not be available in the desiredlocation, or a single provider may not offer services in ev-ery location, or the capacity required may not be available.Such differences may be due to telecommunication infras-tructure of varying ages and technologies, or to differentregulations on service offerings in various countries Forexample, T1 service is readily available in most U.S cities.Mexico, however, employs the European standard hier-archy Thus E1 service would need to be ordered (if it isavailable) to connect a business location in Mexico to one

coun-in the United States, and the differences coun-in capacity andframing would have to be handled appropriately by thenetwork equipment at each end of the link

Table 3: TMN Architecture

Logical layer: Functional responsibilities

Business management: Provides an enterprise view that incorporates high-level, business planning and supports setting goals, establishing budgets, tracking financial metrics, and managing resources such as products and people.

Service management: Provides the basic contact point for customers (provisioning, billing and accounting, troubleshooting, quality monitoring, etc.) as well as for service providers and other administrative domains.

Network management: Provides an overall view of the network resources, end to end, based on the information from below about network elements and links. Coordinates activities at the network level and supports the functional requirements of service management.

Element management: Provides a view of individual network elements or groupings into subnetworks. Element managers (OSs) are responsible for subsets of all network elements, from the perspective of TMN-manageable information such as element data, event logs, and activity. Mediation devices belong in this layer, communicating with OSs via the Q3 interface.

Network elements: Presents the TMN-manageable information of individual network resources (e.g., switches, routers, Q-adapters).

In some countries, telecommunication is a regulated industry subject to many government-imposed rules, and there may be no or a limited choice of carriers. Other countries have begun to deregulate, so that multiple carriers compete for subscriber business, often creating more choices in technology and services, as well as better pricing. In either case, service availability may differ from one location to another: DSL access might be easily obtained in greater Boston, but not be available in a rural area; T1 service might be acquired readily in Singapore but perhaps not everywhere in New York City.

Do not make the mistake, however, of assuming that more highly developed areas or countries always have better service options than developing ones. An established metropolitan area experiencing rapid growth in demand for telecommunications may be less able to adapt or expand existing cable and switching capacity to meet new orders than a new suburban business park where there is plenty of room to install new cables and switches to provide higher-speed services. Similarly, developing countries that have very little investment in old infrastructure may be able to skip generations of technology, installing the latest where there was previously none. Economics tend to dictate that this does not happen uniformly, but rather emphasizes locations more likely to provide rapid payback for the particular technology investment (e.g., urban rather than rural, business rather than residential, and high-density population areas). Often it is the access infrastructure that lags behind, because the upgrade costs cannot be amortized across multiple subscribers the way backbone investments can. This is especially true where the end-points are individuals with more limited budgets than business or organizational enterprises.

SWITCHING, ROUTING, AND SIGNALING

Network Architecture

MANs and WANs are usually divided into three logical segments (Figure 5). Access typically includes the customer premises equipment (CPE) located in a subscriber’s building or office area and the link that physically connects from there to the service provider’s point of presence. This link is connected to a device at the edge of the service provider’s network, and the edge device is connected to devices that compose the core (also called the backbone) of the service provider’s network. Different technologies are often used in the access and core portions, with the edge required to translate between the two. The ratio of the aggregate input capacity from all subscriber connections to an edge device to the output capacity from the edge into the core describes the degree of oversubscription. For example, if the sum of all access links is 200 Mbps and the core link is 100 Mbps, then the oversubscription ratio is 2:1. A ratio less than or equal to 1 is called nonblocking; the network performance for values greater than 1 depends on the bursty nature of data traffic to minimize the probability that traffic will be delayed excessively (by buffering) or discarded (when buffers become full).


DeNoia WL040/Bidgolio-Vol I WL040-Sample.cls June 20, 2003 17:57 Char Count= 0

WIDE AREA AND METROPOLITAN AREA NETWORKS

Some form of packet switching is employed in most core data networks today to move traffic through the network. Various techniques are used to meet customer expectations for reliable, timely, and effective delivery of traffic to its intended destination. For example, a virtual circuit can be established to approximate the service characteristics available in a circuit-switching environment, such as guaranteed delivery of packets in the same order as they were transmitted. However, virtual circuits do not dedicate resources along the path from source to destination, so the network must have sufficient intelligence to keep traffic moving well enough to meet subscriber expectations.

Choosing the best place to put network intelligence (at the edge or in the core) has been a subject of ongoing discussion among service providers for many years. For example, packets could be examined and labeled at the edge in a way that forwarding decisions in the core are made by simple, high-speed switches. This approach would provide very fast core transit, but the cost of many intelligent edge devices could be high and core switches must still be smart enough to accommodate and adapt to changes in network topology or conditions. An alternative approach makes the edge devices quite simple and inexpensive, while requiring the core to have the intelligence and take the time to understand the characteristics and accommodate the transport needs of the traffic.

Switching Technologies

In the OSI Reference Model, switching takes place at Layer 2, the Data Link Layer. However, much of the WAN switching technology for data networking was developed from experience with X.25, an ITU-T packet-switching protocol standard developed in the 1970s to support public data networking, and still in use today. X.25 creates a connection-oriented network out of packet-switching resources by employing virtual circuits to handle packet flow, keeping the data link layer simpler but requiring circuits to be established before packets can be sent. Circuits that are prebuilt from a source to a particular destination and then left in place are permanent virtual circuits (PVCs), while switched virtual circuits (SVCs) are established only on demand. SVCs are like dial-up connections, requiring circuit establishment to the specified destination for each call before traffic can flow.

X.25

X.25 is a three-layer protocol suite (Figure 6). The OSI network layer equivalent is the packet-layer protocol (PLP), which has operational modes for call establishment, data transfer, and call termination, plus idle and restarting operations. These functions are implemented through the services of a data link protocol called the Link Access Procedure, Balanced (LAPB), which is responsible for framing data and control commands and for basic error checking through use of a frame-check sequence (Figure 7). During call establishment, the PLP sets up SVCs using X.121 standard addresses. These include the international data number (IDN), made up of a four-digit data network identification code (DNIC, to specify the packet-switching network containing the destination device) and a national terminal number (NTN) consisting of as many as 10 digits. The NTN specifies the exact destination device to which packets will be forwarded.

Figure 6: X.25 protocol suite. (Diagram labels: PLP; LAPB; X.21bis, EIA/TIA-232, EIA/TIA-449, EIA-530, G.703.)

Figure 7: LAPB frame format. (Fields in order: Flag (frame delimiter); Address (command or response indicator); Control (frame type, sequence #, function); DATA; FCS (frame check sequence); Flag (frame delimiter).)

Frame Relay

Frame relay is the most widely used packet-switching WAN technology going into the 21st century. As WAN facilities became more reliable during the 1980s, interest rose in streamlining X.25 to improve performance and efficiency. Frame relay (FR) was thus designed as a Layer-2 protocol suite, with work begun by CCITT in 1984. However, it was not until 1991, when several major telecommunication equipment manufacturers formed a consortium called the Frame Relay Forum (FRF) to work out interoperability issues and foster acceptance, that frame relay began to be more widely deployed. In particular, FRF defined extensions to the CCITT work called the local management interface (LMI) to improve service providers’ abilities to provision and manage frame relay services.

Frame relay networks (Figure 8) are based on the concepts of data-terminal equipment (DTE) and data circuit-terminating equipment (DCE) first defined by X.25. Subscriber hosts, servers, workstations, personal computers, and terminals connected to a frame relay network are all considered to be DTE. The DCE is usually built as an interface into the service provider’s packet-switching equipment (PSE) rather than just being a modem at the edge of an X.25 network. Frame relay also uses virtual circuits to create a bidirectional communication path between a pair of DTE devices. FR virtual circuits are distinguished by data link connection identifiers (DLCIs), which may have local significance only, meaning that each end of a single virtual circuit could have a different DLCI assigned by the FR service provider.

Figure 8: Frame relay network elements. (Diagram labels: FR network, DTE, DCE, PSE.)

The format for frame relay data combines LAPB’s address and control fields into one 16-bit address field that contains the 10-bit DLCI, an extended addressing indicator bit (for future use), a command/response bit that is not used, and congestion control information. To minimize network overhead, the congestion control mechanisms are quite simple:

one forward-explicit congestion notification (FECN) bit that tells a DTE that congestion occurred along the path in the direction from the source to the destination;

one backward-explicit congestion notification (BECN) bit that tells a DTE that congestion occurred along the path in the direction opposite to the transmission from the source to the destination; and

one discard-eligibility (DE) bit to indicate whether this is a lower priority frame that may be discarded before others in a congested situation.
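An added example (not from the original text) that decodes the 16-bit address field described above and tallies the FECN, BECN, and DE bits per DLCI, flagging any DLCI that is not provisioned on the link. The bit positions follow the standard two-octet Q.922 layout, which the text does not spell out; the sample headers and the provisioned set are constructed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrAddress:
    dlci: int    # 10-bit data link connection identifier
    cr: bool     # command/response bit (carried, not used by the network)
    fecn: bool   # forward-explicit congestion notification
    becn: bool   # backward-explicit congestion notification
    de: bool     # discard eligibility


def parse_fr_address(header: bytes) -> FrAddress:
    """Decode the two-octet address field at the start of a frame."""
    if len(header) < 2:
        raise ValueError("address field needs 2 octets")
    b0, b1 = header[0], header[1]
    # Extended-address bit: 0 in the first octet, 1 in the final octet.
    if (b0 & 0x01) or not (b1 & 0x01):
        raise ValueError("not a 2-octet address field (unexpected EA bits)")
    # Upper 6 DLCI bits sit in octet 1, lower 4 bits in octet 2.
    dlci = ((b0 >> 2) << 4) | (b1 >> 4)
    return FrAddress(
        dlci=dlci,
        cr=bool(b0 & 0x02),
        fecn=bool(b1 & 0x08),
        becn=bool(b1 & 0x04),
        de=bool(b1 & 0x02),
    )


def build_fr_address(dlci, cr=False, fecn=False, becn=False, de=False) -> bytes:
    """Inverse of parse_fr_address, used here to construct test headers."""
    if not 0 <= dlci <= 1023:
        raise ValueError("DLCI must fit in 10 bits")
    b0 = ((dlci >> 4) << 2) | (int(cr) << 1)           # EA = 0
    b1 = ((dlci & 0x0F) << 4) | (int(fecn) << 3) | (int(becn) << 2) \
        | (int(de) << 1) | 0x01                        # EA = 1
    return bytes([b0, b1])


def summarize(headers, provisioned):
    """Count frames and congestion bits per DLCI; list unprovisioned DLCIs."""
    stats = {}
    unexpected = set()
    for header in headers:
        addr = parse_fr_address(header)
        entry = stats.setdefault(
            addr.dlci, {"frames": 0, "fecn": 0, "becn": 0, "de": 0})
        entry["frames"] += 1
        entry["fecn"] += int(addr.fecn)
        entry["becn"] += int(addr.becn)
        entry["de"] += int(addr.de)
        if addr.dlci not in provisioned:
            unexpected.add(addr.dlci)
    return stats, sorted(unexpected)


# Constructed sample: address fields from five frames seen on one access link.
SAMPLE_HEADERS = [bytes.fromhex(h) for h in ("1841", "1845", "184b", "0401", "7c41")]
PROVISIONED = {16, 100}   # hypothetical DLCIs ordered from the provider

if __name__ == "__main__":
    per_dlci, unknown = summarize(SAMPLE_HEADERS, PROVISIONED)
    for dlci, counts in sorted(per_dlci.items()):
        print(dlci, counts)
    print("unprovisioned DLCIs:", unknown)
```

A rising BECN count on a DLCI tells the sender on that circuit to slow down, while frames on a DLCI outside the provisioned set point to a misconfigured or unexpected circuit.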

As a packet-switching technology, frame relay also depends on the bursty nature of data traffic to make efficient use of its transmission facilities for larger numbers of subscribers than could be served with physically dedicated connections. The ability to overbook resources is fundamental to the service provider’s business model, as well as being a benefit to subscribers, who may be able to insert traffic occasionally at a higher rate than nominal for their access link (called bursting).

Integrated Services Digital Network (ISDN)

Integrated services digital network (ISDN) is a set of telecommunication standards first developed from the perspective of telephony networks to accommodate multiple types of traffic such as voice, fax, data, alarm systems, and video, all in digital format, over a single network. The goal was to develop standard interfaces, both for access and within the network, that would allow all types of digital traffic to be transported end to end, reliably, and in a timely fashion according to the needs of its application. The best-known elements of ISDN are the user interface definitions for connecting subscriber equipment to the network: the primary rate interface (PRI), intended to replace T1 and E1 services, and the basic rate interface (BRI), designed with multiple channels for voice or data traffic from an individual subscriber.

Asynchronous Transfer Mode (ATM)

Asynchronous transfer mode (ATM) was selected as the OSI Layer-2 transport technology for broadband ISDN (B-ISDN) in 1988. It was designed to be useful across WAN, MAN, and LAN communications, as well as to accommodate multiple types of traffic in a single network (voice, data, video, etc.) and scale for very large networks. Other design goals included the abilities to support a variety of media types (e.g., fiber and copper), leverage signaling standards already developed for other technologies, promote low-cost switching implementations (potentially one-tenth the cost of routing), adapt readily to future network requirements, and enable new, large-scale applications. The challenges inherent in such a diverse set of goals brought together designers from many different backgrounds, and resulted in a rather complex architecture (Figure 9).

Basically, ATM is a connection-oriented, packet-switching technology that uses fixed-length packets called cells. The 53-byte cell size (5 bytes of header information and 48 bytes for the payload) was chosen as a compromise between the optimal size for voice traffic and the larger size preferred for data applications. The fixed size and format mean that very fast switches can be built across a broad range of transmission rates, from megabits to gigabits per second and beyond. ATM interfaces are often characterized by their equivalent optical-carrier levels whether they employ fiber or copper media. The most popular interfaces tend to be OC-3, OC-12, and OC-48 (Table 2), according to their application in WANs, MANs, or LANs.

Figure 9: ATM reference model. (Diagram labels: the OSI Reference Model layers Application, Presentation, Session, Transport, Network, Data Link, Physical; ATM Adaptation Layer, ATM Layer, Physical Layer; Higher Layers; Management Plane; Control Plane.)

An important feature of ATM is the definition of service categories for traffic management:

Constant Bit Rate (CBR) was designed to emulate traditional circuit-switched connections. It is characterized by minimum and maximum cell rates specified at the same, constant value. Typical CBR applications include uncompressed voice and video, or television, all sensitive to both delay and delay variation.

Variable Bit Rate real-time (VBR-rt) and non-real-time (VBR-nrt) are characterized by specified minimum and maximum cell rates, much like frame relay. Typical applications include compressed voice or video, and multimedia e-mail. VBR-rt handles applications sensitive to delay variation, while VBR-nrt is suitable for bursty traffic.

Unspecified Bit Rate (UBR) handles traffic on a best-effort basis, without guaranteeing delivery or any particular rate. This is used to carry data (such as store-and-forward e-mail) not sensitive to delay. In a highly congested network situation, UBR cells may be discarded so that the network can meet its traffic contracts for the other types.

Available Bit Rate (ABR) is characterized by a guaranteed minimum cell rate, but may offer additional bandwidth when network resources are available. Rate-based flow control provides the adjustment mechanism. When it is offered, ABR is often preferred for data traffic.

ATM’s service categories are crucial to meeting user demands for quality of service (QoS), which generally means guaranteed, timely delivery of traffic to match the needs of particular applications. An ATM end system will request a particular level of service for traffic entering the network, forming a traffic contract with the network. The ATM switches throughout the network are responsible for meeting the terms of the contract by traffic shaping (using queues to smooth out traffic flow) and by traffic policing to enforce the limits of the contract. The capabilities of ATM to provide QoS end to end across a network for multiple types of traffic simultaneously are the most sophisticated to date, and distinguish ATM from other packet-switching technologies. Its suitability for LAN, MAN, and WAN applications makes ATM especially popular with service providers, because they can use one technology throughout to manage their own infrastructure and to support a large variety of service offerings to their customers.

Fiber Distributed Data Interface (FDDI)

Fiber distributed data interface (FDDI) was developed by the American National Standards Institute (ANSI) in the mid-1980s as a 100-Mbps standard for ring-based networks that had outgrown their capacity to handle high-speed workstations or provide nonblocking backbone connections. It was designed originally to expand the typical LAN environment, using a timed token access method for sharing bandwidth at OSI Layer 2 and read-

(Diagram labels: start of data flow, end of data flow, head of Bus A, head of Bus B, Bus B.)

Phys- a highly reliable network with deterministic, predictable performance. FDDI was the first LAN technology suitable for distances beyond a building or small campus, and was used by some to cover the geographic scope of a MAN.

Distributed Queue Dual Bus (DQDB)

Distributed queue dual bus (DQDB) was also developed during the 1980s, specifically to address the needs of metropolitan area networking for integrated services such as voice, data, and video. The IEEE 802.6 working group finally ratified it as a Layer-2 standard in 1990. As its name suggests, DQDB specifies a network topology of two unidirectional buses that are able to interconnect multiple nodes (Figure 10). The supporting physical layer for DQDB initially offered various transmission interfaces and speeds from DS3 (45 Mbps) to STM-1 (155 Mbps). The idea of DQDB was that multiple subnetworks could be interconnected to form a MAN, with the goal of supporting connectionless and connection-oriented data transfers, along with isochronous traffic, sharing the total communication capacity available.

DQDB may be most familiar as the basis for definition of switched multimegabit data service (SMDS) packet-switched public data networks. SMDS was designed by Bell Communications Research (Bellcore) for high-speed, connectionless delivery of data beyond the LAN. Its variable frame size up to 9188 octets is large enough to encompass as payload any of the popular LAN technology frames (i.e., Ethernet, token ring, and FDDI). The SMDS interface protocol was defined as a three-level protocol that specifies how subscribers access the network. As a service, SMDS was intended to be independent from any underlying transport technology. Thus it was first offered at DS1 to DS3 access speeds, with a goal of increasing later to OC-3.

Ethernet

Ethernet became the dominant LAN technology in the latter 1990s, as extensions from the original 10 Mbps were defined for 100 Mbps, then 1,000 Mbps (= 1 Gbps), and became widely deployed. In the same time period, new communication companies with no telephony heritage began laying optical fiber, and leasing capacity for short-haul (i.e., MAN) or long-haul (i.e., WAN) connections rather than selling services, as was typical in public networks. This meant that customers could specify the technology used to put bits on the medium rather than subscribing only to specific services offered by providers. As advances in optics and use of switching allowed Ethernet to cover even greater distances, the geographic limits that distinguished LAN from MAN technologies began to disappear. In fact, new providers sprang up offering Ethernet connectivity from the business doorstep to other locations across town or beyond. The great competitive question was whether Ethernet MANs could be made as reliable and fault-tolerant as more traditional MAN/WAN technologies built over SONET.

Figure 11: Routing within and between autonomous domains. (Diagram label: router.)

Resilient Packet Ring (RPR)

Resilient packet ring (RPR) is an effort begun by the IEEE 802.17 working group in late 2000 to design a high-speed access protocol combining familiar Ethernet interfaces with the fault-tolerance and rapid restoration capability of ring-based MAN technologies like SONET. RPR defines a new medium access control (MAC sublayer of OSI Layer 2) protocol that extends Ethernet framing from the LAN into the MAN/WAN environment. As seen by the RPR Alliance (an industry consortium designed to promote adoption of RPR), this approach combines the cost-effective scalability of Ethernet access interfaces with a MAN that can be optimized for rapidly increasing volumes of data traffic. Because it focuses on the MAC sublayer, RPR is independent of the underlying Layer-1 technology, making it suitable to run over much of the MAN infrastructure already in place.

Routing Technologies

In the OSI Reference Model, routing takes place at Layer 3, the Network Layer. Essentially routing consists of three major functions: maintaining information about the network environment, finding a path through the network from particular sources to destinations, and forwarding packets at each relay point. The Internet protocol (IP) is the dominant method of interconnecting packet-switched networks (i.e., for internetworking) at Layer 3. It provides connectionless network services (CLNS), with no guarantee of delivery or packet ordering, and is widely used today for private and public LANs, MANs, and WANs, including the Internet. IP is primarily concerned with the format for packets (also called datagrams), the definition and structure of addresses, a packet-forwarding algorithm, and the mechanisms for exchanging information about conditions in and control of the network.

Routing responsibility in an internetwork is divided between intradomain or interior routing protocols (IRPs) and interdomain or exterior routing protocols (ERPs) as shown in Figure 11. IRPs are used for internetworks that belong to a single administrative authority, such as an enterprise LAN, a single service provider’s MAN, or a private WAN. ERPs are used when routers tie together networks belonging to multiple independent authorities, as in the Internet. These protocols differ in how much information is kept about the state of the network and how routing updates are performed using the mechanisms defined by IP.

IP Version 4 (IPv4)

IP version 4 (IPv4) was defined by the Internet Engineering Task Force (IETF) for the original ARPAnet and published as (Request for Comments) RFC 791 in 1981. It specifies that each interface capable of originating or receiving internetwork traffic be identified by a unique 32-bit address consisting of an ordered pair containing a network identifier (net ID) and a host/interface identifier (host ID). Three primary classes of network addresses (A, B, and C) were designed to promote efficient routing, with additional classes defined for special or future uses (Figure 12). Although the Internet is not centrally managed, it was necessary to establish a single authority to assign addresses so that there would be no duplicates or conflicts.

Figure 12: IPv4 addressing format. (Diagram: address classes A, B, C, D, and E by bit position, showing the leading class bits and the Net_ID, Host_ID, and multicast address fields.)

As the Internet grew through the 1980s, a number of limitations in the design of IPv4 became apparent. The allocation of addresses, especially classes A and B, tended to be wasteful. For example, a single class B address assigned to one organization accommodates one network with over 64,000 IP interfaces—much larger than is practical or needed for most, meaning that a lot of address space can be wasted. On the other hand, a single class C address accommodates only 255 interfaces, which is too small for most organizations, requiring them to have more than 1. From a routing perspective, the two-level hierarchical address structure means that routers need to keep track of over 16 million net IDs just for class C networks, as well as calculate paths through the Internet to each one. A number of schemes were developed to solve some of the addressing and router problems (subnet masking, classless interdomain routing or CIDR), but those were not the only issues. Rising interest in using the Internet to carry voice, video, multimedia application, and commercial transaction traffic increased the demand for security and quality of service support, neither of which was built into IPv4. Consequently, the IETF began work on a new version, IP-ng, to handle the next generation.

IP Version 6 (IPv6)

IP version 6 (IPv6) represents that next generation of Network Layer services. It extends the addressing space from 32 to 128 bits, simplifies the packet header and allows for future expansion, and adds new capabilities to label flows of packets (same source to a single destination), to assign packets priority in support of QoS handling, and to provide authentication and security. Several of these features (CIDR, DiffServ, and IPsec) were designed so they could be added onto IPv4. In fact, such retrofitting solved IPv4 problems well enough in the late 1990s that people began to question whether a move to IPv6 was necessary. Upgrading the large numbers of routers involved with Internet traffic would be expensive, time-consuming, and require careful coordination. Transition strategies and mechanisms would likely be needed over a considerable period of time. Unfortunately, retrofits cannot do much about the size of IPv4 addresses. Sufficient growth in the numbers and types of devices people want to connect to or through the Internet (handheld devices, household appliances, automobile systems, etc.) and international pressure from countries without enough addresses will eventually make IPv4 addressing inadequate. The only question seems to be when.

Border Gateway Protocol (BGP)

Border gateway protocol (BGP) is the exterior routing protocol used by independent or autonomous systems (ASs) to exchange routing information throughout the Internet. Published in 1995 as RFC 1771, it defines procedures to establish neighbor relationships, and to test the reachability of neighbors and other networks. A router at the edge of an AS uses BGP to work with adjacent (i.e., directly connected) routers in other ASs. Only after two routers (one in each AS) have agreed to become neighbors can they exchange routing information or relay traffic for each other’s AS. Unlike IRPs, which use the services of IP to accomplish their communication, BGP uses the reliable transport services of TCP (transmission control protocol, running over IP). In this way, BGP can be simpler because it depends on the error control functions of TCP, and its messages are not limited in size by the constraints of an IP datagram.

BGP is purposefully designed to allow an AS to control what detail of internal information is made visible outside the AS (aggregating routes using CIDR, for example). Typically each BGP router screens potential routing updates or reachability advertisements against a configuration file that specifies what type of information it is allowed to send to each particular neighbor. This approach promotes policy-based routing, but at the expense of needing to calculate paths from incomplete detail about the network topology. Thus BGP will not always choose the optimal path across an internetwork to reach a particular destination. It does, however, allow a country or company constituting an AS to make appropriate political or business decisions about when and where to route its traffic.
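An added example (not from the original text, and not any router vendor's configuration syntax) of the screening step described above: candidate routes are checked against a per-neighbor export policy, permitted routes are aggregated with CIDR, and anything internal or too specific is withheld. The neighbor names, policy keys, prefix-length limit, and prefixes (documentation and private ranges) are all constructed assumptions.

```python
import ipaddress

# Constructed per-neighbor export policy (hypothetical names and keys).
EXPORT_POLICY = {
    "peer-as64501": {
        "permit": ["198.51.100.0/24", "203.0.113.0/24"],
        "aggregate": True,        # hide internal subnet structure
        "max_prefix_len": 24,     # never announce anything longer
    },
    "customer-as64502": {
        "permit": ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"],
        "aggregate": False,
        "max_prefix_len": 26,
    },
}

# Constructed routes known inside the AS (IPv4 only in this sample).
CANDIDATE_ROUTES = [
    "198.51.100.0/26",
    "198.51.100.64/26",
    "198.51.100.128/25",
    "203.0.113.0/25",
    "10.20.0.0/16",      # internal infrastructure, must not leave the AS
    "192.0.2.0/24",
]


def screen_advertisements(routes, policy):
    """Return (advertised prefixes, {suppressed prefix: reason})."""
    permit = [ipaddress.ip_network(p) for p in policy["permit"]]
    allowed, suppressed = [], {}

    for text in routes:
        net = ipaddress.ip_network(text)
        # A route may be sent only if it falls inside a permitted block.
        if any(net.version == p.version and net.subnet_of(p) for p in permit):
            allowed.append(net)
        else:
            suppressed[str(net)] = "not permitted for this neighbor"

    if policy["aggregate"]:
        # CIDR aggregation: adjacent subnets collapse into one shorter
        # prefix, so the neighbor never learns the internal split.
        allowed = list(ipaddress.collapse_addresses(allowed))

    advertised = []
    limit = policy["max_prefix_len"]
    for net in sorted(allowed):
        if net.prefixlen > limit:
            suppressed[str(net)] = f"more specific than /{limit}"
        else:
            advertised.append(str(net))
    return advertised, suppressed


if __name__ == "__main__":
    for neighbor, policy in EXPORT_POLICY.items():
        sent, held = screen_advertisements(CANDIDATE_ROUTES, policy)
        print(neighbor, "advertise:", sent)
        print(neighbor, "suppress:", held)
```

The peer receives a single /24 while the customer sees the individual subnets, which shows how the same routing table yields different external views per neighbor.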

Questions about the scalability of BGP have been raised in light of predictions for continued substantial growth in Internet traffic, and particularly as more organizations consider deploying delay-sensitive applications over the Internet (e.g., voice, video, conferencing). Intelligent route control, virtual routing, and new approaches to traffic engineering are among the options being explored to solve performance problems before they become serious impediments to effective use of the Internet.

Multiprotocol Label Switching (MPLS)

Multiprotocol label switching (MPLS) has been designed by the IETF to improve the performance of routed networks by layering a connection-oriented framework over an IP-based internetwork. MPLS requires edge routers to assign labels to traffic entering the network so that intermediate routers (called label-switching routers, LSRs) can make forwarding decisions quickly, choosing the appropriate output port according to the packet’s label and rewriting that label (which is intended to have local significance only) as necessary. MPLS represents a significant shortcut from the usual IP approach, where every relay node must look deeply into the packet header, search a routing table for the best match, and then select the best next hop toward the packet’s destination. All packets with the same MPLS label will follow the same route through the network. In fact, MPLS is designed so that it can explicitly and flexibly allocate network resources to meet particular objectives such as assigning the fastest routes for delay-sensitive packet flows, underutilized routes to balance traffic better, or multiple routes between the same end-points for flows with different requirements. This is called traffic engineering and serves as the foundation for both optimizing performance and supporting QoS guarantees.

Nothing about the MPLS design limits its use to the IP environment; it can work with suitably equipped ATM and frame relay routers as well. In fact, it can coexist with legacy routers not yet updated with MPLS capabilities, and it can be used in an internetwork that contains a mix of IP, ATM, and frame relay. Another powerful feature is the ability to stack labels on a last-in-first-out basis, with labels added or removed from the stack by each LSR as appropriate. This allows multiple label-switched paths to be aggregated into a tunnel over the common portion of their route for optimal switching and transport. MPLS is also a convenient mechanism to support virtual private networks, especially when multiple Internet service providers are involved along the path from one end to the other.
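An added example (not from the original text) that simulates the label operations described above: an edge router assigns a label, LSRs swap it hop by hop, and two label-switched paths are aggregated into one tunnel by pushing a shared outer label that the core swaps without ever reading the inner ones. Router names, label values, and table contents are constructed.

```python
MAX_HOPS = 16  # guard against forwarding loops in a misbuilt table

# Edge router: destination (forwarding class) -> (label to assign, first LSR).
INGRESS = {
    "siteA": (100, "R2"),
    "siteB": (200, "R2"),
}

# Per-LSR table: incoming top label -> (operations, next node).
# Labels have local significance only, so each hop may rewrite them.
LFIB = {
    # Tunnel head: rewrite the inner label, then push the shared tunnel label.
    "R2": {
        100: ([("swap", 110), ("push", 500)], "R3"),
        200: ([("swap", 210), ("push", 500)], "R3"),
    },
    # Core LSR: one entry serves both paths; it only sees the outer label.
    "R3": {500: ([("swap", 501)], "R4")},
    # Tunnel tail: pop the outer label, then switch on the exposed inner one.
    "R4": {
        501: ([("pop",)], "R4"),
        110: ([("pop",)], "egressA"),
        210: ([("pop",)], "egressB"),
    },
}


def apply_ops(stack, ops):
    """Apply push/swap/pop to the label stack (top of stack is stack[-1])."""
    for op in ops:
        if op[0] == "push":
            stack.append(op[1])
        elif op[0] == "swap":
            stack[-1] = op[1]
        elif op[0] == "pop":
            stack.pop()
        else:
            raise ValueError(f"unknown label operation {op[0]!r}")


def switch(stack, node):
    """Carry a labeled packet from `node` until its label stack is empty.

    Returns (egress node, trace), where trace lists the stack as it leaves
    each LSR. An unknown top label is dropped by raising LookupError.
    """
    stack = list(stack)
    trace = []
    hops = 0
    while stack:
        if hops >= MAX_HOPS:
            raise RuntimeError("hop limit exceeded")
        entry = LFIB.get(node, {}).get(stack[-1])
        if entry is None:
            raise LookupError(f"{node}: no entry for label {stack[-1]}, drop")
        ops, next_node = entry
        apply_ops(stack, ops)
        trace.append((node, tuple(stack)))
        node = next_node
        hops += 1
    return node, trace


def forward(destination):
    """Edge behavior: assign the initial label, then label-switch the packet."""
    label, first_lsr = INGRESS[destination]
    return switch([label], first_lsr)


if __name__ == "__main__":
    for dest in INGRESS:
        egress, trace = forward(dest)
        print(dest, "->", egress, trace)
```

Both paths leave R3 with outer label 501 and differ only in the inner label, which is what lets the core carry them as a single tunnel.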

Signaling and Interworking

Connection-oriented networks require specific mechanisms for establishing a circuit (physical or virtual) prior to traffic flow, and for terminating the circuit afterward. In the circuit-switched telephony environment, call setup and termination are part of a well-developed set of telecommunication system control functions referred to as signaling. MANs and WANs that were built for voice included signaling as an integral part of their designs, because resources were dedicated to each call as it was established and needed to be released after call completion.

The ITU-T began developing standards for digital telecommunication signaling in the mid-1960s; these have evolved into common channel interoffice signaling system 7 (CCIS7, known in the United States as Signaling System 7, or just SS7 for short), currently in use around the world. SS7 is an out-of-band mechanism, meaning that its messages do not travel across the same network resources as the conversations it was designed to establish and control. In fact, SS7 uses packet switching to deliver control messages and exchange data, not just for call setup, but also for special features such as looking up a toll-free number in a database to find out its real destination address, call tracing, and credit card approvals. Out-of-band delivery of the messages allows SS7 to be very fast in setting up calls, to avoid any congestion in the transport network, and also to provide signaling any time during a call.

The SS7 network has a number of elements that work together to accomplish its functions (Figure 13):

Figure 13: SS7 network elements.

Signal switching points (SSPs) are the network edge devices responsible for setting up, switching, and terminating calls on behalf of connected subscriber devices, and thus insert user traffic into, and remove it from, the service provider’s backbone network.

Signal transfer points (STPs) are packet switches responsible for getting SS7 messages routed through the control network.

Signal control points (SCPs) house the databases that support advanced call processing.

In packet-switched MANs and WANs, signaling had been associated primarily with establishing and tearing down SVCs that required no further control during the data transfer phase. With a rising interest in multimedia communications (e.g., video, and especially voice over IP) however, the ITU-T quickly recognized a need for additional capabilities. Their H.323 recommendations encompass an entire suite of protocols that cover all aspects of getting real-time audio and video signals into packet form, signaling for call control, and negotiation to ensure compatibility among sources, destinations, and the network. H.323 takes advantage of prior ITU work (such as ISDN’s Q.931 signaling protocol) and defines four major elements (Figure 14):

Terminals are the end-user devices that originate and receive multimedia traffic.

Gateways primarily handle protocol conversions for participating non-H.323 terminals, as would be found in the public switched telephone network (PSTN).

Gatekeepers are responsible for address translation, call control services, and bandwidth management.

Multipoint Control Units (MCUs) provide multiconferencing among three or more terminals and gateways.

Figure 14: H.323 network elements.

DeNoia WL040/Bidgolio-Vol I WL040-Sample.cls June 20, 2003 17:57 Char Count= 0

WIDE AREA AND METROPOLITAN AREA NETWORKS

The IETF took a simpler approach to signaling with the session initiation protocol (SIP), which was designed as a lightweight protocol simply to initiate sessions between users. SIP borrows a great deal from the hypertext transfer protocol (HTTP), using many of the same header fields, encoding rules, error codes, and authentication methods to exchange text messages. Like H.323, SIP assumes that the end-point devices (i.e., terminals) are intelligent, running software known as the user agent. The agent has two components: User Agent Client, which is responsible for initiating all outgoing calls, and the User Agent Server, which answers incoming calls. In the network itself, SIP provides support with three types of server:

Registration servers keep track of where all users are located.

Proxy servers receive requests and forward them along to the next appropriate hop in the network.

Redirect servers also receive requests and determine the next hop, but rather than forwarding the request, they return the next-hop server address to the requester.

An alternative approach to multimedia communication control developed by the IETF is called the media gateway control protocol (MGCP). It is quite different from H.323 and SIP because it assumes that the end-user devices are not very intelligent. Consequently MGCP takes a central server approach to communication coordination and control. Two elements are defined: the Media Gateway Controller (also known as the call agent), which provides the central intelligence and controls all of the Media Gateways, which perform a variety of interface functions such as with the PSTN, residential devices, and business private branch exchanges (PBXs). MGCP defines the communication that takes place between the call agent and the Gateways that execute its commands.

In practice H.323, SIP, and MGCP will likely coexist to support multimedia communication in the Internet environment because each has advantages for specific applications or coverage. MGCP is particularly useful to MAN/WAN service providers with large installed bases of unintelligent end-point devices, and its gateway approach allows for tailored interfaces to each different underlying technology. The simplicity of SIP is more attractive to enterprise networks designed primarily for data traffic with smaller requirements for supporting voice and video. Finally, H.323 is the most mature and most comprehensive. As usual in the telecommunication industry, vendor support and suitability to customer business models are likely to determine which, if any, one approach becomes dominant.

PROVIDERS AND SERVICES

Carriers and Service Providers

The public provision of telecommunication services to subscribers for a fee has a history of being government-regulated in most parts of the world (the term “common carrier,” for example, dates back to public transportation for people, first by stagecoach, then by trains, buses, etc.). Regulation was required because access to telecommunication services depended on cabling that was run from subscriber premises (residential or business) across public property (e.g., along roads) to a provider’s central office as a service point. Governments could also impose standards to ensure that services offered by providers in different locations would be compatible enough to interoperate. In some countries, infrastructure was built and services operated by the government itself (e.g., PTTs that provided postal, telegraph, and telephone services nationwide). In the United States, telephone industry regulation was divided between LECs whose cabling and local services go to individual premises, and IXCs who provided the interconnection (i.e., long-distance services) between LECs.

The Internet as a means of public data communication has grown up rather differently, driven largely by the U.S. regulatory environment, where telecommunication companies were prohibited from providing data services. Consequently, a new type of company called an Internet service provider (ISP) was born. Data would move from a subscriber’s premises, across cables belonging to an LEC, to ISP equipment in a point of presence, where it was transferred onto Internet resources. The subscriber thus had to be a customer of both the LEC and the ISP unless a private link could be installed directly to the ISP’s POP. The Internet connections from one ISP location to another are most often lines leased from an IXC. As telecommunication services have been increasingly deregulated world-wide, the distinctions among voice and data service providers have become blurred.


It is important to remember that “the Internet” is not really a single entity, but rather an interconnected set of autonomous networks whose owners have agreed to cooperate and use a common set of standards to ensure interoperability. Peering is a form of interconnection where ISPs agree to exchange traffic for their respective customers, based on a specific set of business terms. Peering points are where the networks actually connect to effect this exchange. The number and location of peering points and partners is decided by each ISP according to customer demand and its own business criteria. Subscribers may need to be aware of these agreements in order to understand fully the performance they can expect end to end across the Internet.

Just as the background and emphasis of traditional voice and traditional data service providers differ, so do their business models and their choices of technology. Some offer only transport for traffic, either between subscriber sites or to the Internet. Others offer access to applications or management services. Local telecommunication carriers tend to offer MAN services over an ATM and SONET infrastructure, while data providers would be more likely to offer IP services or simply Ethernet access and transport. Cable television and wireless service providers also offer access services according to the characteristics of their infrastructure technologies. The options available will likely continue to grow as technology progresses.

Class of Service, Quality of Service

As interest in carrying multimedia or multiple-service traffic (i.e., voice, data, video) over MANs and WANs has grown, managing the traffic to provide performance appropriate to each application has become more important. Quality of service techniques are expected to guarantee performance and delivery, usually in terms of bandwidth allocation, timeliness of delivery, and minimal variation in delay (e.g., ATM service categories). Class of service (CoS) techniques do not make such guarantees, but rather attempt to meet user requests on a best-effort basis. Typically CoS works by grouping together traffic with similar requirements (e.g., voice or streaming video) and using a priority queuing system so that switches and routers forward the traffic accordingly. Connectionless network services such as IP offer CoS traffic management, while connection-oriented services such as ATM provide QoS.

QoS cannot really be guaranteed unless it is available all the way from end to end of the connection. This creates a challenge for MAN and WAN environments where multiple technologies from one or more service providers may be involved in delivering user traffic, and especially when the traffic originates or terminates in a LAN of yet another different technology. Several groups are involved in developing standard techniques for CoS and QoS. The problem is making sure that appropriate translation mechanisms can carry user application requirements across network and SP boundaries:

IEEE 802.1p is a Layer-2 tagging mechanism to specify priority using 3 bits in the Layer-2 frame header.

IETF’s differentiated services (DiffServ) indicates how packets are to be forwarded using per-hop behavior (PHB) queuing, or discarded if there is not sufficient bandwidth to meet performance requirements.

ATM traffic management defines service categories and traffic classes.
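An added example (not from the original chapter) of checking that a priority marking survives a boundary: it reads the 3-bit 802.1p priority from an 802.1Q tag and the DiffServ code point from the IPv4 header of the same frame. The translation rule (priority equals the top three bits of the DSCP), the function names and the sample frames are assumptions constructed for illustration.

```python
import struct

def parse_markings(frame):
    """Return (pcp, vlan_id, dscp) from an Ethernet frame; None where a field is absent."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    pcp = vid = dscp = None
    offset = 14
    if ethertype == 0x8100:                      # 802.1Q tag present
        if len(frame) < 18:
            raise ValueError("truncated 802.1Q tag")
        tci, ethertype = struct.unpack_from("!HH", frame, 14)
        pcp, vid = tci >> 13, tci & 0x0FFF       # 3-bit priority, 12-bit VLAN ID
        offset = 18
    if ethertype == 0x0800:                      # IPv4
        if len(frame) < offset + 20:
            raise ValueError("truncated IPv4 header")
        dscp = frame[offset + 1] >> 2            # upper 6 bits of the DS field
    return pcp, vid, dscp

def check_translation(frame):
    """Assumed policy: the 802.1p priority should equal the top 3 bits of the DSCP."""
    pcp, _, dscp = parse_markings(frame)
    if dscp is None:
        return "no-ip"
    if pcp is None:
        return "untagged"        # Layer-2 priority absent: tag stripped or never applied
    return "consistent" if pcp == dscp >> 3 else "mismatch"

def build_frame(pcp, dscp, vid=100):
    """Constructed sample: tagged IPv4 frame with zeroed addresses and payload."""
    eth = bytes(12) + struct.pack("!HHH", 0x8100, (pcp << 13) | vid, 0x0800)
    ip = bytes([0x45, dscp << 2]) + bytes(18)
    return eth + ip

if __name__ == "__main__":
    # DSCP 46 with priority 5 agrees under the assumed rule; priority 0 does not.
    for pcp in (5, 0):
        frame = build_frame(pcp, 46)
        print(parse_markings(frame), check_translation(frame))
```

Run against frames captured on either side of a provider boundary, a "mismatch" or "untagged" result shows where the priority requested at Layer 2 was not carried across.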

Virtual Private Networks

A virtual private network (VPN) is a special service that amounts to establishing a closed user group capability over a shared or public network infrastructure. This means that access is restricted to authorized users only, privacy of data content is assured, traffic belonging within the VPN does not get out or become visible to unauthorized users, and outside traffic does not get in. VPNs are becoming a very attractive way for organizations to reduce the cost of private WANs while improving the security for traffic that travels over public networks. Where high-speed MAN and WAN services are available, long-distance performance can even be kept reasonably close to what the remote users would experience if they were directly connected to the LAN. VPNs may also be built to send traffic across the Internet, with one or more SPs providing the access links between the Internet and various geographically dispersed customer sites. Internet VPNs can be significantly less expensive than the private lines or networks they replace.

Management

The OSI model for network management encompasses five functional areas: configuration management, performance management, fault management, accounting management, and security management. A MAN or WAN service provider must cover these from the perspective of both operating the entire network effectively and balancing the needs and expectations of paying customers who could always choose to take their business elsewhere. Operation must be reliable, there must be sufficient capacity to meet traffic needs and performance expectations, and privacy must be maintained not only for the content of the traffic carried but also for data about the customers. At the same time, subscribers typically want the ability to manage the performance and flow of their own traffic through their allotment of SP resources. SP operation systems must be capable and sophisticated to meet all these requirements.

A primary mechanism used to establish and manage expectations between customers and providers is the service level agreement (SLA). SLAs are the defining documents (contracts) that spell out what services and levels of support will be provided to the customer at a specified price. Successful SLAs are built on a solid, shared understanding of business priorities and service impact, for both the service user and the service provider. Detail about roles and responsibilities, metrics and reporting, added cost for incremental services or enhancements, escalation procedures, and change management are just some of what should be covered in an SLA. Many customers also build in penalties in case the provider fails to deliver services at the level specified in the SLA. This may be

Date posted: 14/08/2014, 09:22
