Wiley: HTTP Essentials: Protocols for Secure, Scaleable Web Sites (part 2)


page the user has identified a local file. By clicking on the Upload button, the user asks the browser to send a PUT request to the server.

2.2.4 File Deletion – DELETE

With GET and PUT operations, HTTP becomes a serviceable protocol for simple file transfers. The DELETE operation completes this function by giving clients a way to delete objects from servers. The message exchange contains no surprises. As figure 2.13 shows, the client sends a DELETE message along with the URI of the object the server should remove. The server responds with a status code and, optionally, more data for the client.
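The exchange just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `build_request` and `parse_status`, the path, and the host name are hypothetical, and nothing is actually sent over a network.

```python
def build_request(method: str, target: str, host: str) -> bytes:
    """Serialize a minimal HTTP/1.1 request that carries no body."""
    lines = [
        f"{method} {target} HTTP/1.1",  # request line: method, URI, version
        f"Host: {host}",                # names the site being addressed
        "",                             # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status(response: bytes) -> int:
    """Extract the numeric status code from the first line of a response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    _version, code, *_reason = status_line.split(" ")
    return int(code)


# Step 1: the client names the object it wants removed (hypothetical URI).
request = build_request("DELETE", "/old/report.html", "www.example.com")

# Step 2: the server answers with a status code and, optionally, data.
status = parse_status(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
```

The same `build_request` helper would serve for any body-less request; only the method name changes.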

Figure 2.12: The PUT request may be used to upload a file to a server. In this example the user wants to store the indicated file on the server.

2.3 Behind the Scenes

The basic HTTP operations generally occur as a direct result of end-user actions. Those four operations are not the only ones the protocol defines, however. Three additional operations, OPTIONS, HEAD, and TRACE, frequently take place behind the scenes. Clients use them to communicate with servers not so much to perform user actions but to prepare for or diagnose problems with the basic operations.

Although this section does not discuss it further, the HTTP specification also reserves the name for another operation, CONNECT. The standard does not define how CONNECT works, except to indicate that it is intended to support tunneling. (See section 2.4.3.) Future extensions to HTTP may define CONNECT in more detail.

Figure 2.13: The DELETE operation lets a client remove an object from a server. The URI identifies the object to delete. (Steps shown: 1, DELETE URI; 2, 200 OK + Data.)

2.3.1 Capabilities – OPTIONS

Clients can use an OPTIONS message to discover what capabilities a server supports. The exchange is the standard request and response, as figure 2.14 illustrates. If the client includes a URI, the server responds with the options relevant to that object. If the client sends an asterisk (*) as the URI, the server returns the general options that apply to all objects it maintains.

A client might use the OPTIONS message to determine the version of HTTP that the server supports or, in the case of a specific URI, which encoding methods the server can provide for the object. Such information would let the client adjust how it interacts with the server or how it actually requests a specific object.
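As a rough sketch of how a client might use OPTIONS, the Python below builds a request for the whole server (the asterisk form) and reads the list of supported methods from a reply. The text above does not say which headers carry the options; the example assumes the standard Allow header, and the function names and sample response are hypothetical.

```python
from typing import Dict, Set


def options_request(target: str, host: str) -> bytes:
    """Build an OPTIONS request; a target of "*" asks about the server itself."""
    return f"OPTIONS {target} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")


def parse_headers(response: bytes) -> Dict[str, str]:
    """Return the response headers as a dict keyed by lower-cased name."""
    head = response.split(b"\r\n\r\n", 1)[0].decode("ascii")
    headers: Dict[str, str] = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def allowed_methods(response: bytes) -> Set[str]:
    """Read the methods listed in the Allow header (assumed to be present)."""
    allow = parse_headers(response).get("allow", "")
    return {m.strip().upper() for m in allow.split(",") if m.strip()}


# A made-up server reply, used only to exercise the parser.
sample = b"HTTP/1.1 200 OK\r\nAllow: GET, HEAD, OPTIONS, TRACE\r\nContent-Length: 0\r\n\r\n"
methods = allowed_methods(sample)
```

A client could consult `methods` before attempting, say, a PUT, and fall back to another approach if the server does not list it.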

2.3.2 Status – HEAD

The HEAD operation is just like a GET operation, except that the server does not return the actual object requested. As figure 2.15 shows, the server returns a status code but no data. (HEAD is short for “header,” as the server returns only message headers in response.) Clients can use a HEAD message when they want to verify that an object exists, but they don’t need to actually retrieve the object. Programs that verify links in Web pages, for example, can use the HEAD message to ensure that a link refers to a valid object without consuming the network bandwidth and server resources that a full retrieval would require. Cache servers can also use the HEAD operation; it gives them a way to see if an object has changed without actually retrieving the full object.
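The relationship between GET and HEAD can be illustrated with a small Python sketch. It models the server side by trimming the body from a full response, and the link-checker side by looking only at the status code. The function names are hypothetical, and treating every 2xx or 3xx status as a valid link is an assumption made for the example.

```python
def head_response(get_response: bytes) -> bytes:
    """Return what a HEAD would yield: the status line and headers, no body."""
    head, _sep, _body = get_response.partition(b"\r\n\r\n")
    return head + b"\r\n\r\n"


def link_is_valid(response: bytes) -> bool:
    """Judge a link from the status code alone (assumption: 2xx/3xx is valid)."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    code = int(status_line.split(" ")[1])
    return 200 <= code < 400


# A made-up GET response and the HEAD response derived from it.
full = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
headers_only = head_response(full)
```

The link checker learns everything it needs from `headers_only`, which is shorter than `full` by exactly the size of the object.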

Figure 2.14: Clients can use an OPTIONS request to ask about a particular object or about the server itself. The server returns the options data in its response. (Steps shown: 1, OPTIONS URI; 2, 200 OK + Options.)

2.3.3 Path – TRACE

The TRACE message gives clients a way to check the network path to a server. When a server receives a TRACE, it responds simply by copying the TRACE message itself into the data for the response. Figure 2.16 shows the simplest case.

TRACE messages are more useful when multiple servers are involved in responding to a request. An intermediate server, for example, may accept requests from clients but turn around and forward those requests on to additional servers. (Proxies and cache servers, described in the next section, are examples of such intermediate servers.) When an intermediate server is involved, TRACE works as in figure 2.17. The intermediate server modifies the request by inserting a Via option in the message. This Via option is part of the message that arrives at the destination server, and it is copied into the data of the server’s response. When the client receives the response, it can see the Via option in the data and identify any intermediate servers in the path. Section 3.2.34 describes this process in more detail.
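To make the mechanics concrete, here is a minimal Python sketch of both roles: an intermediate server that adds itself to the Via option, and a final server that echoes whatever it received. The function names and the identifiers `proxy1` and `proxy2` are illustrative. The Via values follow the abbreviated form used in the figures; real Via entries also carry a protocol version.

```python
def add_via(request: bytes, proxy_id: str) -> bytes:
    """Relay a request, appending this hop to Via (creating it if absent)."""
    head, sep, body = request.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    for i, line in enumerate(lines[1:], start=1):  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "via":
            lines[i] = f"Via: {value.strip()}, {proxy_id}"
            break
    else:
        lines.append(f"Via: {proxy_id}")
    return "\r\n".join(lines).encode("ascii") + sep + body


def trace_response(request: bytes) -> bytes:
    """Final server: copy the received request into the response data."""
    headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: message/http\r\n"
        f"Content-Length: {len(request)}\r\n\r\n"
    )
    return headers.encode("ascii") + request


original = b"TRACE / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
after_hop1 = add_via(original, "proxy1")
after_hop2 = add_via(after_hop1, "proxy2")
reply = trace_response(after_hop2)
```

Because the reply's data is the request exactly as the final server saw it, the client can read the line `Via: proxy1, proxy2` and reconstruct the path.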

Figure 2.16: Servers respond to TRACE requests by echoing the request in their reply.

Figure 2.17: The TRACE request lets clients discover the path their messages follow through a network of intermediate servers. (Steps shown: 1, the client sends TRACE to the intermediate server; 2, the intermediate server sends TRACE + Via to the ultimate server; 3 and 4, 200 OK + Message returns along the same path.)

and a single server. The HTTP protocol defines more complex interactions, however, that frequently involve multiple servers cooperating on a client’s behalf. In this section, we’ll look at the different ways that multiple servers may be involved in a communication exchange.

2.4.1 Virtual Hosts

Of all the enhancements that HTTP version 1.1 adds to version 1.0, one of the smallest is direct support for virtual hosts. But although the protocol change is small, this feature is a major benefit for the World Wide Web. Virtual host support addresses a key element of the Web’s architecture that the designers of version 1.0 did not anticipate: Web hosting providers.

The popularity of the Internet has created a tremendous demand for Web sites, as organizations ranging from corporations to individuals (and even pets!) establish a presence on the Web. In many cases, though, it is impractical or inefficient for the organization itself to own and operate the servers and network infrastructure a Web site requires. To meet this demand, traditional Internet Service Providers, telecommunications carriers, and specialized service providers can host Web sites on behalf of other organizations. A significant majority of sites on the Internet are modest and require few resources from the systems on which they run. Because they don’t require a dedicated server, most Web hosting providers actually run many separate Web sites on a single server, as figure 2.18 illustrates.

The problem facing a Web server hosting multiple Web sites is simply stated: When a client requests a Web page, how does the server know which site the client is attempting to access? Consider a client request for the Web page corresponding to http://www.company1.com/news.html. The client first resolves the host part, www.company1.com, to an IP address. Then, as figure 2.19 shows, it establishes a TCP connection and sends the HTTP command GET /news.html to that address. Note, though, that the Web server does not participate in the DNS resolution, so it doesn’t know which host the client intends to contact. The Web server has no way of knowing whether “news.html” refers to company1.com or company2.com.

Prior to HTTP 1.1, Web hosting providers had only two ways to solve this problem. They could require the Web sites to use unique URIs for all their pages. So if company1.com had a page named “news.html” on its site, company2.com could not use that same name within its pages. In practice, Web hosting providers implemented this solution by requiring a site identifier in all path names. For example, instead of the straightforward URI “http://www.company1.com/news.html,” the company1.com Web site might use the more complicated “http://www.company1.com/company1.com/news.html.” As an alternative, Web hosting providers could assign separate IP addresses to each site on their servers. The servers then determine which site a client has requested by examining the IP address to which the client connects. The drawback is that servers end up with multiple IP addresses, and IP addresses are scarce resources.

Figure 2.18: Virtual hosting lets many Web addresses share the same Web server. This configuration is typical in ISPs that provide Web hosting for small businesses and individuals.

Figure 2.19: Virtual hosts can make it difficult for the Web server to determine which Web site the client is trying to access. In this case the physical Web server has no idea which Web address the client requested because it did not participate in the DNS exchange that mapped the host name to its IP address.

With version 1.1, HTTP addresses the problem of virtual hosts with a simple addition to the client’s request. That addition is the Host header, in which the client must place the host name of the site it is requesting. As figure 2.20 shows, the server can easily determine the site to which a request applies, and it can return the appropriate resource.
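A server's use of the Host header can be sketched as a simple lookup. In the Python below, the `SITES` table, its page contents, and the `handle_request` function are hypothetical stand-ins for a real server's configuration. Rejecting a request that lacks a Host header with status 400 follows the HTTP 1.1 requirement that clients supply one.

```python
from typing import Dict, Optional, Tuple

# Two sites sharing one server, each with its own "news.html" (made-up content).
SITES: Dict[str, Dict[str, str]] = {
    "www.company1.com": {"/news.html": "Company 1 news"},
    "www.company2.com": {"/news.html": "Company 2 news"},
}


def handle_request(request: str, sites: Dict[str, Dict[str, str]] = SITES) -> Tuple[int, str]:
    """Pick the site named in the Host header, then look up the path in it."""
    head = request.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    _method, path, _version = request_line.split(" ")

    host: Optional[str] = None
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Drop any ":port" suffix and ignore letter case.
            host = value.strip().split(":")[0].lower()

    if host is None:
        return 400, "Bad Request: missing Host header"
    site = sites.get(host)
    if site is None or path not in site:
        return 404, "Not Found"
    return 200, site[path]


reply1 = handle_request("GET /news.html HTTP/1.1\r\nHost: www.company1.com\r\n\r\n")
reply2 = handle_request("GET /news.html HTTP/1.1\r\nHost: www.company2.com\r\n\r\n")
```

The two requests are identical except for the Host header, yet each receives its own site's page, which is exactly the ambiguity that version 1.0 could not resolve.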

2.4.2 Redirection

While virtual host support allows a single server to support multiple Web sites easily, redirection offers a way for a single site to use multiple servers. Redirection lets a server redirect a client to another URI for an object. Figure 2.21 shows the process. First the client requests an object from the first Web server. Instead of returning the requested object, however, the server replies with a 301 Moved status code. The response also indicates a new URI for the object. The client recognizes this URI and, in step 3, reissues the request. This time the GET succeeds, and the second server returns the actual object.
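The client's side of this exchange amounts to a short loop. The Python sketch below replaces the network with a small table of canned responses; the URIs, the `RESPONSES` table, and the function names are all hypothetical. The limit on the number of hops is a design choice of the example, added so that two servers redirecting to each other cannot trap the client forever.

```python
from typing import Dict, Tuple

# Simulated servers: URI -> (status, new URI for a 301, or the object for a 200).
RESPONSES: Dict[str, Tuple[int, str]] = {
    "http://www.example.com/old.html": (301, "http://www2.example.com/new.html"),
    "http://www2.example.com/new.html": (200, "<html>the object</html>"),
}


def fetch(uri: str, responses: Dict[str, Tuple[int, str]]) -> Tuple[int, str]:
    """Stand-in for sending a GET; unknown URIs yield 404."""
    return responses.get(uri, (404, ""))


def get_following_redirects(
    uri: str,
    max_hops: int = 5,
    responses: Dict[str, Tuple[int, str]] = RESPONSES,
) -> Tuple[int, str, str]:
    """Issue a GET and reissue it at the new URI whenever a 301 comes back."""
    for _ in range(max_hops + 1):
        status, payload = fetch(uri, responses)
        if status == 301:
            uri = payload  # the response names the object's new URI
            continue
        return status, uri, payload
    raise RuntimeError("too many redirects")


result = get_following_redirects("http://www.example.com/old.html")
```

Here `result` holds the final status, the URI that actually supplied the object, and the object itself.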

Figure 2.20: The Host feature in HTTP version 1.1 lets clients explicitly identify the Web site they are accessing, so the virtual hosting Web server can return the right content. (Request shown: GET /news.html with Host: www.company1.com.)

Redirection is essential to the very dynamic Web environment. It provides a convenient way to support revisions within a Web site, relocation of content, and even the change of a corporate identity.

Note that the redirection does not have to specify a different host. Frequently, in fact, redirection is used to inform the client of a new path for the resource on the same host. Note also that there are other techniques for accomplishing the same effect. The server can, for example, answer the original request by providing a JavaScript object that automatically directs the client to a new location.

2.4.3 Proxies, Gateways, and Tunnels

Another way that HTTP servers can cooperate with each other is by acting as proxies, gateways, or tunnels. In each of these roles, the server that the client first contacts relays the request to a new server and then relays the second server’s response back to the client. Figure 2.22 shows a proxy server in operation.

In the figure, the client first sends its HTTP request directly to the proxy server. That server, however, cannot (or chooses not to) respond to the client immediately. Instead, it re-issues the request to a second server, which the figure labels the “origin server” (so called because it is the origin of the requested object). In the most basic case, the second GET has a URI identical to that of the first; it’s simply sent to a new server. That server treats the second GET as if it had come from a client and responds with the requested object. The proxy server then has the information the client originally requested, and it returns that object to the client in step 4.

Figure 2.21: A server redirects a client to tell the client that the object it requested is located elsewhere. When, in step 2, the client receives a 301 Moved response, it looks for a new URI in the response message and issues a new GET request for that URI.

Although figure 2.22 shows a single proxy server, HTTP allows multiple proxies to participate in satisfying a request. The proxies form a chain as in figure 2.23, handing off the request from one to the other until the requested object can be found. The proxies then pass that object back to the client in the reverse direction. As each server processes a request, it adds its own identity to the Via header in the request. By the time the request arrives at the final server, the Via header will have captured the path taken by the request through the server chain. The response follows the same process, with each intermediate system inserting its identity in the Via header. (Note that figure 2.23 shows only a partial Via header; for complete details, see section 3.2.50.)

Figure 2.22: A proxy server positions itself in between clients and servers. It forwards requests on behalf of clients and relays responses from the servers. (Steps shown: 1, GET from the Web browser to the proxy server; 2, GET from the proxy server onward; 3 and 4, 200 OK returned along the same path.)

Figure 2.23: Proxy servers create or update the Via option as they relay requests or responses. This option may make it easier to diagnose network problems. (Headers shown: Via: proxy1 after the first proxy; Via: proxy1, proxy2 after the second.)

Proxy servers perform several important functions for HTTP communications. The most common is in support of caching, which section 2.4.4 discusses in more detail. Other uses include enforcing policy for an organization. A corporation can direct all its internal clients to use a proxy server to access the public Internet, allowing the proxy server to filter that Internet access appropriately. Frequently this type of operation is part of a firewall. Proxy servers have also been used to provide anonymity to Web browsers, preventing servers from discovering identifying information about actual clients.

If, as is common, a proxy serves multiple origin servers, then the client must usually include the absolute URI in its requests. Without the full URI, the proxy may not be able to tell which server the client wishes to contact. Because this behavior is unusual for many clients, and because clients must know to send their requests to proxy servers rather than the ultimate destination, they must often be explicitly configured to use a proxy server. Chapter 5 describes some of the mechanisms that system administrators can use to automatically configure proxy servers for their users.
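The reason a proxy needs the absolute URI shows up clearly in code. The Python sketch below uses the standard library's `urllib.parse.urlsplit` to work out which origin server a request line refers to. The function name `origin_for` and the default port of 80 for plain HTTP are assumptions of the example rather than anything the text prescribes.

```python
from typing import Tuple
from urllib.parse import urlsplit


def origin_for(request_line: str) -> Tuple[str, int, str]:
    """Return (host, port, path) of the origin server named in a request line."""
    _method, target, _version = request_line.split(" ")
    parts = urlsplit(target)
    if not parts.scheme or not parts.hostname:
        # A bare path such as "/news.html" does not say which server is meant.
        raise ValueError("proxy needs an absolute URI to choose an origin server")
    port = parts.port or 80  # assumption: plain HTTP on the default port
    path = parts.path or "/"
    return parts.hostname, port, path


host, port, path = origin_for("GET http://www.company1.com/news.html HTTP/1.1")
```

With only `GET /news.html HTTP/1.1` to go on, the function raises `ValueError`, which mirrors the proxy's dilemma described above.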

Gateways and tunnels operate very much like proxy servers; however, there are subtle differences. Gateways act as an endpoint to a server chain, but they still rely on other servers to provide all or part of the requested object. In many cases, gateways use a protocol other than HTTP to access the object. In figure 2.24, for example, the gateway uses the Structured Query Language to retrieve information from a database management system.

While gateways act as a definite endpoint to a server chain, tunnels are exactly the opposite. As figure 2.25 indicates, they are relatively transparent to the original client; the client may not even be aware that a tunnel exists. Tunnels do provide some service, however. In the example of figure 2.25, the tunnel establishes a secure connection to the actual server, adding security to the communication between client and server. Note that although HTTP 1.1 defines the operation of tunnels in general terms, as of this writing few practical implementations are available.

2.4.4 Cache Servers

Cache servers are a specialized type of proxy server whose main function is to improve Web performance. They do that by remembering the objects requested by clients and, if the same object is requested again (either by the same client or a different client), returning the object that they’ve remembered instead of re-requesting it from the origin server. Figures 2.26 and 2.27 show the process.

Figure 2.24: A gateway accepts HTTP requests and translates them to a different format such as SQL. The gateway also ensures that any reply is a proper HTTP response.

The first figure shows standard proxy operation. The key to a cache server’s operation is that it remembers the requested object, generally by saving a copy on its local disk or in its memory.

Figure 2.27 shows the payoff for the cache server. In this figure, a new client requests the same object as in figure 2.26. This time, however, the cache server does not need to contact the origin server. It simply returns the saved object from its local disk or memory.

Cache servers improve Web performance at both the client and the origin server. For the client, they shorten the distance to the object the client needs. As figures 2.26 and 2.27 illustrate, a cache server may be located on the same local area network as its clients. Local networks typically have higher bandwidth than wide area Internet connections, and the transmission delay across a local network is generally much less.

Cache servers also improve performance by reducing the load on the origin server. When a cache server returns an object to a client, that’s one less request to bother the origin server. Fewer requests mean that the origin server requires fewer processing and memory resources, as well as less bandwidth for its connection to the Internet.

Figure 2.26: Cache servers are proxy servers that relay requests and responses. In addition, they keep a local copy of any responses they receive. (Steps shown: 1, GET to the cache server; 2, GET to the origin server across the Internet; 4, 200 OK returned to the client.)

One of the more complicated issues facing a cache server is knowing how long the objects it has stored in its cache remain valid. Given the dynamic nature of the Web, an object that an origin server returns at one moment may be superseded by a new object in the next moment. When that happens, the cache server must not return the object from its cache, but, rather, it must query the origin server to retrieve the new object.

As we’ll see in section 3.2, HTTP 1.1 includes several headers just to support cache servers. Those headers tell cache servers whether an object can be cached and, if so, how long it can be safely stored. Section 5.2 examines cache server operation in more detail, focusing on those aspects outside the scope of the HTTP specification itself.
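The essential behavior of a cache server, including the question of how long a stored object stays valid, fits in a short Python sketch. Everything here is a simplification made for illustration: the `CacheServer` class is hypothetical, the origin server is a plain function, and a single number of seconds stands in for the caching headers that section 3.2 describes.

```python
import time
from typing import Callable, Dict, Tuple


class CacheServer:
    """Toy cache: keeps each object until its allowed lifetime runs out."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_from_origin(uri) returns (object, lifetime in seconds);
        # a lifetime of 0 means "do not cache" (an assumption of this sketch).
        self._fetch = fetch_from_origin
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}  # uri -> (expiry, object)
        self.origin_requests = 0

    def get(self, uri: str) -> str:
        now = self._clock()
        entry = self._store.get(uri)
        if entry is not None and now < entry[0]:
            return entry[1]  # still valid: answer from the local copy

        body, lifetime = self._fetch(uri)  # missing or expired: ask the origin
        self.origin_requests += 1
        if lifetime > 0:
            self._store[uri] = (now + lifetime, body)
        else:
            self._store.pop(uri, None)
        return body


# Usage with a fake clock so the example does not have to wait.
fake_now = [0.0]
cache = CacheServer(lambda uri: ("object at " + uri, 60.0), clock=lambda: fake_now[0])
first = cache.get("/news.html")    # goes to the origin server
second = cache.get("/news.html")   # served from the cache
fake_now[0] = 61.0
third = cache.get("/news.html")    # copy expired, so the origin is asked again
```

Passing the clock in as a parameter is a design choice that keeps the example testable; a real cache server would simply read the system time.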

Figure 2.27: When a new client asks for the same object, the cache server returns its local copy instead of sending another request all the way to the origin server. This speeds up the response, and it saves bandwidth for the Internet connection.

2.4.5 Counting and Limiting Page Views

Whenever an intermediate cache server processes client requests, the origin server can lose some control over its interactions with clients. In many ways that is a benefit, as cache servers reduce the load on origin servers and can significantly improve their performance. There are some disadvantages, though. For some Web sites, having a cache deliver pages to clients is a significant problem because it means the origin server does not know how often users view its content. When the site derives revenue from advertising, being able to count the number of site users may be critical to maximizing that revenue. As a consequence, many Web servers deliberately designate their content as non-cacheable, even when caching is otherwise both possible and desirable. The developers of HTTP have recognized this problem and introduced a technique that allows caching and yet still gives origin servers a way to count and, if desired, limit page views by the cache server clients. This technique is an extension to the base HTTP specification; it is documented in RFC 2227.

The process begins when a proxy inserts a Meter header into a request message as it forwards the message on. (See section 3.2.35 for details of this header.) Steps 2 and 3 of figure 2.28 show this process. By inserting the header here, the proxy indicates its willingness to report on and limit the number of times it returns the resulting response from its cache.

Figure 2.28: [...] insert the Meter header in requests passing through them. Servers ask for metering on a particular object by including the Meter header in their replies.

The origin server responds to this invitation by including a Meter header in its response. This header tells the proxies how to handle the object with respect to reporting and usage limitations.

Later, when another client requests the same object, the proxies that have a cached copy will need to validate that copy with the origin server. When they do, as figure 2.29 shows, they update the Meter header in their requests. This meter information is a report of the number of times the cached entry has been provided to clients.
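The bookkeeping that a metering proxy performs for each cached object can be sketched as a small counter. The Python below is a loose model only: the `MeteredEntry` class and its method names are hypothetical, and it makes no attempt to reproduce the actual syntax of the Meter header, which section 3.2.35 covers.

```python
from typing import Optional


class MeteredEntry:
    """A cached object plus the counts a metering proxy would track for it."""

    def __init__(self, body: str, max_uses: Optional[int] = None) -> None:
        self.body = body
        self.max_uses = max_uses   # usage limit requested by the origin server
        self.unreported_uses = 0   # deliveries since the last report

    def serve(self) -> Optional[str]:
        """Return the object, or None when the limit forces revalidation."""
        if self.max_uses is not None and self.unreported_uses >= self.max_uses:
            return None
        self.unreported_uses += 1
        return self.body

    def report(self) -> int:
        """Hand over the count for the next validation request, then reset it."""
        count, self.unreported_uses = self.unreported_uses, 0
        return count


entry = MeteredEntry("<html>ad-supported page</html>", max_uses=2)
served = [entry.serve(), entry.serve(), entry.serve()]  # third call hits the limit
reported = entry.report()
```

After the report, the counter starts again from zero, so the origin server can total the reports it receives to learn how often the page was viewed.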

2.5 Cookies and State Maintenance

The HTTP protocol normally operates as if each client request is independent of all others. The server responds to any request strictly on the merits of that request, without reference to other requests from the client (or, for that matter, any other client). This type of operation is known as stateless because the server does not have to keep track of the state of its clients.

Because maintaining state requires server resources (memory, processing power, etc.), stateless operation is usually desirable. In some applications, however, the server needs to keep some state information about each of its clients. Users that successfully log in to a Web site, for example, shouldn’t have to log in again every time they view a different page on that site. A server can avoid this inconvenience by tracking the state of the client. The first time the client requests a page from the site, the server requires the user to log in. As the user continues to browse the site and make additional HTTP requests, however, the server remembers the previously successful login and refrains from requesting additional logins.
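One common way for a server to remember a login is to hand the client an opaque token and keep a table that maps tokens to users. The Python sketch below illustrates that idea only; the `SessionStore` class, its method names, and the password check are hypothetical, and how the token travels between client and server is left out.

```python
import secrets
from typing import Callable, Dict, Optional


class SessionStore:
    """Server-side memory of which clients have already logged in."""

    def __init__(self) -> None:
        self._users: Dict[str, str] = {}  # token -> user name

    def log_in(
        self,
        username: str,
        password: str,
        check_password: Callable[[str, str], bool],
    ) -> Optional[str]:
        """On a successful login, return a fresh token for later requests."""
        if not check_password(username, password):
            return None
        token = secrets.token_hex(16)  # 32 hex characters, hard to guess
        self._users[token] = username
        return token

    def user_for(self, token: Optional[str]) -> Optional[str]:
        """Identify the user behind a later request, or None if unknown."""
        if token is None:
            return None
        return self._users.get(token)


# A stand-in password check used only for this example.
def demo_check(user: str, password: str) -> bool:
    return password == "correct horse"


store = SessionStore()
token = store.log_in("alice", "correct horse", demo_check)
who = store.user_for(token)  # later requests present the token instead of logging in
```

The table is exactly the per-client state that stateless operation avoids, which is why the server pays for this convenience in memory.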

2.5.1 Cookies

State maintenance requires one critical capability: Servers must be able to associate one HTTP request with another. The server must be able to tell, for example, that the user requesting a new page really is the same user that has already logged in, not a different user that has not been authorized. The mechanism that HTTP defines for state maintenance is

Figure 2.30: Servers can return state management cookies in their responses. Clients, if they wish, include those cookies in subsequent requests to the same server. (Step shown: 3, HTTP Request + Cookie.)
