
THE FRACTAL STRUCTURE OF DATA REFERENCE - P27



In addition, the access density A and storage intensity q are, in effect, inverses of each other:

$$A = \frac{1}{q}$$

This relationship applies, not just to individual applications, but also to aggregates of applications, since the average access density per unit of storage is given by:

$$\bar{A} \;=\; \frac{\sum_i A_i s_i}{\sum_i s_i} \;=\; \frac{1}{\bar{q}}$$

where s_i is the storage occupied by application i. We can therefore conclude from (9.11) that

$$\bar{A} \;\propto\; E^{\,1-\beta} \qquad (9.12)$$

where E is the cost per unit of storage.
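As a concrete illustration of the aggregate relationship, here is a minimal sketch in Python; the three applications and their I/O rates and storage sizes are hypothetical, chosen only to exercise the formula:

```python
# Three hypothetical applications: (I/O rate in I/O's per second, storage in GB).
apps = [(120.0, 40.0), (30.0, 25.0), (6.0, 12.0)]

total_io = sum(r for r, _ in apps)
total_gb = sum(s for _, s in apps)

A_bar = total_io / total_gb   # average access density, I/O's per second per GB
q_bar = total_gb / total_io   # average storage intensity, GB per (I/O per second)

# The inverse relationship holds for the aggregate as well.
assert abs(A_bar - 1 / q_bar) < 1e-12
print(f"A_bar = {A_bar:.3f} I/O/s/GB, q_bar = {q_bar:.3f} GB/(I/O/s)")
```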

In words, this says that as the cost of disk storage falls, the access density of applications should also be expected to fall, but at a slower rate. Note, however, that the deployable applications model does not predict how much of a time lag should be expected between these events.

Equation (9.12) provides a convenient method of model calibration. As Figure 9.2 illustrates, both storage cost and access density declined steadily throughout the 1980's and early 1990's. From 1980 to 1993, storage costs fell at a compound annual rate of approximately 15 percent (from about 39 to about 5 dollars per megabyte), while access densities fell at a compound annual rate of approximately 11 percent (from about 9 to about 2.1 I/O's per second per gigabyte). Due to the reasonably steady nature of the process during this extended period of time, we can therefore conclude, even without knowing the specific time lag between cause and effect, that

$$1 - \hat{\beta} \;=\; \frac{\log(2.1/9)}{\log(5/39)} \;\approx\; 0.71$$

or

$$\hat{\beta} = 0.3 \qquad (9.13)$$

Here we have added slightly to the exact calculation so as to express β̂ as a round number. The upward direction of round-off is the conservative direction; it corresponds, in the subsequent section, to adopting a slightly more demanding objective for disk performance than would have been the case if we had carried forward additional digits.
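The calibration arithmetic can be checked directly (a sketch using the approximate endpoint values quoted above; the historical figures are themselves only approximate):

```python
import math

# Approximate endpoints quoted in the text, 1980 -> 1993.
cost_1980, cost_1993 = 39.0, 5.0   # dollars per megabyte
dens_1980, dens_1993 = 9.0, 2.1    # I/O's per second per gigabyte
years = 13

# Compound annual rates of decline.
cost_rate = 1 - (cost_1993 / cost_1980) ** (1 / years)   # ~0.15
dens_rate = 1 - (dens_1993 / dens_1980) ** (1 / years)   # ~0.11

# 1 - beta from (9.12): ratio of the logarithmic changes.
one_minus_beta = math.log(dens_1993 / dens_1980) / math.log(cost_1993 / cost_1980)
beta_hat = 1 - one_minus_beta   # ~0.29, rounded up to 0.3 in (9.13)

print(f"cost decline ~{cost_rate:.0%}/yr, density decline ~{dens_rate:.0%}/yr")
print(f"1 - beta = {one_minus_beta:.3f}, beta_hat = {beta_hat:.3f}")
```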


Figure 9.2. Approximate trends in access density and storage cost

The results just obtained apply directly to the assessment of disk performance for new generations of disks. For concreteness, consider the case in which, compared to GOODDISK, GOODDISK' has twice the capacity and half the cost per unit of storage. Then by (9.11), we should expect that, as a result of deploying GOODDISK', the average storage intensity of applications will increase by a factor of 2^(1-0.3) = 2^0.7 ≈ 1.62. However, the amount of storage per disk increases by a factor of 2. Therefore, we must expect the net load per disk to increase by a factor of 2/1.62 = 1.23.

In order for performance to stay "in balance" with the projected application requirements, the servicing of I/O requests must therefore speed up by enough to allow a 23 percent throughput increase.
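Both factors follow from the halving of storage cost (a sketch; the power-law reading of (9.11), with exponent -(1 - β̂), is inferred from the factors quoted in the text):

```python
beta_hat = 0.3          # calibrated exponent from (9.13)
cost_ratio = 0.5        # GOODDISK' costs half as much per unit of storage
capacity_ratio = 2.0    # GOODDISK' holds twice as much

# (9.11): storage intensity rises as the cost per unit of storage falls.
intensity_factor = cost_ratio ** -(1 - beta_hat)   # 2**0.7 ~ 1.62

# Net load per disk: capacity growth outpaces the intensity growth.
load_factor = capacity_ratio / intensity_factor    # ~1.23

print(f"storage intensity x{intensity_factor:.2f}, load per disk x{load_factor:.2f}")
```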

Suppose, hypothetically, that we have adopted a fixed objective for the response time per I/O. Then an increase in throughput by some factor 1 + δ (for example, the factor of 1.23 needed in the present analysis) can be achieved by reducing the service time per I/O by some corresponding factor 1 - ε, where we would expect that ε < δ.

While theoretically appealing, however, the reasoning just outlined does not "ring true". It is too risky to maintain a fixed response time objective while allowing service times to vary, since queue times may then also vary. The larger the queue time grows relative to service time, the more erratic the performance perceived by users of the system is likely to become.


For this reason, we do not adopt a fixed response time objective for the purpose of the present analysis. Instead, we aim to ensure performance stability by controlling the ratio of queue time to service time. The net result of this requirement is to force response times and service times to fall together.

If the load across a collection of disks is uniform, then to prevent the ratio of queue time versus service time from increasing, utilization must remain constant. So for the case of uniform disk load, we must require that the reduction in disk service time match the expected increase in disk load: a service time reduction of 23 percent is needed.
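This requirement can be made explicit (a short derivation, assuming the usual definition of disk utilization as the product of the I/O rate y and the service time D; to first order, the required reduction in service time matches the increase in load):

$$yD = \text{const} \quad\Longrightarrow\quad (1+\delta)(1-\epsilon) = 1 \quad\Longrightarrow\quad \epsilon = \frac{\delta}{1+\delta} \;\approx\; \delta$$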

The reference [44] examines how to meet a similar objective in the case of a skewed environment. Given typical variations of disk load, the desired stability can be achieved by meeting the following condition:

where it should be recalled that D represents the service time per I/O and y is the average I/O rate per disk. To ensure that both GOODDISK and GOODDISK' meet an objective of this form equally well, given that the load of the latter disk increases by a factor of 1 + δ and its service time decreases by a factor of 1 - ε, we require that:

Since ε ≪ 1, we can simplify this condition using first-order Taylor expansions:

(9.14)

Fortuitously, the factor that appears in parentheses on the right side of (9.14) is rather insensitive to the actual I/O load per disk y, provided that it is in a "reasonable" range. For example, if y is in the range 9 ≤ y ≤ 49, then the factor on the right side of (9.14) is in the range 1.39 ≤ factor ≤ 1.6. For "back-of-the-envelope" purposes, then, we can state the result of (9.14) as follows: in a skewed environment, the average throughput which a disk can sustain increases by a percentage roughly half again as large as the percentage by which the disk's service time per I/O can be reduced.

To achieve the throughput improvement of 23 percent that is needed for GOODDISK', we therefore conclude that a reduction in service time in the range of 15 percent (for typical disk skews) to 23 percent (for no skew) will be required.
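A sketch of this back-of-the-envelope arithmetic (the 1.5 multiplier is the "half again" rule of thumb; the exact factor in (9.14) varies with y over roughly 1.39 to 1.6):

```python
delta = 0.23   # required throughput increase for GOODDISK'

# Skewed load: (9.14) amplifies a service-time cut by a factor of
# roughly 1.39..1.6 (about 1.5) for I/O rates in a "reasonable" range.
for factor in (1.39, 1.5, 1.6):
    eps_skewed = delta / factor
    print(f"factor {factor:.2f}: service-time cut of {eps_skewed:.1%} suffices")

# Uniform load: utilization must stay constant, so the service time must
# fall by the full amount of the load increase.
print(f"no skew: service-time cut of {delta:.0%} needed")
```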

Since the results just stated might leave the impression that a skewed environment has some performance advantage compared with a uniform distribution of I/O across the disks supporting an application, it is important to emphasize that the reverse is actually the case. Any disk can deliver its best level of throughput per actuator in a uniform environment. The performance degradation due to skew is less, however, for a larger-capacity disk than for a smaller one. The required improvement in service time needed in deploying GOODDISK', as just stated above, takes into account this effect.

The needed reduction in service time can be (and historically, has been) accomplished in many ways. These include faster media data rate, shorter seek time, shorter latency, schemes to access the disk via multiple paths, higher path bandwidth, disk buffering and/or storage control cache, and many others.

What if GOODDISK' does not deliver the needed improvement in service time? For example, what if the I/O capabilities of GOODDISK' and GOODDISK are exactly the same?

The case of no improvement in performance is a useful extreme to examine. It helps to illustrate the difference between the conclusions of the deployable applications model, as just presented above, and those which would be reached by adopting performance objectives based upon access density.

Suppose that in some specific environment where GOODDISK is in use, storage capacity and performance are in perfect balance, so that GOODDISK's I/O capability and its capacity both run out at the same time. Also, suppose that the I/O capabilities of GOODDISK' and GOODDISK are the same. If we reason from performance objectives based upon access density, we must then conclude that the extra capacity offered by GOODDISK' has no value in the given environment, because it cannot be used. Therefore, we must consider that GOODDISK' has the same effective storage cost as GOODDISK, despite the fact that GOODDISK' offers twice the capacity at the same price.

Given these circumstances, the deployable applications model draws a different conclusion. It projects that the lower cost per unit of storage will enable a range of new applications, causing average access density to decrease, and average storage intensity to increase, by a factor of 1.62. Therefore, we can use up to 62 percent of the added capacity offered by GOODDISK'. As a result, GOODDISK' reduces the effective cost of storage by a factor of 1.62.

Given that GOODDISK' offers twice the storage of GOODDISK for the same price, the conclusion that some reduction of effective costs must occur as the result of deploying GOODDISK' seems compelling. As just shown, the deployable applications model provides a way to quantify the resulting effective cost, while also accounting for the performance of the new disk. Pulling the reasoning about GOODDISK and GOODDISK' into a systematic procedure, the steps of the method are as follows (a sketch of the procedure in code appears after the list):

1. Assume that disk capacity and performance are initially "in balance". More specifically, assume that the level of capacity use at which the old disk's I/O capability is exhausted (the usable capacity) is the same as its physical capacity.


2. Estimate the change y'/y in the I/O capability due to the new disk. For an environment with no skew of load across disks, the I/O capability should be expected to increase by the same ratio as the decrease in device service time. For an environment with skew, the change y'/y can be estimated based upon (9.14); or, as a "rule of thumb", the I/O capability can be increased by half again the percentage by which the service time falls. The factor y'/y represents an increase in usable capacity that comes with the new disk.

3. Use (9.11) to estimate the change q̄'/q̄ in storage intensity due to applications that the new disk enables. This factor also represents an increase in the usable capacity.

4. For performance to remain in balance with capacity, all of the new disk's physical capacity must continue to be usable:

$$c' \;\le\; c \times \frac{y'}{y} \times \frac{\bar{q}'}{\bar{q}} \qquad (9.15)$$

or equivalently,

$$\frac{y'}{y} \times \frac{\bar{q}'}{\bar{q}} \;\ge\; \frac{c'}{c} \qquad (9.16)$$

(with equality in the case where the disk remains in perfect balance), where c and c' denote the physical capacities of the old and new disks.

5. If the new disk satisfies (9.16), its effective storage cost E'_net is the same as its nominal cost E'. If the new disk fails to satisfy (9.16), then its effective storage cost exceeds the nominal cost in proportion to the shortfall:

$$E'_{net} \;=\; E' \times \max\left[\,1,\; \frac{c'/c}{(y'/y) \times (\bar{q}'/\bar{q})}\,\right] \qquad (9.17)$$
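Pulling the five steps into code gives a small calculator (a minimal sketch: the function name is hypothetical, the power-law form of (9.11) is inferred from the factors quoted in the text, and the rule-of-thumb multiplier of 1.5 stands in for the exact factor of (9.14)):

```python
def effective_cost(nominal_cost, capacity_ratio, cost_ratio,
                   service_time_cut, skewed=True, beta_hat=0.3):
    """Steps 1-5: effective storage cost of a new disk, per (9.15)-(9.17).

    nominal_cost     -- E', the new disk's nominal cost per unit of storage
    capacity_ratio   -- c'/c, new physical capacity over old
    cost_ratio       -- E'/E, new cost per unit of storage over old
    service_time_cut -- fractional reduction in service time per I/O (epsilon)
    """
    # Step 2: change y'/y in I/O capability ("half again" rule under skew).
    rule_of_thumb = 1.5 if skewed else 1.0
    io_ratio = 1 + rule_of_thumb * service_time_cut

    # Step 3: change q'/q in storage intensity from (9.11).
    intensity_ratio = cost_ratio ** -(1 - beta_hat)

    # Steps 4-5: check (9.16); any shortfall inflates the cost per (9.17).
    usable_ratio = io_ratio * intensity_ratio
    return nominal_cost * max(1.0, capacity_ratio / usable_ratio)

E = 1.0   # old disk's cost per unit of storage, in arbitrary units

# GOODDISK' with no performance improvement: ~0.62, i.e. about E/1.62.
print(effective_cost(E / 2, 2.0, 0.5, 0.0))

# GOODDISK' with a 16 percent service-time cut (skewed case): 0.5,
# i.e. the nominal cost E', since the balance condition (9.16) is met.
print(effective_cost(E / 2, 2.0, 0.5, 0.16))
```

On the chapter's numbers, the first call reproduces the effective cost of E/1.62 for a GOODDISK' that offers no performance improvement, and the second shows that a service-time cut of roughly 15 to 16 percent restores balance, so the effective cost equals the nominal cost.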

Equations (9.16) and (9.17) can be illustrated by validating our previous conclusions about GOODDISK'. We previously reasoned that an increase of 23 percent in I/O capability was needed for GOODDISK' to remain in balance, given an expected increase in storage intensity by a factor of 1.62. This agrees with (9.16), since 1.62 × 1.23 ≈ 2 = c'/c. If GOODDISK' delivers at least the required 23 percent improvement, then its effective cost will be the same as its nominal cost.

We also reasoned that if GOODDISK' offers no improvement in performance, then its effective cost would be a factor of 1.62 lower than that of GOODDISK. This agrees with (9.17), since in this case the larger term within the maximization yields a right side equal to E' × 2/(1 × 1.62) = (E/2) × 2/1.62 = E/1.62.

4. CONCLUSION

As promised, we have applied the deployable applications model to demonstrate a cause-and-effect mechanism behind the historical linkage that has
