HandBooks Professional Java-C-Scrip-SQL part 140 pptx


Free at Last: An Efficient Method of Handling Variable-Length Records

Introduction

In this chapter we will develop an algorithm that uses the quantum file access method to handle a file containing a large number of records, each of which can vary dynamically in size. In order to appreciate the power of this access method, we will start by considering the much simpler problem of access to fixed-length records.

Algorithm Discussed

The Quantum File Access Method

A Harmless Fixation

Let us suppose that we want to access a number of fixed-length customer records by their record numbers. Given the record number (counting from zero), we can locate that customer's record by multiplying the record number by the length of a record, which gives us the offset into the file where that record will be found. We can then read or write that record as needed.
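The offset calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 32-byte record length, the zero-byte padding, and the function names are invented for the example, and an in-memory buffer stands in for the customer file.

```python
import io

RECORD_LENGTH = 32  # assumed fixed record size in bytes (not from the text)

def record_offset(record_number):
    """Offset of a record = record number (from zero) times record length."""
    return record_number * RECORD_LENGTH

def write_record(f, record_number, data):
    if len(data) > RECORD_LENGTH:
        raise ValueError("data does not fit in a fixed-length record")
    f.seek(record_offset(record_number))
    # Pad short data with zero bytes so every record occupies the same space.
    f.write(data.ljust(RECORD_LENGTH, b"\x00"))

def read_record(f, record_number):
    f.seek(record_offset(record_number))
    return f.read(RECORD_LENGTH).rstrip(b"\x00")

# An in-memory buffer stands in for the customer file.
customer_file = io.BytesIO()
write_record(customer_file, 0, b"Steve Heller")
write_record(customer_file, 1, b"P.O.Box 0335")
write_record(customer_file, 2, b"Baldwin")
second = read_record(customer_file, 1)
```

Note how "Baldwin" still consumes the full 32 bytes; that padding is the cost of the simple address calculation.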

Of course, a real application needs to reuse records of customers who have become inactive, to prevent the customer file from growing indefinitely. To handle this problem, we could set a limit on the file size and, when it is reached, start reusing records that haven't been referenced for a long time, making sure to correct or delete any records in other files that refer to the deleted customer.

This fixed-length record system is not very difficult to implement, but it has significant drawbacks; address fields, for example, tend to vary greatly in length, with some records needing 25 or 30 characters for a city name or street address and others needing only 10. If we allocate enough storage for the extreme case, the records become much longer than if we had to handle only the average case.

However, allocating enough for only the average case will leave those customers whose names or addresses won't fit into the allotted space quite unhappy, as I know from personal experience as a software developer! The obvious solution is to allow the fields (and therefore the records) to vary in length as necessary.

Checking Out of the Hotel Procrustes


Unfortunately, variable-length records are much more difficult to deal with than fixed-length ones. The most obvious reason, as discussed above, is that determining where fixed-length records are stored requires only a simple calculation; this is not true of variable-length records. However, we could remedy this fairly easily by maintaining a record index consisting of an array of structures containing the starting location and length of each record, as depicted in Figure recindex.

A sample record index array and data for variable-length records (Figure recindex)

Record Index Array

  Index   Starting Address   Length
    0             0            12
    1            12            12
    2            24             7
    3            31             2
    4            33             5

Record Data:    Steve HellerP.O.Box 0335BaldwinNY11510
Record Offset:  0000000000111111111122222222223333333333
                0123456789012345678901234567890123456789
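To make the record index concrete, here is a minimal Python sketch that uses the values from Figure recindex. The names IndexEntry, record_index, and get_record are assumptions made for the illustration, not names from the text.

```python
from typing import NamedTuple

class IndexEntry(NamedTuple):
    start: int    # starting address of the record within the data
    length: int   # length of the record in bytes

# Data and index values copied from Figure recindex.
record_data = b"Steve HellerP.O.Box 0335BaldwinNY11510"
record_index = [
    IndexEntry(0, 12),
    IndexEntry(12, 12),
    IndexEntry(24, 7),
    IndexEntry(31, 2),
    IndexEntry(33, 5),
]

def get_record(record_number):
    """Look up start and length in the index, then slice the data."""
    entry = record_index[record_number]
    return record_data[entry.start:entry.start + entry.length]

all_records = [get_record(n) for n in range(len(record_index))]
```

Each lookup costs one extra index access, but records no longer need to be padded to a common length.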

We encounter a more serious difficulty when we want to delete a record and reuse the space it occupied. In some situations we can sidestep the problem by adding the new version of the record to the end of the file and changing the record pointer to refer to the new location of the record; however, in the case of an actively updated file, such an approach would cause the file to grow rapidly.

But how can we reuse the space vacated by a deleted record of some arbitrary size? The chances of a new record being exactly the same size as any specific deleted one are relatively small, especially if the records average several hundred bytes each, as is not at all unusual in customer data files. A possible solution is to keep a separate free list for each record size and reuse a record of the correct size.

However, there is a very serious problem with this approach: a new record may need 257 bytes, for example, and there may be no available record of exactly that size. Even though half of the records in the file might be deleted, none of them could be reused, and we would still be forced to extend the file. The attempt to solve this difficulty by using a record that is somewhat larger than necessary leads to many unusably small areas being left in the file (a situation known as fragmentation).
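The following Python sketch illustrates both failure modes under invented assumptions: the free-list layout, the slot sizes (250, 260, and 300 bytes), and the function names are hypothetical. The 257-byte request is the example from the text.

```python
from collections import defaultdict

# One free list per record size: size -> file offsets of deleted records.
free_lists = defaultdict(list)

def release(offset, size):
    """Record that a deleted record of the given size is available."""
    free_lists[size].append(offset)

def allocate_exact(size):
    """Reuse a deleted record only if one of exactly this size exists."""
    slots = free_lists.get(size)
    return slots.pop() if slots else None

def allocate_first_larger(size):
    """Reuse the smallest deleted record that is big enough.
    Returns (offset, leftover bytes) or None."""
    for slot_size in sorted(free_lists):
        if slot_size >= size and free_lists[slot_size]:
            return free_lists[slot_size].pop(), slot_size - size
    return None

# Three deleted records, none of them exactly 257 bytes long.
release(0, 250)
release(250, 260)
release(510, 300)

exact = allocate_exact(257)          # no slot of exactly 257 bytes
fit = allocate_first_larger(257)     # the 260-byte slot, wasting 3 bytes
```

The exact-size policy forces the file to grow even though 810 bytes are free, while the larger-slot policy leaves a 3-byte sliver that is too small to hold any realistic record.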

However, there is a relatively unknown way to make variable-length records more tractable: the quantum file access method. The key is to combine them into groups of fixed length, which can then be allocated and deallocated in an efficient manner.

The Quantum File Access Method

Before the following discussion will make much sense to you, I will need to explain in general terms what we're trying to accomplish: building a virtual memory system that can accommodate records of varying lengths in an efficient manner. This means that even though, at any given time, we are storing most of our data on the disk rather than maintaining it all in memory, we will provide access to all the data as though it were in memory. To do this, we have to arrange that any actively used data is actually in memory when it is needed. In the present application, our data is divided into fixed-size blocks called quanta (plural of quantum), so the task of our virtual memory system is to ensure that the correct blocks are in memory as needed by the user.

The quanta in the file are generally divided into a number of addressable units called items. When adding a record to the file, we search a free space list, looking for a quantum with enough free space for the new record. When we find one, we add the record to that quantum and store the record's location in the item reference array, or IRA, which replaces the record index in Figure recindex; this array consists of entries of the form "quantum number, item number". The item number refers to an entry in the item index stored at the beginning of the quantum; the items are stored in the quantum in order of their item index entries, which allows the size of an item to be calculated rather than having to be stored.
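The add-record step can be sketched as follows. This is a simplified illustration, not the actual implementation: the 16-byte quantum size, the three-quantum file, and the names free_space_list, quantum_items, and add_record are assumptions, and the space taken by the item index itself is ignored.

```python
QUANTUM_SIZE = 16   # assumed, tiny so the example is easy to follow

free_space_list = [QUANTUM_SIZE] * 3          # free bytes in each quantum
quantum_items = [[] for _ in free_space_list] # items held by each quantum
ira = []                                      # (quantum number, item number)

def add_record(record):
    """Place the record in the first quantum with enough free space."""
    for quantum_number, free in enumerate(free_space_list):
        if free >= len(record):
            quantum_items[quantum_number].append(record)
            free_space_list[quantum_number] = free - len(record)
            # Item numbers start at 1 within each quantum.
            item_number = len(quantum_items[quantum_number])
            ira.append((quantum_number, item_number))
            return len(ira) - 1   # element number of the new record
    raise RuntimeError("no quantum has enough free space")

add_record(b"Steve Heller")   # 12 bytes: fits in quantum 0
add_record(b"P.O.Box 0335")   # 12 bytes: quantum 0 has only 4 left
add_record(b"NY")             # 2 bytes: fits in the rest of quantum 0
```

The third record lands back in quantum 0, which shows how small records fill the space that larger ones could not use.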

For example, if we were to create an array of variable-length strings, some of its item references might look like those illustrated in Figure itemref1.


Sample IRA, item index, and data, before deletions (Figure itemref1)

Item Reference Array (IRA)

  Index   Quantum Number   Item Number
    0           3               1
    1           3               2
    2           3               3
    3           3               4
    4           3               5

Item Index for Quantum 3

  Item #   Offset   Type        Index
     1       12     VARSTRING     0
     2       24     VARSTRING     1
     3       31     VARSTRING     2
     4       33     VARSTRING     3
     5       38     VARSTRING     4
     6       38     UNUSED        0
     7       38     UNUSED        0
     8       38     UNUSED        0
     9       38     UNUSED        0
    10       38     UNUSED        0

Quantum Data:     11510NYBaldwinP.O.Box 0335Steve Heller
Quantum Offset:  333333333222222222211111111110000000000
                 876543210987654321098765432109876543210
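The lookup path through the IRA and the item index can be sketched in Python with the values from Figure itemref1. The sketch assumes, as the figure suggests, that offsets are counted backward from the end of the quantum; the names item_offsets, item_size, and get_element are illustrative only.

```python
# Values copied from Figure itemref1.
quantum_data = b"11510NYBaldwinP.O.Box 0335Steve Heller"
item_offsets = [12, 24, 31, 33, 38]   # item index for quantum 3 (items 1 to 5)
ira = [(3, 1), (3, 2), (3, 3), (3, 4), (3, 5)]   # (quantum number, item number)

def item_size(item_number):
    """The size is calculated, not stored: it is the difference between
    this item's offset and the previous item's offset."""
    previous = item_offsets[item_number - 2] if item_number > 1 else 0
    return item_offsets[item_number - 1] - previous

def get_element(element):
    quantum_number, item_number = ira[element]
    assert quantum_number == 3   # this sketch holds only quantum 3
    # Offsets count backward from the end of the quantum.
    begin = len(quantum_data) - item_offsets[item_number - 1]
    return quantum_data[begin:begin + item_size(item_number)]

sizes = [item_size(n) for n in range(1, 6)]
```

Because the items are stored in item index order, adjacent offsets are enough to recover every length.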


When we delete an item from a quantum, we have to update the free space list entry for that quantum to reflect the amount freed, so the space can be reused the next time an item is to be added to the file. We also have to slide the remaining items in the quantum together so that the free space is in one contiguous block, rather than in slivers scattered throughout the quantum. With a record index like the one in Figure recindex, we would have to change the record index entries for all the records that were moved. Since the records that were moved might be anywhere in the record index, this could impose unacceptable overhead on deletions; to avoid this problem, we will leave the item index entry for a deleted item empty rather than sliding the other entries down in the quantum, so that the IRA is unaffected by changes in the position of records within a quantum. If we delete element 1 from the array, the resulting quantum looks like Figure itemref2.

Sample IRA, item index, and data (Figure itemref2)

Item Reference Array (IRA)

  Index   Quantum Number   Item Number
    0           3               1
    1         NONE              0
    2           3               3
    3           3               4
    4           3               5

Item Index for Quantum 3

  Item #   Offset   Type        Index
     1       12     VARSTRING     0
     2       12     UNUSED        0
     3       19     VARSTRING     2
     4       21     VARSTRING     3
     5       26     VARSTRING     4
     6       38     UNUSED        0
     7       38     UNUSED        0
     8       38     UNUSED        0
     9       38     UNUSED        0
    10       38     UNUSED        0
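The deletion procedure described above can be sketched as follows, starting from the state in Figure itemref1. The Quantum class, its fields, and delete_element are hypothetical names, and the sketch models only the five items in use rather than the whole fixed-size quantum.

```python
class Quantum:
    """Hypothetical in-memory model of one quantum."""
    def __init__(self, data, offsets):
        self.data = bytearray(data)
        # Item index entries: [offset from end of quantum, type].
        self.index = [[offset, "VARSTRING"] for offset in offsets]
        self.free = 0   # free space list entry for this quantum

quantum = Quantum(b"11510NYBaldwinP.O.Box 0335Steve Heller",
                  [12, 24, 31, 33, 38])
quanta = {3: quantum}
ira = [(3, 1), (3, 2), (3, 3), (3, 4), (3, 5)]

def delete_element(element):
    quantum_number, item_number = ira[element]
    q = quanta[quantum_number]
    i = item_number - 1
    end = q.index[i][0]
    start = q.index[i - 1][0] if i > 0 else 0
    size = end - start
    total = len(q.data)
    # Cut out the item's bytes so the surviving items stay contiguous.
    q.data = q.data[:total - end] + q.data[total - start:]
    # Leave the deleted item's index entry in place, marked unused, so
    # the item numbers held in the IRA remain valid.
    q.index[i] = [start, "UNUSED"]
    for later in q.index[i + 1:]:
        later[0] -= size
    q.free += size        # update the free space list entry
    ira[element] = None   # shown as NONE in the figure

delete_element(1)
offsets_after = [entry[0] for entry in quantum.index]
```

After the call, the offsets of items 1 through 5 agree with Figure itemref2, and the IRA entries for elements 2 through 4 are untouched even though their data moved.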
