

Footnotes

1 Obviously, this is a simplification: normally, we would want to be able to find a customer's record by his name or other salient characteristic. However, that part of the problem can be handled by, for example, a hash-coded lookup from the name into a record number, as we will see in the next chapter. Here we are concerned with what happens after we know the record number.
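
As a rough illustration of the name-to-record-number lookup mentioned in footnote 1, here is a minimal sketch. It assumes the standard std::unordered_map as a stand-in for the hash-coded lookup; the class name NameIndex and its member functions are hypothetical and are not taken from the book's code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical name-to-record-number index. std::unordered_map stands in
// for the hash-coded lookup; it is not the scheme the next chapter develops.
class NameIndex {
public:
    void Add(const std::string& name, std::uint32_t recordNumber) {
        m_index[name] = recordNumber;
    }

    // Returns true and sets recordNumber if the name is known.
    bool Find(const std::string& name, std::uint32_t& recordNumber) const {
        auto it = m_index.find(name);
        if (it == m_index.end()) return false;
        recordNumber = it->second;
        return true;
    }

private:
    std::unordered_map<std::string, std::uint32_t> m_index;
};
```

Once Find has produced a record number, everything else in the discussion applies unchanged, which is why the lookup can be treated as a separate problem.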

2 This figure could just as easily be considered a layout for a single record with variable-length fields; however, the explanation is valid either way.

3 The problem of changing the length of an existing record can be handled by deleting the old version of the record and adding a new version having a different length.

4 I am indebted for this extremely valuable algorithm to its inventor, Henry Beitz, who generously shared it with me in the mid-1970's.

5 In general, I use the terms "quantum" and "block" interchangeably; in the few cases where a distinction is needed, I will note it explicitly.

6 In the current implementation, the default block size is 16K. However, it is easy to change that size in order to be able to handle larger individual items or to increase storage efficiency.

7 Some blocks used to store internal tables such as the free space list are not divided into items, but consist of an array of fixed-size elements, sometimes preceded by a header structure describing the array.

8 For simplicity, in our sample application each user record is stored in one item; however, since any one item must fit within a quantum, applications dealing with records that might exceed the size of a quantum should store each potentially lengthy field as a separate item.

9 Four of these bytes are used to hold the index in the IRA to which this item corresponds; this information would be very helpful in reconstructing as much as possible of the file if it should become corrupted. Another two bytes are used to keep track of the type of the item. These entries are for error trapping and file reconstruction if the file should somehow become corrupted.

10 Of course, this assumes that we have set the parameters of the quantum file to values that allow us to expand the file to a size large enough to hold that much data. The header file "blocki.h" contains the constants BlockSize and MaxFileQuantumCount, which together determine the maximum size of a quantum file. The beginning of that file also contains a number of other constants and structures related to this issue; you should be able to modify the capacity and space efficiency of a quantum file fairly easily after examining that header file.
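
To make the relationship in footnote 10 concrete, the sketch below multiplies the two constants it names. BlockSize uses the 16K default from footnote 6; the value given for MaxFileQuantumCount is an assumption chosen for illustration, not the value in blocki.h, and the function MaxQuantumFileSize is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// BlockSize and MaxFileQuantumCount are the constant names the footnote
// mentions; the values here are illustrative, not copied from blocki.h.
constexpr std::uint64_t BlockSize = 16 * 1024;        // 16K default block size
constexpr std::uint64_t MaxFileQuantumCount = 10000;  // assumed value

// Upper bound on the raw size of the file, ignoring the blocks that the
// file's own internal tables would occupy.
constexpr std::uint64_t MaxQuantumFileSize() {
    return BlockSize * MaxFileQuantumCount;
}
```

With these assumed values the bound is 163,840,000 bytes; raising either constant raises the bound proportionally.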

11 The limit is 256 so that an object number can fit into one byte; this reduces the size of the free space list, as we will see later.

12 By the way, there is a possible optimization that could be employed here: sorting the buffers to be rewritten to the disk in order of their quantum numbers (i.e., their positions in the file). This could improve performance in systems where the hard disk controller doesn't already provide this service; however, most (if not all) modern disk systems take care of this for us, so sorting by quantum number would not provide any benefit.
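
The optimization described in footnote 12 could be sketched as follows. The DirtyBuffer structure and the SortForWriting function are hypothetical names introduced only for this example; the actual program does not perform this sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of a buffer that must be rewritten to the disk.
struct DirtyBuffer {
    unsigned quantumNumber;   // position of the block in the file
    std::size_t bufferIndex;  // which in-memory buffer holds the block
};

// Orders the pending writes by file position so that they can be issued
// in ascending order of quantum number.
void SortForWriting(std::vector<DirtyBuffer>& buffers) {
    std::sort(buffers.begin(), buffers.end(),
              [](const DirtyBuffer& a, const DirtyBuffer& b) {
                  return a.quantumNumber < b.quantumNumber;
              });
}
```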

13 The second edition also included this C implementation along with an earlier, less capable, version of the C++ implementation we're examining here.

14 Another restriction in C++ operator overloading is that it's impossible to make up your own operators. According to Bjarne Stroustrup, this facility has been carefully considered by the standards committee and has failed of adoption due to difficulties with operator precedence and binding strength. Apparently, it was the exponentiation operator that was the deciding factor; it's the first operator that users from the numerical community usually want to define, but its mathematical properties don't match the precedence and binding rules of any of the "normal" C++ operators.

15 I'm not claiming this is a good use for overloading; it's only for tutorial purposes.

16 Warning: do not compile and execute the program in Figure overload2. Although it will compile, it reads from random locations, which may cause a core dump on some systems.

17 By the way, this isn't just a theoretical problem: it happened to me during the development of this program.

18 This is an example of the "handle/body" class paradigm, described in Advanced C++: Programming Styles and Idioms, by James O. Coplien (Addison-Wesley Publishing Company, Reading, Massachusetts, 1992). Warning: as its title indicates, this is not an easy book; however, it does reward careful study by those who already have a solid grasp of C++ fundamentals. For a kinder, gentler introduction to several advanced C++ idioms of wide applicability, see my Who's Afraid of More C++? (AP Professional, San Diego, California, 1998).

19 In order to solve this problem in a more general way, the ANSI standards committee for C++ has approved the addition of "namespaces", which allow the programmer to specify the library from which one or more functions are to be taken in order to prevent name conflicts.

20 Of course, there are other ways to accomplish the goal of protecting the class user from concern about internals of a given class, as we've discussed briefly in the sections titled "Data Hiding" and "Function Hiding".

21 If we had any functions that could change the contents of a shared object, they would also have to be modified to prevent undesirable interactions between "separate" handle objects that share data. However, we don't have any such functions in this case.

22 Actually, the reference count should never be less than 0, but I'm engaging in some defensive programming here.
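
The defensive check in footnote 22 might look something like the sketch below, which shows a reference-counted handle/body pair. The Body and Handle classes and all of their members are hypothetical; they illustrate the idiom and are not the classes from the program under discussion.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical body class: holds the shared data plus its reference count.
struct Body {
    std::string data;
    int refCount;
    explicit Body(std::string d) : data(std::move(d)), refCount(1) {}
};

// Hypothetical handle class: copies of a handle share one Body.
class Handle {
public:
    explicit Handle(const std::string& d) : m_body(new Body(d)) {}
    Handle(const Handle& other) : m_body(other.m_body) { ++m_body->refCount; }
    Handle& operator=(const Handle& other) {
        if (m_body != other.m_body) {
            Release();
            m_body = other.m_body;
            ++m_body->refCount;
        }
        return *this;
    }
    ~Handle() { Release(); }

    int UseCount() const { return m_body->refCount; }
    const std::string& Data() const { return m_body->data; }

private:
    void Release() {
        --m_body->refCount;
        // The count should reach exactly 0 and never go below it;
        // testing "<= 0" rather than "== 0" is the defensive form.
        if (m_body->refCount <= 0) delete m_body;
    }

    Body* m_body;
};
```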

23 To reduce the length of the function names in this class, I'm going to omit the MainObjectArrayPtr qualifier at the beginning of those names.

24 We'll see exactly how this block access works when we cover the MainObjectBlock class, but for now it's sufficient to note that the main object array is potentially divided into blocks which are accessed via the standard virtual memory system.

25 Actually, the name of the "lowest free object" variable should be something like m_StartLookingHere, but I doubt it will cause you too much confusion after you see how it is used.

26 If a preprocessor variable called DEBUG is defined, then the action of this macro will be to terminate the program if the condition is not met; otherwise, it will do nothing. You can find the implementation of this macro and its underlying function in qfassert.h and qfassert.cpp.
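
A macro with the behavior described in footnote 26 can be sketched as shown below. The names QF_CHECK and CheckFailed are hypothetical, chosen for this example; the real macro and its underlying function are the ones in qfassert.h and qfassert.cpp, which may differ in detail.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the function behind the macro: reports the
// failed condition and terminates the program.
inline void CheckFailed(const char* condition, const char* file, int line) {
    std::fprintf(stderr, "Check failed: %s (%s:%d)\n", condition, file, line);
    std::abort();
}

#ifdef DEBUG
// With DEBUG defined, a false condition terminates the program.
#define QF_CHECK(condition) \
    ((condition) ? (void)0 : CheckFailed(#condition, __FILE__, __LINE__))
#else
// Without DEBUG, the macro expands to nothing that has any effect.
#define QF_CHECK(condition) ((void)0)
#endif
```

Note that in this sketch the condition is not evaluated at all when DEBUG is undefined, so it should not contain side effects that the program depends on.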

27 We often will step through an array assigning values to each element in turn, for example when importing data from an ASCII file; since we are not modifying previously stored values, the quantum we used last is the most likely to have room to add another item. In such a case, the most recently written-to quantum is half full on the average; this makes it a good place to look for some free space. In addition, it is very likely to be already in memory, so we won't have to do any disk accesses to get at it.
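
The heuristic in footnote 27 amounts to trying the most recently written-to quantum before searching anywhere else. The following is a minimal sketch of that idea under simplifying assumptions: free space is tracked as a plain byte count per quantum, and the function FindQuantumWithRoom and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// freeBytes[i] is the free space in the i-th quantum owned by the object.
// Returns the index of a quantum with room for "needed" bytes, or
// freeBytes.size() if none has room.
std::size_t FindQuantumWithRoom(const std::vector<std::size_t>& freeBytes,
                                std::size_t lastAddedTo,
                                std::size_t needed) {
    // Try the quantum we used last: it is likely to have room and is
    // likely to be in memory already.
    if (lastAddedTo < freeBytes.size() && freeBytes[lastAddedTo] >= needed)
        return lastAddedTo;

    // Otherwise fall back to a search of the other quanta.
    for (std::size_t i = 0; i < freeBytes.size(); ++i)
        if (freeBytes[i] >= needed) return i;

    return freeBytes.size();
}
```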

28 By the way, this function was originally named GetFreeSpace, and its return type was called FreeSpaceEntry, but I had to change the name of the return type to avoid a conflict with a name that Microsoft had used once upon a time in their MFC classes and still had some claim on; I changed the name of the function to match. This is a good illustration of the need to avoid polluting the global name space; using the namespace construct in the new C++ standard would be a good solution to such a problem.

29 The alert reader will notice that the type of the NewBigPointerBlock variable is BigPointerBlockPtr, not BigPointerBlock. However, the functions that we call through that variable via the operator-> are from the BigPointerBlock class, because that is the type of the pointer that operator-> returns, as explained in the section on overloading operator->.
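
The point of footnote 29 can be seen in a stripped-down sketch of a handle class whose operator-> returns a pointer to a separate body class. The Block and BlockPtr classes and the GetElementCount function are hypothetical stand-ins, not the BigPointerBlock and BigPointerBlockPtr classes themselves.

```cpp
#include <cassert>

// Hypothetical body class: the functions we actually want to call.
class Block {
public:
    explicit Block(int elementCount) : m_elementCount(elementCount) {}
    int GetElementCount() const { return m_elementCount; }

private:
    int m_elementCount;
};

// Hypothetical handle class: has no GetElementCount of its own.
class BlockPtr {
public:
    explicit BlockPtr(Block* body) : m_body(body) {}

    // Because this returns a Block*, "handle->f()" calls Block::f.
    Block* operator->() const { return m_body; }

private:
    Block* m_body;  // not owned in this sketch
};
```

Given a BlockPtr named handle, the expression handle->GetElementCount() compiles even though BlockPtr declares no such function, because the call is resolved against the type that operator-> returns.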

30 Another possible use is to implement variant arrays, in which the structure of the array is variable. In that case, we might use the type to determine whether we have the item we want or some kind of intermediate structure requiring further processing to extract the actual data.

31 The reason we mark this quantum as being full is twofold: first, there won't be very much (if any) space left in this quantum after we have added the little pointer array; and second, we don't want to store any actual data for our new main object in the same quantum as we are using for a section of the little pointer array, to make the reconstruction of a partially corrupted file easier.

32 There's one exception to this rule, for reasons described above: if the user specifies 0 elements, I change it to 1.

33 It would probably have been better to create a class to contain this function as well as a few others that are global in this implementation.

34 These are FreeSpaceBlockPtr, MainObjectBlockPtr, BigPointerBlockPtr, LittlePointerBlockPtr, and LeafBlockPtr.

35 Of course, while stepping through the program at human speeds in Turbo Debugger, the timestamps were nicely distributed; this is a demonstration of Heisenberg's Uncertainty Principle as it applies to debugging.

36 In a 32-bit implementation, it's entirely possible that the counter will never turn over, as its maximum value is more than four billion. However, if you let the program run long enough, eventually that will occur; at full tilt, such an event might take a few months.

37 At least, it's the smallest interface for a class that actually does anything. As we'll see, this program contains one class that doesn't contribute either data or functions. I'll explain why that is when we get to it.

38 For a detailed example of how this works, see the section entitled "Polite Pointing".

39 I'm not going to prefix the name of each embedded function or class with the name of its enclosing class, to make the explanations shorter; I'll just include it in the title of the main section discussing the class.

40 I'll cover this and the other global functions in the next chapter.

41 It is important to remember that the last item in a quantum is actually at the lowest address of any item in that quantum, because items are stored in the quantum starting from the end and working back toward the beginning of the quantum.
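
The storage order described in footnote 41 can be illustrated with a simplified sketch. The Quantum class below is hypothetical: it keeps item offsets in a separate vector and ignores the space that the real item index takes at the beginning of the quantum, so it shows only the direction in which items are laid down.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified quantum: items are packed from the end of the
// buffer toward the beginning.
class Quantum {
public:
    explicit Quantum(std::size_t size) : m_bytes(size), m_freeEnd(size) {}

    // Stores the item just below the previously stored one. Returns its
    // 1-based item number, or 0 if there is no room.
    std::size_t AddItem(const std::string& item) {
        if (item.size() > m_freeEnd) return 0;
        m_freeEnd -= item.size();
        std::copy(item.begin(), item.end(), m_bytes.begin() + m_freeEnd);
        m_offsets.push_back(m_freeEnd);
        return m_offsets.size();
    }

    // Offset of the given item from the start of the quantum.
    std::size_t OffsetOf(std::size_t itemNumber) const {
        return m_offsets[itemNumber - 1];
    }

private:
    std::vector<char> m_bytes;
    std::size_t m_freeEnd;  // everything at or above this offset is in use
    std::vector<std::size_t> m_offsets;
};
```

In a 64-byte quantum, a 5-byte first item lands at offset 59 and a 6-byte second item at offset 53, so the item added last has the lowest address.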


42 We can't delete unused item index entries that aren't at the end of the index because that would change the item numbers of the following items, rendering them inaccessible.
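
The restriction in footnote 42 might be expressed in code along these lines. The ItemIndexEntry structure and the DeleteItem function are hypothetical simplifications; the real item index entries carry more information than a single flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified item index entry.
struct ItemIndexEntry {
    bool inUse;
};

// Marks the given 1-based item as unused, then trims unused entries from
// the end of the index only. An unused entry in the middle must stay, so
// that the item numbers of the entries after it do not change.
void DeleteItem(std::vector<ItemIndexEntry>& index, std::size_t itemNumber) {
    index[itemNumber - 1].inUse = false;
    while (!index.empty() && !index.back().inUse) index.pop_back();
}
```

Deleting item 2 of 3 leaves a three-entry index with a hole in it, so item 3 keeps its number; deleting item 3 afterward lets both trailing unused entries be reclaimed.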

43 Of course, in the real program, we will find the IRA by looking it up in the main object index, but that detail is irrelevant to the current discussion of deleting an element.

44 The statement that calculates the value we should return for an empty quantum may be a bit puzzling. The reason it works as it does is that the routine that looks for an empty block compares the available free space code to a specific value, namely AvailableQuantum. If we did not return a value that corresponded to that free space code, a quantum that was ever used for anything would never be considered empty again.

45 It is important to note that the number of elements that I'm referring to here is not the number of elements in the big pointer array (i.e., the number of quanta that the little pointer arrays occupy) but the actual number of elements in the array itself (i.e., the number of data items that the user has stored in the array).

46 Note that we have to remember to set the modified flag for the big pointer array quantum whenever we change a value in the big pointer array header or the big pointer array itself, so that these changes will be reflected on the disk rather than being lost.

47 Although we will be unable to go into the details of the implementation of the AccessVector type, it is defined in the header file vector.h; if you are familiar with C++ templates, I recommend that you read that header file to understand how this type actually works.

48 The "last quantum added to" variable is stored in the big pointer quantum. When first writing the code to update that variable, I had forgotten to update the "modified" flag for the big pointer quantum when the variable actually changed. As a result, every time the buffer used for the big pointer quantum was reused for a new block, that buffer was overwritten without being written out to the disk. When the big pointer quantum was reloaded, the "last quantum added to" variable was reset to an earlier, obsolete value, with the result that a new quantum was started unnecessarily. This error caused the file to grow very rapidly.
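
The failure described in footnote 48 is easy to reproduce in miniature. The sketch below models the disk as a std::map and a buffer as a small structure; BlockBuffer, Disk, Load, SetLastQuantumAddedTo, and Evict are all hypothetical names invented for this example rather than parts of the actual program.

```cpp
#include <cassert>
#include <map>

// Hypothetical in-memory buffer for one quantum.
struct BlockBuffer {
    int quantumNumber;
    int lastQuantumAddedTo;  // the value stored in the big pointer quantum
    bool modified;
};

// Hypothetical disk: quantum number -> stored value.
using Disk = std::map<int, int>;

BlockBuffer Load(const Disk& disk, int quantumNumber) {
    return BlockBuffer{quantumNumber, disk.at(quantumNumber), false};
}

void SetLastQuantumAddedTo(BlockBuffer& buffer, int value) {
    buffer.lastQuantumAddedTo = value;
    buffer.modified = true;  // omitting this line reproduces the bug
}

// Called when the buffer is about to be reused for another block: the
// contents are written back only if the buffer is marked as modified.
void Evict(const BlockBuffer& buffer, Disk& disk) {
    if (buffer.modified) disk[buffer.quantumNumber] = buffer.lastQuantumAddedTo;
}
```

If the value is changed without setting the flag, Evict silently discards the change, and the next Load returns the obsolete value, which is the behavior that made the file grow.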

49 For a similar reason, if we were adding an item to a large object with many little pointer arrays, each of which contained only a few distinct quantum number references, we wouldn't be gathering information about very much of the total storage taken up by the object; we might very well start a new quantum when there was plenty of space in another quantum owned by this object.
