decl-specifier-seq declarator = assignment-expression
decl-specifier-seq abstract-declarator opt
decl-specifier-seq abstract-declarator opt = assignment-expression
function-definition:
decl-specifier-seq opt declarator ctor-initializer opt function-body
decl-specifier-seq opt declarator function-try-block
A volatile specifier is a hint to a compiler that an object may change its value in ways not specified by the language so that aggressive optimizations must be avoided. For example, a real-time clock might be declared:
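The declaration itself does not survive in this copy. As a minimal sketch of the idea, assume a hypothetical variable `clock_ticks` that something outside the program's normal flow (hardware, a signal handler) could update; the name and the helper function are illustrative only:

```cpp
#include <cassert>

// Hypothetical stand-in for a real-time clock location (an assumption, not
// taken from the text). volatile tells the compiler that every read must
// actually be performed.
volatile long clock_ticks = 0;

// Returns the difference between two successive reads of the clock.
// Without volatile, a compiler could assume both reads yield the same
// value and replace the result by 0.
long ticks_between_reads()
{
    long first = clock_ticks;
    long second = clock_ticks;
    return second - first;
}
```

In this sketch nothing ever changes `clock_ticks`, so the two reads agree; the point is only that the compiler is not allowed to rely on that.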
class-head:
class-key identifier opt base-clause opt
class-key nested-name-specifier identifier base-clause opt
class-key nested-name-specifier template template-id base-clause opt
member-declaration member-specification opt
access-specifier : member-specification opt
declarator pure-specifier opt
declarator constant-initializer opt
identifier opt : constant-expression
Constant expressions are defined in §C.5.
A.8.1 Derived Classes
See Chapter 12 and Chapter 15.
base-clause:
: base-specifier-list
base-specifier
base-specifier-list , base-specifier
base-specifier:
:: opt nested-name-specifier opt class-name
virtual access-specifier opt :: opt nested-name-specifier opt class-name
access-specifier virtual opt :: opt nested-name-specifier opt class-name
access-specifier:
private
protected
public
A.8.2 Special Member Functions
See §11.4 (conversion operators), §10.4.6 (class member initialization), and §12.2.2 (base initialization).
operator: one of
+ - * / % ^ & | ~ ! = < > += -= *= /= %= ^= &= |= << >> >>= <<= ==
class identifier opt
class identifier opt = type-id
typename identifier opt
typename identifier opt = type-id
template <template-parameter-list > class identifier opt
template <template-parameter-list > class identifier opt = template-name
f<a>b>(0); // syntax error
f<(a>b)>(0); // ok
A similar lexical ambiguity can occur when terminating >s get too close. For example:
list<vector<int>> lv1; // syntax error: unexpected >> (right shift)
list< vector<int> > lv2; // correct: list of vectors
Note the space between the two >s; >> is the right-shift operator. That can be a real nuisance.
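A small compilable sketch of the spaced form follows; the function name `make_lists` is made up for illustration. Compilers that implement C++11 or later also accept `>>` in this position, but the spaced form works under both sets of rules:

```cpp
#include <cassert>
#include <list>
#include <vector>

// The space between the two closing angle brackets is required by the rules
// described above; it remains valid under later standards as well.
std::list< std::vector<int> > make_lists()
{
    std::list< std::vector<int> > lv;
    lv.push_back(std::vector<int>(3, 7));   // one vector holding 7, 7, 7
    return lv;
}
```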
A.10 Exception Handling
See §8.3 and Chapter 14.
A.11 Preprocessing Directives
The preprocessor is a relatively unsophisticated macro processor that works primarily on lexical tokens rather than individual characters. In addition to the ability to define and use macros (§7.8), the preprocessor provides mechanisms for including text files and standard headers (§9.2.1) and conditional compilation based on macros (§9.3.3). For example:
# if constant-expression new-line group opt
# ifdef identifier new-line group opt
# ifndef identifier new-line group opt
# include pp-tokens new-line
# define identifier replacement-list new-line
# define identifier lparen identifier-list opt ) replacement-list new-line
# undef identifier new-line
# line pp-tokens new-line
# error pp-tokens opt new-line
# pragma pp-tokens opt new-line
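The directives above can be seen working together in a short sketch. The macro names `BUFFER_SIZE` and `SQUARE` and the variables are hypothetical, chosen only to exercise #define, #ifdef, #undef, and #ifndef:

```cpp
#include <cassert>

#define BUFFER_SIZE 64              // object-like macro
#define SQUARE(x) ((x) * (x))       // function-like macro; note the parentheses

#ifdef BUFFER_SIZE                  // conditional compilation on a macro
const int buffer_size = BUFFER_SIZE;
#else
const int buffer_size = 16;
#endif

#undef BUFFER_SIZE                  // BUFFER_SIZE is undefined from here on

#ifndef BUFFER_SIZE
const bool buffer_macro_removed = true;
#else
const bool buffer_macro_removed = false;
#endif

// Expands to ((a + b) * (a + b)); the inner parentheses keep the sum intact.
int squared_sum(int a, int b) { return SQUARE(a + b); }
```

Because macro replacement works on tokens before the compiler proper sees the text, the parentheses in `SQUARE` are what make `SQUARE(a + b)` behave like a function call would.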
You go ahead and follow your customs,
and I’ll follow mine.
– C. Napier
C/C++ compatibility – silent differences between C and C++ – C code that is not C++ – deprecated features – C++ code that is not C – coping with older C++ implementations – headers – the standard library – namespaces – allocation errors – templates – for-statement initializers – advice – exercises.
B.1 Introduction
This appendix discusses the incompatibilities between C and C++ and between Standard C++ (as defined by ISO/IEC 14882) and earlier versions of C++. The purpose is to document differences that can cause problems for the programmer and point to ways of dealing with such problems. Most compatibility problems surface when people try to upgrade a C program to a C++ program, try to port a C++ program from one pre-standard version of C++ to another, or try to compile C++ using modern features with an older compiler. The aim here is not to drown you in the details of every compatibility problem that ever surfaced in an implementation, but rather to list the most frequently occurring problems and present their standard solutions.
When you look at compatibility issues, a key question to consider is the range of implementations under which a program needs to work. For learning C++, it makes sense to use the most complete and helpful implementation. For delivering a product, a more conservative strategy might be in order to maximize the number of systems on which the product can run. In the past, this has been a reason (and sometimes just an excuse) to avoid C++ features deemed novel. However, implementations are converging, so the need for portability across platforms is less cause for extreme caution than it was a couple of years ago.
B.2 C/C++ Compatibility
With minor exceptions, C++ is a superset of C (meaning C89, defined by ISO/IEC 9899:1990). Most differences stem from C++’s greater emphasis on type checking. Well-written C programs tend to be C++ programs as well. A compiler can diagnose every difference between C++ and C.
C++ provides the // comments; C does not (although many C implementations provide them as an extension). This difference can be used to construct programs that behave differently in the two languages. For example:
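One well-known way to build such a program is sketched here; the function name `f` and the layout are illustrative. A C++ compiler treats everything after `//` as a comment, so the function returns `a`. A C89 compiler without `//` comments instead reads `a / b`, because only the `/* ... */` part is a comment:

```cpp
#include <cassert>

int f(int a, int b)
{
    return a //* pretty unlikely */ b
        ;   // the semicolon sits on its own line so that both readings parse
}
```

Compiled as C++, `f(6, 2)` yields 6; read under the C89 rules, the same text would compute 6 / 2.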
C99 (meaning C as defined by ISO/IEC 9899:1999(E)) also provides // comments.
A structure name declared in an inner scope can hide the name of an object, function, enumerator, or type in an outer scope. For example:
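A minimal sketch of this kind of hiding, with illustrative names: inside the function, a C++ compiler resolves `x` to the local structure, whereas a C compiler keeps structure tags in a separate name space and would still see the outer array.

```cpp
#include <cassert>
#include <cstddef>

int x[99];   // outer-scope object named x

std::size_t size_seen_inside()
{
    struct x { int a; };   // inner-scope structure named x
    return sizeof(x);      // C++: size of the struct; in C this would be the array
}
```

Comparing the result with `99 * sizeof(int)` shows which of the two meanings a given compiler picked.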
B.2.2 C Code That Is Not C++
The C/C++ incompatibilities that cause most real problems are not subtle. Most are easily caught by compilers. This section gives examples of C code that is not C++. Most are deemed poor style or even obsolete in modern C.
In C, most functions can be called without a previous declaration. For example:
main() /* poor style C. Not C++ */
options to enforce it, C code conforms to the C++ rule. Where undeclared functions are called, you have to know the functions and the rules for C pretty well to know whether you have made a mistake or introduced a portability problem. For example, the previous main() contains at least two errors as a C program.
In C, a function declared without specifying any argument types can take any number of arguments of any type at all. Such use is deemed obsolescent in Standard C, but it is not uncommon:
void f(); /* argument types not mentioned */
void f(a,p,c) char *p; char c; { /* ... */ } /* C. Not C++ */
Such definitions must be rewritten:
void f(int a, char* p, char c) { /* ... */ }
In C and in pre-standard versions of C++, the type specifier defaults to int. For example:
const a = 7; /* In C, type int assumed. Not C++ */
C99 disallows ‘‘implicit int,’’ just as in C++.
C allows the definition of structs in return type and argument type declarations. For example:
struct S { int x,y; } f(); /* C. Not C++ */
void g(struct S { int x,y; } y); /* C. Not C++ */
The C++ rules for defining types make such declarations useless, and they are not allowed.
In C, integers can be assigned to variables of enumeration type:
enum Direction { up, down };
enum Direction d = 1; /* error: int assigned to Direction; ok in C */
C++ provides many more keywords than C does. If one of these appears as an identifier in a C program, that program must be modified to make it a C++ program:
In C, some of the C++ keywords are macros defined in standard headers:
This implies that in C they can be tested using #ifdef, redefined, etc.
In C, a global data object may be declared several times in a single translation unit without using the extern specifier. As long as at most one such declaration provides an initializer, the object is considered defined only once. For example:
int i; int i; /* defines or declares a single integer ‘i’; not C++ */
In C++, an entity must be defined exactly once; §9.2.3.
In C++, a class may not have the same name as a typedef declared to refer to a different type in the same scope; §5.7.
In C, a void* may be used as the right-hand operand of an assignment to or initialization of a variable of any pointer type; in C++ it may not (§5.6). For example:
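The typical place this shows up is the result of malloc(). A minimal sketch, assuming a hypothetical helper `make_ints`: in C the initialization would compile without the cast, while C++ requires the conversion from void* to be written out.

```cpp
#include <cassert>
#include <cstdlib>

// Allocates n zero-filled ints with malloc(). In C, "int* p = malloc(...)"
// needs no cast; in C++ the void* result must be converted explicitly.
int* make_ints(std::size_t n)
{
    int* p = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (p)
        for (std::size_t i = 0; i < n; ++i) p[i] = 0;
    return p;   // the caller releases the memory with std::free()
}
```

Where the code is being moved to C++ anyway, replacing the allocation with new or a standard container avoids the cast altogether.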
C allows transfer of control to a labeled-statement (§A.6) to bypass an initialization; C++ does not.
In C, a global const by default has external linkage; in C++ it does not and must be initialized, unless explicitly declared extern (§5.4).
In C, names of nested structures are placed in the same scope as the structure in which they are nested. For example:
The keyword static, which usually means ‘‘statically allocated,’’ can be used to indicate that a function or an object is local to a translation unit. For example:
This program genuinely has two integers called glob. Each glob is used exclusively by functions defined in its translation unit.
The use of static to indicate ‘‘local to translation unit’’ is deprecated in C++. Use unnamed namespaces instead (§8.2.5.1).
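A minimal sketch of the replacement, reusing the name glob from the text; the functions `next_id` and `two_ids_sum` are made up for illustration:

```cpp
#include <cassert>

namespace {              // unnamed namespace: names are local to this translation unit
    int glob = 0;        // takes the place of "static int glob;"
    int next_id() { return ++glob; }
}

// Uses the translation-unit-local glob through next_id().
int two_ids_sum()
{
    int a = next_id();
    int b = next_id();
    return a + b;
}
```

Another translation unit can define its own glob in its own unnamed namespace without any clash at link time.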
The implicit conversion of a string literal to a (non-const) char* is deprecated. Use named arrays of char or avoid assignment of string literals to char*s (§5.2.2).
C-style casts should have been deprecated when the new-style casts were introduced. Programmers should seriously consider banning C-style casts from their own programs. Where explicit type conversion is necessary, static_cast, reinterpret_cast, const_cast, or a combination of these can do what a C-style cast can. The new-style casts should be preferred because they are more explicit and more visible (§6.2.7).
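The division of labor among the three casts can be sketched as follows. The function names are hypothetical; each one does a job that a single C-style cast notation would otherwise cover without saying which kind of conversion was intended:

```cpp
#include <cassert>

// Value conversion between related types; same effect as (int)d.
int truncate_to_int(double d)
{
    return static_cast<int>(d);
}

// Removes const from the pointer; valid only if the object pointed to
// was not itself defined const.
void set_through_const(const int* p, int v)
{
    *const_cast<int*>(p) = v;
}

// Reinterprets the object representation as bytes; the value read
// depends on the implementation's byte order.
unsigned char first_byte(const int* p)
{
    return *reinterpret_cast<const unsigned char*>(p);
}
```

Searching a program for `_cast` finds every one of these conversions, which is part of what ‘‘more visible’’ means in practice.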
B.2.4 C++ Code That Is Not C
This section lists facilities offered by C++ but not by C. The features are sorted by purpose. However, many classifications are possible and most features serve multiple purposes, so this classification should not be taken too seriously.
– Features primarily for notational convenience:
[1] // comments (§2.3); added to C99
[2] Support for restricted character sets (§C.3.1); partially added to C99
[3] Support for extended character sets (§C.3.3); added to C99
[4] Non-constant initializers for objects in static storage (§9.4.1)
[5] const in constant expressions (§5.4, §C.5)
[6] Declarations as statements (§6.3.1); added to C99
[7] Declarations in for-statement initializers (§6.3.3); added to C99
[8] Declarations in conditions (§6.3.2.1)
[9] Structure names need not be prefixed by struct (§5.7)
– Features primarily for strengthening the type system:
[1] Function argument type checking (§7.1); later added to C (§B.2.2)
[2] Type-safe linkage (§9.2, §9.2.3)
[3] Free store management using new and delete (§6.2.6, §10.4.5, §15.6)
[4] const (§5.4, §5.4.1); later added to C
[5] The Boolean type bool (§4.2); partially added to C99
[6] New cast syntax (§6.2.7)
– Facilities for user-defined types:
[1] Classes (Chapter 10)
[2] Member functions (§10.2.1) and member classes (§11.12)
[3] Constructors and destructors (§10.2.3, §10.4.1)
[4] Derived classes (Chapter 12, Chapter 15)
[5] virtual functions and abstract classes (§12.2.6, §12.3)
[6] Public/protected/private access control (§10.2.2, §15.3, §C.11)
[6] Explicit scope qualification (operator ::; §4.9.4)
[7] Exception handling (§8.3, Chapter 14)
[8] Run-time Type Identification (§15.4)
The keywords added by C++ (§B.2.2) can be used to spot most C++-specific facilities. However, some facilities, such as function overloading and consts in constant expressions, are not identified by a keyword. In addition to the features listed, the C++ library (§16.1.2) is mostly C++ specific.
The __cplusplus macro can be used to determine whether a program is being processed by a C or a C++ compiler (§9.2.4).
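A common use of the macro is a declaration that has to be readable by both kinds of compiler. The sketch below assumes a hypothetical function `c_style_add`; only a C++ compiler sees the extern "C" wrapper:

```cpp
#include <cassert>

#ifdef __cplusplus
extern "C" {                     // seen only by a C++ compiler
#endif

int c_style_add(int a, int b);   // declaration shared by C and C++ code

#ifdef __cplusplus
}
#endif

// The definition picks up the C linkage from the declaration above.
int c_style_add(int a, int b) { return a + b; }
```

The declaration part is the kind of text that would normally live in a header included from both C and C++ source files.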
B.3 Coping with Older C++ Implementations
C++ has been in constant use since 1983 (§1.4). Since then, several versions have been defined and many separately developed implementations have emerged. The fundamental aim of the standards effort was to ensure that implementers and users would have a single definition of C++ to work from. Until that definition becomes pervasive in the C++ community, however, we have to deal with the fact that not every implementation provides every feature described in this book.
It is unfortunately not uncommon for people to take their first serious look at C++ using a five-year-old implementation. The typical reason is that such implementations are widely available and free. Given a choice, no self-respecting professional would touch such an antique. For a novice, older implementations come with serious hidden costs. The lack of language features and library support means that the novice must struggle with problems that have been eliminated in newer implementations. Using a feature-poor older implementation also warps the novice’s programming style and gives a biased view of what C++ is. The best subset of C++ to initially learn is not the set of low-level facilities (and not the common C and C++ subset; §1.2). In particular, I recommend relying on the standard library and on templates to ease learning and to get a good initial impression of what C++ programming can be.
The first commercial release of C++ was in late 1985. The language was defined by the first edition of this book. At that point, C++ did not offer multiple inheritance, templates, run-time type information, exceptions, or namespaces. Today, I see no reason to use an implementation that doesn’t provide at least some of these features. I added multiple inheritance, templates, and exceptions to the definition of C++ in 1989. However, early support for templates and exceptions was uneven and often poor. If you find problems with templates or exceptions in an older implementation, consider an immediate upgrade.
In general, it is wise to use an implementation that conforms to the standard wherever possible and to minimize the reliance on implementation-defined and undefined aspects of the language. Design as if the full language were available and then use whatever workarounds are needed. This leads to better organized and more maintainable programs than designing for the lowest-common-denominator subset of C++. Also, be careful to use implementation-specific language extensions only when absolutely necessary.
B.3.1 Headers
Traditionally, every header file had a .h suffix. Thus, C++ implementations provided headers such as <map.h> and <iostream.h>. For compatibility, most still do.
When the standards committee needed headers for redefined versions of standard libraries and for newly added library facilities, naming those headers became a problem. Using the old .h names would have caused compatibility problems. The solution was to drop the .h suffix in standard header names. The suffix is redundant anyway because the < > notation indicates that a standard header is being named.
Thus, the standard library provides non-suffixed headers, such as <iostream> and <map>. The declarations in those files are placed in namespace std. Older headers place their declarations in the global namespace and use a .h suffix. Consider:
There are no fully-satisfactory approaches to dealing with portability in the face of inconsistent headers. A general idea is to avoid direct dependencies on inconsistent headers and localize the remaining dependencies. That is, we try to achieve portability through indirection and localization.
For example, if declarations that we need are provided in different headers in different systems, we may choose to #include an application specific header that in turn #includes the appropriate header(s) for each system. Similarly, if some functionality is provided in slightly different forms on different systems, we may choose to access that functionality through application-specific interface classes and functions.
B.3.2 The Standard Library
Naturally, pre-standard-C++ implementations may lack parts of the standard library. Most will have iostreams, non-templated complex, a different string class, and the C standard library. However, some may lack map, list, valarray, etc. In such cases, use the – typically proprietary – libraries available in a way that will allow conversion when your implementation gets upgraded to the standard. It is usually better to use a non-standard string, list, and map than to revert to C-style programming in the absence of these standard library classes. Also, good implementations of the STL part of the standard library (Chapter 16, Chapter 17, Chapter 18, Chapter 19) are available free for downloading.
Early implementations of the standard library were incomplete. For example, some had containers that didn’t support allocators and others required allocators to be explicitly specified for each class. Similar problems occurred for other ‘‘policy arguments,’’ such as comparison criteria. For example:
list<int> li; // ok, but some implementations require an allocator
list<int,allocator<int> > li2; // ok, but some implementations don’t implement allocators
map<string,Record> m1; // ok, but some implementations require a less-operation
map<string,Record,less<string> > m2;
Use whichever version an implementation accepts. Eventually, the implementations will accept all.
Early C++ implementations provided istrstream and ostrstream defined in <strstream.h> instead of istringstream and ostringstream defined in <sstream>. The strstreams operated directly on a char[] (see §21.10[26]).
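For comparison, a minimal sketch of the standard string streams in use; the helper names `format_point` and `parse_int` are made up for illustration:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats two integers as "(x,y)". The stream grows as needed, so there is
// no char[] buffer to size by hand as with the older strstreams.
std::string format_point(int x, int y)
{
    std::ostringstream os;
    os << '(' << x << ',' << y << ')';
    return os.str();
}

// Reads an integer from the start of s; yields 0 if none can be read.
int parse_int(const std::string& s)
{
    std::istringstream is(s);
    int value = 0;
    is >> value;
    return value;
}
```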
The streams in pre-standard-C++ implementations were not parameterized. In particular, the templates with the basic_ prefix are new in the standard, and the basic_ios class used to be called
In the absence of namespaces, use static to compensate for the lack of unnamed namespaces.
Also use an identifying prefix to global names to distinguish your names from those of other parts
of the code For example:
// for use on pre-namespace implementations:
class bs_string { /* ... */ }; // Bjarne’s string
typedef int bs_bool; // Bjarne’s Boolean type
bad_alloc rather than test for 0. In either case, coping with memory exhaustion beyond giving an error message is hard on many systems.
However, when converting from testing 0 to catching bad_alloc is impractical, you can sometimes modify the program to revert to the pre-exception-handling behavior. If no _new_handler is installed, using the nothrow allocator will cause a 0 to be returned in case of allocation failure:
X* p1 = new X; // throws bad_alloc if no memory
X* p2 = new(nothrow) X; // returns 0 if no memory
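The two styles can be put side by side in a short sketch. The type X here is a hypothetical stand-in with a single member, and the functions return -1 to signal allocation failure purely for the sake of the example:

```cpp
#include <cassert>
#include <new>

struct X {
    int v;
    X() : v(7) {}
};

// Pre-exception style: test the pointer instead of catching std::bad_alloc.
int use_nothrow()
{
    X* p = new(std::nothrow) X;   // yields a null pointer on failure
    if (p == 0) return -1;
    int result = p->v;
    delete p;
    return result;
}

// Standard style: failure is reported by an exception.
int use_throwing()
{
    try {
        X* p = new X;
        int result = p->v;
        delete p;
        return result;
    }
    catch (const std::bad_alloc&) {
        return -1;
    }
}
```

Code that already tests every result of new against 0 can often be converted by changing only the allocation expressions to the nothrow form.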
B.3.5 Templates
The standard introduced new template features and clarified the rules for several existing ones.
If your implementation doesn’t support partial specialization, use a separate name for the template that would otherwise have been a specialization. For example:
template<class T> class plist : private list<void*> { // should have been list<T*>
// ...
};
If your implementation doesn’t support member templates, some techniques become infeasible. In particular, member templates allow the programmer to specify construction and conversion with a flexibility that cannot be matched without them (§13.6.2). Sometimes, providing a nonmember function that constructs an object is an alternative. Consider:
functions (§C.13.9.1). The solution is to place the definition of the member functions after the class declaration. For example, rather than
Container<Glob> cg; // no problem as long as cg.sort() isn’t called
Early implementations of C++ did not handle the use of members defined later in a class. For example:
map<string,int> m; // Oops: default template arguments not implemented
map<string,int,less<string> > m2; // workaround: be explicit
Such code used to work because in the original definition of C++, the scope of the controlled variable extended to the end of the scope in which the for-statement appears. If you find such code, simply declare the controlled variable before the for-statement:
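A minimal sketch of that rewrite, using a hypothetical search function `find_index`: the loop variable is needed after the loop, so it is declared before the for-statement and the code is valid under both the old and the standard scope rules.

```cpp
#include <cassert>

// Returns the index of the first element equal to key, or n if there is none.
int find_index(const int* v, int n, int key)
{
    int i;                         // declared before the for-statement,
    for (i = 0; i < n; ++i)        // so it is still in scope after the loop
        if (v[i] == key) break;
    return i;
}
```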
[4] Avoid deprecated features such as global statics; also avoid C-style casts; §6.2.7, §B.2.3.
[5] ‘‘implicit int’’ has been banned, so explicitly specify the type of every function, variable, const, etc.; §B.2.2.
[6] When converting a C program to C++, first make sure that function declarations (prototypes) and standard headers are used consistently; §B.2.2.
[7] When converting a C program to C++, rename variables that are C++ keywords; §B.2.2
[8] When converting a C program to C++, cast the result of malloc() to the proper type or change all uses of malloc() to uses of new; §B.2.2.
[9] When converting from malloc() and free() to new and delete, consider using vector,
[11] A facility defined in namespace std is defined in a header without a suffix (e.g. std::cout is declared in <iostream>). Older implementations have standard library facilities in the global namespace and declared in headers with a .h suffix (e.g. ::cout declared in <iostream.h>); §9.2.2, §B.3.1.
[12] If older code tests the result of new against 0, it must be modified to catch bad_alloc or to use new(nothrow); §B.3.4.
[13] If your implementation doesn’t support default template arguments, provide arguments explicitly; typedefs can often be used to avoid repetition of template arguments (similar to the way the typedef string saves you from saying basic_string< char, char_traits<char>, allocator<char> >); §B.3.5.
[14] Use <string> to get std::string (<string.h> holds the C-style string functions); §9.2.2, §B.3.1.
[15] For each standard C header <X.h> that places names in the global namespace, the header <cX> places the names in namespace std; §B.3.1.
[16] Many systems have a "String.h" header defining a string type. Note that such strings differ from the standard library string.
[17] Prefer standard facilities to non-standard ones; §20.1, §B.3, §C.2
[18] Use extern "C" when declaring C functions; §9.2.4.
B.5 Exercises
1. (∗2.5) Take a C program and convert it to a C++ program; list the kinds of non-C++ constructs used and determine if they are valid ANSI C constructs. First convert the program to strict ANSI C (adding prototypes, etc.), then to C++. Estimate the time it would take to convert a 100,000 line C program to C++.
2. (∗2.5) Write a program to help convert C programs to C++ by renaming variables that are C++ keywords, replacing calls of malloc() by uses of new, etc. Hint: don’t try to do a perfect job.
3. (∗2) Replace all uses of malloc() in a C-style C++ program (maybe a recently converted C program) to uses of new. Hint: §B.4[8-9].
4. (∗2.5) Minimize the use of macros, global variables, uninitialized variables, and casts in a C-style C++ program (maybe a recently converted C program).
5. (∗3) Take a C++ program that is the result of a crude conversion from C and critique it as a C++ program considering locality of information, abstraction, readability, extensibility, and potential for reuse of parts. Make one significant change to the program based on that critique.
6. (∗2) Take a small (say, 500 line) C++ program and convert it to C. Compare the original with the result for size and probable maintainability.
7. (∗3) Write a small set of test programs to determine whether a C++ implementation has ‘‘the latest’’ standard features. For example, what is the scope of a variable defined in a for-statement initializer? (§B.3.6), are default template arguments supported? (§B.3.5), are member templates supported? (§13.6.2), and is argument-based lookup supported? (§8.2.6). Hint: §B.2.4.
8. (∗2.5) Take a C++ program that uses <X.h> headers and convert it to using <X> and <cX> headers. Minimize the use of using-directives.
What the standard promises – character sets – integer literals – constant expressions – promotions and conversions – multidimensional arrays – fields and unions – memory management – garbage collection – namespaces – access control – pointers to data members – templates – static members – friends – templates as template parameters – template argument deduction – typename and template qualification – instantiation – name binding – templates and namespaces – explicit instantiation – advice.
C.1 Introduction and Overview
This chapter presents technical details and examples that do not fit neatly into my presentation of the main C++ language features and their uses. The details presented here can be important when you are writing a program and essential when reading code written using them. However, I consider them technical details that should not be allowed to distract from the student’s primary task of learning to use C++ well or the programmer’s primary task of expressing ideas as clearly and as directly as possible in C++.
C.2 The Standard
Contrary to common belief, strictly adhering to the C++ language and library standard doesn’t guarantee good code or even portable code. The standard doesn’t say whether a piece of code is good or bad; it simply says what a programmer can and cannot rely on from an implementation. One can write perfectly awful standard-conforming programs, and most real-world programs rely on features not covered by the standard.
Many important things are deemed implementation-defined by the standard. This means that each implementation must provide a specific, well-defined behavior for a construct and that behavior must be documented. For example:
unsigned char c1 = 64; // well-defined: a char has at least 8 bits and can always hold 64
unsigned char c2 = 1256; // implementation-defined: truncation if a char has only 8 bits
The initialization of c1 is well-defined because a char must be at least 8 bits. However, the behavior of the initialization of c2 is implementation-defined because the number of bits in a char is implementation-defined. If the char has only 8 bits, the value 1256 will be truncated to 232 (§C.6.2.1). Most implementation-defined features relate to differences in the hardware used to run a program.
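The arithmetic behind the truncation can be checked directly: 1256 = 4 × 256 + 232, so with 8-bit chars the stored value is 232. A minimal sketch, with hypothetical helper names, that performs the conversion and the same computation by hand:

```cpp
#include <cassert>
#include <climits>

// Converts to unsigned char; the result is the value modulo 2^CHAR_BIT.
unsigned char to_uchar(int value)
{
    return static_cast<unsigned char>(value);
}

// The same reduction written out explicitly for non-negative values.
int expected_after_truncation(int value)
{
    return value % (UCHAR_MAX + 1);
}
```

On an implementation with wider chars the two functions still agree with each other, but the result for 1256 is no longer 232, which is exactly the implementation-defined part.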
When writing real-world programs, it is usually necessary to rely on implementation-defined behavior. Such behavior is the price we pay for the ability to operate effectively on a large range of systems. For example, the language would have been much simpler if all characters had been 8 bits and all integers 32 bits. However, 16-bit and 32-bit character sets are not uncommon – nor are integers too large to fit in 32 bits. For example, many computers now have disks that hold more than 32G bytes, so 48-bit or 64-bit integers can be useful for representing disk addresses.
To maximize portability, it is wise to be explicit about what implementation-defined features we rely on and to isolate the more subtle examples in clearly marked sections of a program. A typical example of this practice is to present all dependencies on hardware sizes in the form of constants and type definitions in some header file. To support such techniques, the standard library provides numeric_limits (§22.2).
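What such a header might contain is sketched below. The names `disk_address`, `char_bits`, and `addressable` are assumptions made for the example, not facilities from the text or the library; only numeric_limits itself is standard:

```cpp
#include <cassert>
#include <limits>

// Hypothetical application header: every size assumption lives in one place.
typedef long disk_address;   // assumption: switch to a wider type where long is too small

// Number of value bits in a char on this implementation.
const int char_bits = std::numeric_limits<unsigned char>::digits;

// True if a disk of the given size in bytes can be addressed with disk_address.
bool addressable(double bytes)
{
    return bytes <= static_cast<double>(std::numeric_limits<disk_address>::max());
}
```

Porting to a system with different sizes then means revisiting this one file rather than hunting for assumptions scattered through the program.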
Undefined behavior is nastier. A construct is deemed undefined by the standard if no reasonable behavior is required by an implementation. Typically, some obvious implementation technique will cause a program using an undefined feature to behave very badly. For example:
It is worth spending considerable time and effort to ensure that a program does not use something deemed undefined by the standard. In many cases, tools exist to help do this.
C.3 Character Sets
The examples in this book are written using the U.S. variant of the international 7-bit character set ISO 646-1983 called ASCII (ANSI3.4-1968). This can cause three problems for people who use C++ in an environment with a different character set:
[1] ASCII contains punctuation characters and operator symbols – such as ], {, and ! – that are not available in some character sets.
[2] We need a notation for characters that do not have a convenient character representation (e.g., newline and ‘‘the character with value 17’’).
[3] ASCII doesn’t contain characters – such as ζ, æ, and Π – that are used for writing languages other than English.
C.3.1 Restricted Character Sets
The ASCII special characters [, ], {, }, |, and \ occupy character set positions designated as alphabetic by ISO. In most European national ISO-646 character sets, these positions are occupied by letters not found in the English alphabet. For example, the Danish national character set uses them for the vowels Æ, æ, Ø, ø, Å, and å. No significant amount of text can be written in Danish without them.
A set of trigraphs is provided to allow national characters to be expressed in a portable way using a truly standard minimal character set. This can be useful for interchange of programs, but it doesn’t make it easier for people to read programs. Naturally, the long-term solution to this problem is for C++ programmers to get equipment that supports both their native language and C++ well. Unfortunately, this appears to be infeasible for some, and the introduction of new equipment can be a frustratingly slow process. To help programmers stuck with incomplete character sets, C++ provides alternatives:
´??<´
Some people prefer the keywords such as and to their traditional operator notation.
Despite their appearance, these are single characters.
It is possible to represent a character as a one-, two-, or three-digit octal number (\ followed by octal digits) or as a hexadecimal number (\x followed by hexadecimal digits). There is no limit to the number of hexadecimal digits in the sequence. A sequence of octal or hexadecimal digits is terminated by the first character that is not an octal digit or a hexadecimal digit, respectively. For example:
This makes it possible to represent every character in the machine’s character set and, in particular, to embed such characters in character strings (see §5.2.2). Using any numeric notation for characters makes a program nonportable across machines with different character sets.
It is possible to enclose more than one character in a character literal, for example ’ab’. Such uses are archaic, implementation-dependent, and best avoided.
When embedding a numeric constant in a string using the octal notation, it is wise always to use three digits for the number. The notation is hard enough to read without having to worry about whether or not the character after a constant is a digit. For hexadecimal constants, use two digits. Consider these examples:
char v1[] = "a\xah\129";    // 6 chars: 'a' '\xa' 'h' '\12' '9' '\0'
char v2[] = "a\xah\127";    // 5 chars: 'a' '\xa' 'h' '\127' '\0'
char v3[] = "a\xad\127";    // 4 chars: 'a' '\xad' '\127' '\0'
char v4[] = "a\xad\0127";   // 5 chars: 'a' '\xad' '\012' '7' '\0'
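The character counts in these comments can be checked mechanically. What follows is only a minimal sketch, assuming a compiler that accepts the later static_assert facility; the helper total_chars() is a hypothetical name introduced for illustration.

```cpp
#include <cstddef>

// The four literals from the text. sizeof counts every char in the
// array, including the terminating '\0'.
const char v1[] = "a\xah\129";    // 'a' '\xa' 'h' '\12' '9' '\0'
const char v2[] = "a\xah\127";    // 'a' '\xa' 'h' '\127' '\0'
const char v3[] = "a\xad\127";    // 'a' '\xad' '\127' '\0'
const char v4[] = "a\xad\0127";   // 'a' '\xad' '\012' '7' '\0'

static_assert(sizeof(v1) == 6, "v1 holds 6 chars");
static_assert(sizeof(v2) == 5, "v2 holds 5 chars");
static_assert(sizeof(v3) == 4, "v3 holds 4 chars");
static_assert(sizeof(v4) == 5, "v4 holds 5 chars");

// Hypothetical helper: total number of chars in the four arrays.
std::size_t total_chars()
{
    return sizeof(v1) + sizeof(v2) + sizeof(v3) + sizeof(v4);
}
```

If a digit or hexadecimal letter were added directly after one of the escapes, the corresponding static_assert would fail at compile time, which is exactly the kind of surprise the three-digit and two-digit conventions guard against.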
C.3.3 Large Character Sets
A C++ program may be written and presented to the user in character sets that are much richer than the 127 character ASCII set. Where an implementation supports larger character sets, identifiers, comments, character constants, and strings may contain characters such as å, β, and Γ. However, to be portable the implementation must map these characters into an encoding using only characters available to every C++ user. In principle, this translation into the C++ basic source character set (the set used in this book) occurs before the compiler does any other processing. Therefore, it does not affect the semantics of the program.
The standard encoding of characters from large character sets into the smaller set supported directly by C++ is presented as sequences of four or eight hexadecimal digits:
A programmer can use these character encodings directly. However, they are primarily meant as a way for an implementation that internally uses a small character set to handle characters from a large character set seen by the programmer.
If you rely on special environments to provide an extended character set for use in identifiers, the program becomes less portable. A program is hard to read unless you understand the natural language used for identifiers and comments. Consequently, for programs used internationally it is usually best to stick to English and ASCII.
C.3.4 Signed and Unsigned Characters
It is implementation-defined whether a plain char is considered signed or unsigned. This opens the possibility for some nasty surprises and implementation dependencies. For example:
char c = 255;   // 255 is ‘‘all ones,’’ hexadecimal 0xFF
int i = c;
What will be the value of i? Unfortunately, the answer is undefined. On all implementations I know of, the answer depends on the meaning of the ‘‘all ones’’ char bit pattern when extended into an int. On an SGI Challenge machine, a char is unsigned, so the answer is 255. On a Sun SPARC or an IBM PC, where a char is signed, the answer is -1. In this case, the compiler might warn about the conversion of the literal 255 to the char value -1. However, C++ does not offer a general mechanism for detecting this kind of problem. One solution is to avoid plain char and use the specific char types only. Unfortunately, some standard library functions, such as strcmp(), take plain chars only (§20.4.1).
A char must behave identically to either a signed char or an unsigned char. However, the three char types are distinct, so you can’t mix pointers to different char types. For example:
None of these potential problems occurs if you use plain char throughout.
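The dependence on the signedness of plain char can be made visible with a few small functions. This is a minimal sketch, assuming 8-bit chars; the names widen(), widen_unsigned(), and plain_char_is_signed() are hypothetical and not part of any library.

```cpp
#include <limits>

// Widen a plain char to int. For the "all ones" pattern the result
// depends on whether plain char is signed on this implementation.
int widen(char c) { return c; }

// Widen through unsigned char: the result is never negative,
// whatever the signedness of plain char.
int widen_unsigned(char c) { return static_cast<unsigned char>(c); }

// Report what this implementation chose for plain char.
bool plain_char_is_signed() { return std::numeric_limits<char>::is_signed; }
```

Going through unsigned char is one common way to get the same int value on every implementation when a char is used as a small number or as an index.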
C.4 Types of Integer Literals
In general, the type of an integer literal depends on its form, value, and suffix:
– If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int, unsigned long int.
– If it is octal or hexadecimal and has no suffix, it has the first of these types in which its value can be represented: int, unsigned int, long int, unsigned long int.
– If it is suffixed by u or U, its type is the first of these types in which its value can be represented: unsigned int, unsigned long int.
– If it is suffixed by l or L, its type is the first of these types in which its value can be represented: long int, unsigned long int.
– If it is suffixed by ul, lu, uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.
For example, 100000 is of type int on a machine with 32-bit ints but of type long int on a machine with 16-bit ints and 32-bit longs. Similarly, 0XA000 is of type int on a machine with 32-bit ints but of type unsigned int on a machine with 16-bit ints. These implementation dependencies can be avoided by using suffixes: 100000L is of type long int on all machines and 0XA000U is of type unsigned int on all machines.
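The effect of the suffixes can be checked at compile time. The sketch below assumes a compiler that provides decltype, static_assert, and <type_traits>, none of which is part of the language as described in this book; the two helper functions are hypothetical names.

```cpp
#include <limits>
#include <type_traits>

// Suffixed literals have the same type on every implementation.
static_assert(std::is_same<decltype(100000L), long>::value,
              "L suffix gives long int");
static_assert(std::is_same<decltype(0XA000U), unsigned int>::value,
              "U suffix gives unsigned int");
static_assert(std::is_same<decltype(1UL), unsigned long>::value,
              "UL suffix gives unsigned long int");

// The unsuffixed literal is an int only where an int is wide enough.
bool unsuffixed_is_int()
{
    return std::is_same<decltype(100000), int>::value;
}

// True if an int can hold the value 100000 on this implementation.
bool int_can_hold_100000()
{
    return std::numeric_limits<int>::max() >= 100000;
}
```

On a machine with 32-bit ints both functions return true; on a machine with 16-bit ints both would return false, which is the implementation dependency the suffixes remove.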
C.5 Constant Expressions
In places such as array bounds (§5.2), case labels (§6.3.2), and initializers for enumerators (§4.8), C++ requires a constant expression. A constant expression evaluates to an integral or enumeration constant. Such an expression is composed of literals (§4.3.1, §4.4.1, §4.5.1), enumerators (§4.8), and consts initialized by constant expressions. In a template, an integer template parameter can also be used (§C.13.3). Floating literals (§4.5.1) can be used only if explicitly converted to an integral type. Functions, class objects, pointers, and references can be used as operands to the sizeof operator (§6.2) only.
Intuitively, constant expressions are simple expressions that can be evaluated by the compiler before the program is linked (§9.1) and starts to run.
C.6 Implicit Type Conversion
Integral and floating-point types (§4.1.1) can be mixed freely in assignments and expressions. Wherever possible, values are converted so as not to lose information. Unfortunately, value-destroying conversions are also performed implicitly. This section provides a description of conversion rules, conversion problems, and their resolution.
C.6.1 Promotions
The implicit conversions that preserve values are commonly referred to as promotions. Before an arithmetic operation is performed, integral promotion is used to create ints out of shorter integer types. Note that these promotions will not promote to long (unless the operand is a wchar_t or an enumeration that is already larger than an int). This reflects the original purpose of these promotions in C: to bring operands to the ‘‘natural’’ size for arithmetic operations.
The integral promotions are:
bit-field. Otherwise, no integral promotion applies to it.
– A bool is converted to an int; false becomes 0 and true becomes 1.
Promotions are used as part of the usual arithmetic conversions (§C.6.3).
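The result type of a promoted operand can be observed directly. This is a minimal sketch, assuming a compiler with decltype and <type_traits> and an implementation where int can represent every char value (true on all common machines); as_int() is a hypothetical helper.

```cpp
#include <type_traits>

// Operands shorter than int are promoted to int before arithmetic.
static_assert(std::is_same<decltype(short() + short()), int>::value,
              "short + short is computed as int");
static_assert(std::is_same<decltype(char() + char()), int>::value,
              "char + char is computed as int");
static_assert(std::is_same<decltype(+true), int>::value,
              "unary + promotes bool to int");

// A bool converts to int: false becomes 0 and true becomes 1.
int as_int(bool b) { return b; }
```

Note that none of these expressions yields a long; promotion stops at int unless the operand is already larger.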
C.6.2 Conversions
The fundamental types can be converted into each other in a bewildering number of ways. In my opinion, too many conversions are allowed. For example:
C.6.2.1 Integral Conversions
An integer can be converted to another integer type. An enumeration value can be converted to an integer type.
If the destination type is unsigned, the resulting value is simply as many bits from the source as will fit in the destination (high-order bits are thrown away if necessary). More precisely, the result is the least unsigned integer congruent to the source integer modulo 2 to the nth, where n is the number of bits used to represent the unsigned type. For example:
unsigned char uc = 1023;   // binary 1111111111: uc becomes binary 11111111; that is, 255
If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined:
signed char sc = 1023;   // implementation-defined
Plausible results are 127 and -1 (§C.3.4).
A Boolean or enumeration value can be implicitly converted to its integer equivalent (§4.2, §4.8).
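The modulo rule for unsigned destinations can be compared with the same computation done by hand. This is a minimal sketch, assuming that unsigned char is narrower than int (8 bits on most machines); to_uchar() and modulo_uchar() are hypothetical names.

```cpp
#include <climits>

// Conversion to an unsigned type keeps the value modulo 2 to the nth,
// where n is the number of bits in the destination type.
unsigned char to_uchar(int i) { return static_cast<unsigned char>(i); }

// The same result computed arithmetically, for comparison.
int modulo_uchar(int i)
{
    const int m = UCHAR_MAX + 1;   // 2 to the nth; 256 for 8-bit chars
    return ((i % m) + m) % m;      // least non-negative residue
}
```

For 8-bit chars, to_uchar(1023) yields 255 and to_uchar(-1) yields UCHAR_MAX, in agreement with modulo_uchar(). No such guarantee exists for a signed destination.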
C.6.2.2 Floating-Point Conversions
A floating-point value can be converted to another floating-point type. If the source value can be exactly represented in the destination type, the result is the original numeric value. If the source value is between two adjacent destination values, the result is one of those values. Otherwise, the behavior is undefined. For example:
float f = FLT_MAX;     // largest float value
double d = f;          // ok: d == f
float f2 = d;          // ok: f2 == f
double d3 = DBL_MAX;   // largest double value
float f3 = d3;         // undefined if FLT_MAX<DBL_MAX
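The undefined case can be avoided by testing the range before narrowing. The following is a minimal sketch; fits_in_float() and to_float_or() are hypothetical helpers, not library functions.

```cpp
#include <cfloat>

// True if d lies within the range of finite float values.
// A NaN fails both comparisons and is therefore reported as not fitting.
bool fits_in_float(double d)
{
    return -FLT_MAX <= d && d <= FLT_MAX;
}

// Narrow d to float if it fits; otherwise return the supplied fallback.
float to_float_or(double d, float fallback)
{
    return fits_in_float(d) ? static_cast<float>(d) : fallback;
}
```

A value that passes the test may still be rounded to one of the two adjacent float values; the check protects only against leaving the representable range.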
C.6.2.3 Pointer and Reference Conversions
Any pointer to an object type can be implicitly converted to a void* (§5.6). A pointer (reference) to a derived class can be implicitly converted to a pointer (reference) to an accessible and unambiguous base (§12.2). Note that a pointer to function or a pointer to member cannot be implicitly converted to a void*.
A constant expression (§C.5) that evaluates to 0 can be implicitly converted to any pointer or pointer to member type (§5.1.1). For example:
value of int(1.6) is 1. The behavior is undefined if the truncated value cannot be represented in the destination type. For example:
int i = 2.7;       // i becomes 2
char b = 2000.7;   // undefined for 8-bit chars: 2000 cannot be represented as an 8-bit char
Conversions from integer to floating types are as mathematically correct as the hardware allows. Loss of precision occurs if an integral value cannot be represented exactly as a value of the floating type. For example,
long int to char. However, general compile-time detection is impractical, so the programmer must be careful. When ‘‘being careful’’ isn’t enough, the programmer can insert explicit checks. For example:
To truncate in a way that is guaranteed to be portable requires the use of numeric_limits (§22.2).
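One way to use numeric_limits for such a check is to compare the value with the limits of the destination type before converting. This is only a sketch; fits_in_char() and checked_char() are hypothetical names, and throwing std::out_of_range is one possible way of reporting the failure.

```cpp
#include <limits>
#include <stdexcept>

// True if i can be represented as a plain char on this implementation.
bool fits_in_char(int i)
{
    return std::numeric_limits<char>::min() <= i
        && i <= std::numeric_limits<char>::max();
}

// Convert an int to char, refusing values a char cannot represent.
char checked_char(int i)
{
    if (!fits_in_char(i))
        throw std::out_of_range("int value does not fit in char");
    return static_cast<char>(i);
}
```

Because the limits are taken from the implementation, the same source works whether plain char is signed or unsigned and whatever its size.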
C.6.3 Usual Arithmetic Conversions
These conversions are performed on the operands of a binary operator to bring them to a common type, which is then used as the type of the result:
[1] If either operand is of type long double, the other is converted to long double.
– Otherwise, if either operand is double, the other is converted to double.
– Otherwise, if either operand is float, the other is converted to float.
– Otherwise, integral promotions (§C.6.1) are performed on both operands.
[2] Then, if either operand is unsigned long, the other is converted to unsigned long.
– Otherwise, if one operand is a long int and the other is an unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int is converted to a long int; otherwise, both operands are converted to unsigned long int.
– Otherwise, if either operand is long, the other is converted to long.
– Otherwise, if either operand is unsigned, the other is converted to unsigned.
– Otherwise, both operands are int.
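Several of these rules can be observed by asking the compiler for the type of a mixed expression. The sketch assumes a compiler with decltype and <type_traits>; the function name is hypothetical. The case of long int with unsigned int is left out because its result depends on the sizes of the two types.

```cpp
#include <type_traits>

static_assert(std::is_same<decltype(1.0 + 1.0L), long double>::value,
              "double with long double gives long double");
static_assert(std::is_same<decltype(1.0f + 1.0), double>::value,
              "float with double gives double");
static_assert(std::is_same<decltype(1 + 1.0f), float>::value,
              "int with float gives float");
static_assert(std::is_same<decltype(1L + 1UL), unsigned long>::value,
              "long with unsigned long gives unsigned long");
static_assert(std::is_same<decltype(1 + 1U), unsigned int>::value,
              "int with unsigned gives unsigned");

// A consequence worth remembering: i is converted to unsigned before
// the comparison, so -1 becomes a very large value.
bool minus_one_less_than_one_unsigned()
{
    int i = -1;
    unsigned int u = 1;
    return i < u;
}
```

The function returns false, which surprises many programmers; most compilers can warn about comparisons that mix signed and unsigned operands.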
C.7 Multidimensional Arrays
It is not uncommon to need a vector of vectors, a vector of vector of vectors, etc. The issue is how to represent these multidimensional vectors in C++. Here, I first show how to use the standard library vector class. Next, I present multidimensional arrays as they appear in C and C++ programs using only built-in facilities.
C.7.1 Vectors
The standard vector (§16.3) provides a very general solution:
vector< vector<int> > m;
This creates a vector of vectors of integers that initially contains no elements. We could initialize it to a three-by-five matrix like this:
It is not necessary for the vector<int>s in the vector< vector<int> > to have the same size.
Accessing an element is done by indexing twice. For example, m[i][j] is the jth element of the ith vector. We can print m like this:
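As a hedged sketch of both steps, the functions below build a matrix of a given shape and print it by indexing twice. The names make_matrix() and print_matrix() are hypothetical, and passing the output stream as an argument is a choice made here for illustration.

```cpp
#include <cstddef>
#include <ostream>
#include <vector>

// Build a rows-by-cols matrix in which every element is 0.
std::vector< std::vector<int> > make_matrix(std::size_t rows, std::size_t cols)
{
    return std::vector< std::vector<int> >(rows, std::vector<int>(cols));
}

// Print one row per line. Each row supplies its own size, so the rows
// need not have the same number of elements.
void print_matrix(const std::vector< std::vector<int> >& m, std::ostream& os)
{
    for (std::size_t i = 0; i < m.size(); ++i) {
        for (std::size_t j = 0; j < m[i].size(); ++j) os << m[i][j] << '\t';
        os << '\n';
    }
}
```

A three-by-five matrix is obtained with make_matrix(3,5) and can be written to any ostream, for example print_matrix(m,std::cout). No dimension needs to be remembered separately because each vector knows its own size.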
int ma[3][5];   // 3 arrays with 5 ints each
For arrays, the dimensions must be given as part of the definition. We can initialize ma like this:
The array ma is simply 15 ints that we access as if it were 3 arrays of 5 ints. In particular, there is no single object in memory that is the matrix ma – only the elements are stored. The dimensions 3 and 5 exist in the compiler source only. When we write code, it is our job to remember them somehow and supply the dimensions where needed. For example, we might print ma like this:
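A minimal sketch of such code follows. The element values are assumptions chosen so that each element shows its own position; sum_ma() is a hypothetical helper added only to make the traversal easy to check.

```cpp
#include <ostream>

// 3 arrays with 5 ints each; element [i][j] holds 10*i+j.
int ma[3][5] = {
    {  0,  1,  2,  3,  4 },
    { 10, 11, 12, 13, 14 },
    { 20, 21, 22, 23, 24 }
};

// The dimensions 3 and 5 must be repeated by hand in the loops.
void print_ma(std::ostream& os)
{
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 5; ++j) os << ma[i][j] << '\t';
        os << '\n';
    }
}

// Visit every element once, using the same hand-written bounds.
int sum_ma()
{
    int s = 0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 5; ++j) s += ma[i][j];
    return s;
}
```

Nothing in the array itself records the numbers 3 and 5, so a change to the definition must be matched by a change to every loop that traverses it.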
int bad[3,5];          // error: comma not allowed in constant expression
int good[3][5];        // 3 arrays with 5 ints each
int ouch = good[1,4];  // error: int initialized by int* (good[1,4] means good[4], which is an int*)
int nice = good[1][4];
C.7.3 Passing Multidimensional Arrays
Consider defining a function to manipulate a two-dimensional matrix. If the dimensions are known at compile time, there is no problem:
A matrix represented as a multidimensional array is passed as a pointer (rather than copied; §5.3). The first dimension of an array is irrelevant to the problem of finding the location of an element; it simply states how many elements (here 3) of the appropriate type (here int[5]) are present. For example, look at the previous representation of ma and note that by our knowing only that the second dimension is 5, we can locate ma[i][5] for any i. The first dimension can therefore be
Note the use of &v[0][0] for the last call; v[0] would do because it is equivalent, but v would be a type error. This kind of subtle and messy code is best hidden. If you must deal directly with multidimensional arrays, consider encapsulating the code relying on it. In that way, you might ease the task of the next programmer to touch the code. Providing a multidimensional array type with a proper subscripting operator saves most users from having to worry about the layout of the data in the array (§22.4.6).
The standard vector (§16.3) doesn’t suffer from these problems.
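The two styles of passing a matrix discussed in this section can be put side by side. This is a hedged sketch rather than a reproduction of the section’s own functions; sum_m35(), sum_mi(), and sums_agree() are hypothetical names, and the hand-computed subscript relies on the contiguous layout of the array described above.

```cpp
// Second dimension fixed at compile time; the first is passed as an
// argument, so any number of int[5] rows can be handled.
int sum_m35(int m[][5], int dim1)
{
    int s = 0;
    for (int i = 0; i < dim1; ++i)
        for (int j = 0; j < 5; ++j) s += m[i][j];
    return s;
}

// Both dimensions variable: the matrix is passed as a pointer to its
// first int, and the position of each element is computed by hand.
int sum_mi(int* m, int dim1, int dim2)
{
    int s = 0;
    for (int i = 0; i < dim1; ++i)
        for (int j = 0; j < dim2; ++j) s += m[i*dim2 + j];   // obscure
    return s;
}

// Call both functions on the same 3-by-5 matrix.
bool sums_agree()
{
    int v[3][5] = {
        {  0,  1,  2,  3,  4 },
        { 10, 11, 12, 13, 14 },
        { 20, 21, 22, 23, 24 }
    };
    return sum_m35(v, 3) == 180 && sum_mi(&v[0][0], 3, 5) == 180;
}
```

The second function accepts any shape but offers no protection against a caller who supplies the wrong dimensions, which is one more reason to hide such code behind a proper matrix type.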
C.8 Saving Space
When programming nontrivial applications, there often comes a time when you want more memory space than is available or affordable. There are two ways of squeezing more space out of what is available:
[1] Put more than one small object into a byte.
[2] Use the same space to hold different objects at different times.
The former can be achieved by using fields, and the latter by using unions. These constructs are described in the following sections. Many uses of fields and unions are pure optimizations, and these optimizations are often based on nonportable assumptions about memory layouts. Consequently, the programmer should think twice before using them. Often, a better approach is to change the way data is managed, for example, to rely more on dynamically allocated store (§6.2.6) and less on preallocated (static) storage.
C.8.1 Fields
It seems extravagant to use a whole byte (a char or a bool) to represent a binary variable – for example, an on/off switch – but a char is the smallest object that can be independently allocated and addressed in C++ (§5.1). It is possible, however, to bundle several such tiny variables together as fields in a struct. A member is defined to be a field by specifying the number of bits it is to occupy. Unnamed fields are allowed. They do not affect the meaning of the named fields, but they can be used to make the layout better in some machine-dependent way:
struct PPN {   // R6000 Physical Page Number
bool field really can be represented by a single bit. In an operating system kernel or in a debugger, the type PPN might be used like this:
on most machines. Programs have been known to shrink significantly when binary variables were converted from bit fields to characters! Furthermore, it is typically much faster to access a char or an int than to access a field. Fields are simply a convenient shorthand for using bitwise logical operators (§6.2.4) to extract information from and insert information into part of a word.
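The equivalence between fields and bitwise logical operators can be seen by writing the same small record both ways. This is a minimal sketch using a hypothetical Flags type, not the PPN layout; the shift and mask values are assumptions of the sketch, since the actual placement of fields within a word is implementation-defined.

```cpp
// A hypothetical set of small values packed as fields.
struct Flags {
    unsigned int dirty : 1;
    unsigned int valid : 1;
    unsigned int       : 2;   // unnamed field used as padding
    unsigned int kind  : 3;   // small integer in the range 0..7
};

// The same "kind" value kept in an ordinary word and handled with
// bitwise logical operators.
const unsigned int kind_shift = 4;
const unsigned int kind_mask  = 7u << kind_shift;

unsigned int set_kind(unsigned int word, unsigned int kind)
{
    return (word & ~kind_mask) | ((kind << kind_shift) & kind_mask);
}

unsigned int get_kind(unsigned int word)
{
    return (word & kind_mask) >> kind_shift;
}

// Store a value in the field and read it back.
unsigned int roundtrip_kind(unsigned int k)
{
    Flags f = Flags();   // all fields zero
    f.kind = k;          // values above 7 are reduced modulo 8
    return f.kind;
}
```

The field version is shorter to write, but the compiler has to generate much the same masking and shifting as the explicit version, which is why fields rarely save code space.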
This leaves all code using an Entry unchanged.
Using a union so that its value is always read using the member through which it was written is a pure optimization. However, it is not always easy to ensure that a union is used in this way only, and subtle errors can be introduced through misuse. To avoid errors, one can encapsulate a union so that the correspondence between a type field and access to the union members can be guaranteed.
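A minimal sketch of such encapsulation is a class that keeps the union and its type field private and checks the type field on every read. The class Value and its members are hypothetical names, and reporting misuse by throwing std::logic_error is only one possible design.

```cpp
#include <stdexcept>

// Hypothetical value type holding either an int or a pointer to chars.
class Value {
public:
    enum Kind { is_int, is_str };

    explicit Value(int i) : k(is_int) { u.i = i; }
    explicit Value(const char* s) : k(is_str) { u.s = s; }

    Kind kind() const { return k; }
    int as_int() const { check(is_int); return u.i; }
    const char* as_str() const { check(is_str); return u.s; }

private:
    // Refuse to read a member other than the one last written.
    void check(Kind expected) const
    {
        if (k != expected) throw std::logic_error("Value: wrong member read");
    }

    Kind k;                                  // the type field
    union { int i; const char* s; } u;       // the shared storage
};

// Returns true if reading the wrong member is caught.
bool misuse_detected()
{
    Value v(42);
    try { v.as_str(); }
    catch (const std::logic_error&) { return true; }
    return false;
}
```

Because the only way to reach the union is through the access functions, the type field and the stored member cannot get out of step.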
This is not really a conversion at all. On some machines, an int and an int* do not occupy the same amount of space, while on others, no integer can have an odd address. Such use of a union is dangerous and nonportable, and there is an explicit and portable way of specifying type conversion (§6.2.7).
Unions are occasionally used deliberately to avoid type conversion. One might, for example, use a Fudge to find the representation of the pointer 0:
C.8.3 Unions and Classes
Many nontrivial unions have some members that are much larger than the most frequently-used members. Because the size of a union is at least as large as its largest member, space is wasted. This waste can often be eliminated by using a set of derived classes instead of a union.
A class with a constructor, destructor, or copy operation cannot be the type of a union member (§10.4.12) because the compiler would not know which member to destroy.
C.9 Memory Management
There are three fundamental ways of using memory in C++:
Static memory, in which an object is allocated by the linker for the duration of the program. Global and namespace variables, static class members (§10.2.4), and static variables in functions (§7.1.2) are allocated in static memory. An object allocated in static memory is constructed once and persists to the end of the program. It always has the same address. Static objects can be a problem in programs using threads (shared-address space concurrency) because they are shared and require locking for proper access.
Automatic memory, in which function arguments and local variables are allocated. Each entry into a function or a block gets its own copy. This kind of memory is automatically created and destroyed; hence the name automatic memory. Automatic memory is also said ‘‘to be on the stack.’’ If you absolutely must be explicit about this, C++ provides the redundant keyword auto.
Free store, from which memory for objects is explicitly requested by the program and where a program can free memory again once it is done with it (using new and delete). When a program needs more free store, new requests it from the operating system. Typically, the free store (also called dynamic memory or the heap) grows throughout the lifetime of a program because no memory is ever returned to the operating system for use by other programs.
As far as the programmer is concerned, automatic and static storage are used in simple, obvious, and implicit ways. The interesting question is how to manage the free store. Allocation (using new) is simple, but unless we have a consistent policy for giving memory back to the free store manager, memory will fill up – especially for long-running programs.
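One consistent policy can be sketched as a small handle class whose destructor gives back what its constructor acquired. The class Int_buffer and the counter it keeps are hypothetical, introduced only to make the effect observable; copying is disabled to keep the sketch short.

```cpp
#include <cstddef>

// Hypothetical handle that owns an array of ints on the free store.
class Int_buffer {
public:
    explicit Int_buffer(std::size_t n) : sz(n), p(new int[n]()) { ++live; }
    ~Int_buffer() { delete[] p; --live; }   // memory is given back here

    int& operator[](std::size_t i) { return p[i]; }
    std::size_t size() const { return sz; }

    // Number of Int_buffers currently holding memory.
    static int live_count() { return live; }

private:
    Int_buffer(const Int_buffer&);              // copying not supported
    Int_buffer& operator=(const Int_buffer&);   // in this sketch

    std::size_t sz;
    int* p;
    static int live;
};

int Int_buffer::live = 0;

// The buffer is an automatic object, so its memory is released when
// the function returns, whichever way it returns.
int use_buffer()
{
    Int_buffer b(100);
    b[0] = 7;
    return b[0] + Int_buffer::live_count();   // 7 plus 1 live buffer
}
```

After use_buffer() returns, live_count() is 0 again: the user of the handle never writes delete, so there is nothing to forget.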
The simplest strategy is to use automatic objects to manage corresponding objects in free store. Consequently, many containers are implemented as handles to elements stored in the free store (§25.7). For example, an automatic String (§11.12) manages a sequence of characters on the free store and automatically frees that memory when it itself goes out of scope. All of the standard containers (§16.3, Chapter 17, Chapter 20, §22.4) can be conveniently implemented in this way.
C.9.1 Automatic Garbage Collection
When this regular approach isn’t sufficient, the programmer might use a memory manager that finds unreferenced objects and reclaims their memory in which to store new objects. This is usually called automatic garbage collection, or simply garbage collection. Naturally, such a memory manager is called a garbage collector.
The fundamental idea of garbage collection is that an object that is no longer referred to in a program will not be accessed again, so its memory can be safely reused for some new object. For example:
The standard does not require that an implementation supply a garbage collector, but garbage collectors are increasingly used for C++ in areas where their costs compare favorably to those of manual management of free store. When comparing costs, consider the run time, memory usage, reliability, portability, monetary cost of programming, monetary cost of a garbage collector, and predictability of performance.
// point #1: no pointer to the int exists here
p = reinterpret_cast<int*>(i1|i2);
// now the int is referenced again
}
Often, pointers stored as non-pointers in a program are called ‘‘disguised pointers.’’ In particular, the pointer originally held in p is disguised in the integers i1 and i2. However, a garbage collector need not be concerned about disguised pointers. If the garbage collector runs at point #1, the memory holding the int can be reclaimed. In fact, such programs are not guaranteed to work even if a garbage collector is not used because the use of reinterpret_cast to convert between integers and pointers is at best implementation-defined.
A union that can hold both pointers and non-pointers presents a garbage collector with a special problem. In general, it is not possible to know whether such a union contains a pointer. Consider:
union U {   // union with both pointer and non-pointer members
The safe assumption is that any value that appears in such a union is a pointer value. A clever garbage collector can do somewhat better. For example, it may notice that (for a given implementation) ints are not allocated with odd addresses and that no objects are allocated with an address as low as 8. Noticing this will save the garbage collector from having to assume that objects containing locations 9999 and 8 are used by f().
invokes the destructor for the object pointed to by p (if any). However, reuse of the memory can be postponed until it is collected. Recycling lots of objects at once can help limit fragmentation (§C.9.1.4). It also renders harmless the otherwise serious mistake of deleting an object twice in the important case where the destructor simply deletes memory.
As always, access to an object after it has been deleted is undefined.
C.9.1.3 Destructors
When an object is about to be recycled by a garbage collector, two alternatives exist:
[1] Call the destructor (if any) for the object.
[2] Treat the object as raw memory (don’t call its destructor).
By default, a garbage collector should choose option (2) because objects created using new and never deleted are never destroyed. Thus, one can see a garbage collector as a mechanism for simulating an infinite memory.
It is possible to design a garbage collector to invoke the destructors for objects that have been specifically ‘‘registered’’ with the collector. However, there is no standard way of ‘‘registering’’ objects. Note that it is always important to destroy objects in an order that ensures that the destructor for one object doesn’t refer to an object that has been previously destroyed. Such ordering isn’t easily achieved by a garbage collector without help from the programmer.
C.9.1.4 Memory Fragmentation
When a lot of objects of varying sizes are allocated and freed, the memory fragments. That is, much of memory is consumed by pieces of memory that are too small to use effectively. The reason is that a general allocator cannot always find a piece of memory of the exact right size for an object. Using a slightly larger piece means that a smaller fragment of memory remains. After running a program for a while with a naive allocator, it is not uncommon to find half the available memory taken up with fragments too small ever to get reused.
Several techniques exist for coping with fragmentation. The simplest is to request only larger chunks of memory from the allocator and use each such chunk for objects of the same size (§15.3, §19.4.2). Because most allocations and deallocations are of small objects of types such as tree nodes, links, etc., this technique can be very effective. An allocator can sometimes apply similar techniques automatically. In either case, fragmentation is further reduced if all of the larger ‘‘chunks’’ are of the same size (say, the size of a page) so that they themselves can be allocated and reallocated without fragmentation.
There are two main styles of garbage collectors:
[1] A copying collector moves objects in memory to compact fragmented space.
[2] A conservative collector allocates objects to minimize fragmentation.
From a C++ point of view, conservative collectors are preferable because it is very hard (probably impossible in real programs) to move an object and modify all pointers to it correctly. A conservative collector also allows C++ code fragments to coexist with code written in languages such as C. Traditionally, copying collectors have been favored by people using languages (such as Lisp and Smalltalk) that deal with objects only indirectly through unique pointers or references. However, modern conservative collectors seem to be at least as efficient as copying collectors for larger programs, in which the amount of copying and the interaction between the allocator and a paging system become important. For smaller programs, the ideal of simply never invoking the collector is often achievable – especially in C++, where many objects are naturally automatic.