
The C++ Programming Language, Third Edition (part 9, PDF)


DOCUMENT INFORMATION

Basic information

Title: The C++ Programming Language, Third Edition
Author: Bjarne Stroustrup
Institution: Addison Wesley Longman, Inc.
Field: Computer Science
Type: Book
Year of publication: 1997
City: Boston
Pages: 102
File size: 349.04 KB


A.7.1 Declarators

decl-specifier-seq declarator = assignment-expression

decl-specifier-seq abstract-declarator opt

decl-specifier-seq abstract-declarator opt = assignment-expression

function-definition:

decl-specifier-seq opt declarator ctor-initializer opt function-body

decl-specifier-seq opt declarator function-try-block

A volatile specifier is a hint to a compiler that an object may change its value in ways not specified by the language so that aggressive optimizations must be avoided. For example, a real-time clock might be declared:
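The declaration itself did not survive in this extract. As a minimal sketch of the idea, assuming a hypothetical variable clock_register that stands in for a hardware register (both names below are illustrative and not from the text), a read-only but changeable clock could be expressed like this:

```cpp
// Hypothetical stand-in for a hardware clock register; on a real system the
// object would be provided by the platform, not defined in ordinary code.
volatile unsigned long clock_register = 0;

// const volatile: this code may not write through the name, yet the value
// may still change behind the compiler's back, so every read must be performed.
const volatile unsigned long& real_time_clock = clock_register;

// Reads the clock twice; the compiler may not merge the two reads into one.
unsigned long elapsed_ticks()
{
    unsigned long start = real_time_clock;
    unsigned long stop = real_time_clock;
    return stop - start;
}
```

Without volatile, a compiler would be free to read the value once and reuse it, which is exactly the kind of optimization the specifier is meant to rule out.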

A.8 Classes

class-head:

class-key identifier opt base-clause opt

class-key nested-name-specifier identifier base-clause opt

class-key nested-name-specifier template template-id base-clause opt

member-declaration member-specification opt

access-specifier : member-specification opt

declarator pure-specifier opt

declarator constant-initializer opt

identifier opt : constant-expression

Constant expressions are defined in §C.5

A.8.1 Derived Classes

See Chapter 12 and Chapter 15

base-clause:

: base-specifier-list


base-specifier

base-specifier-list , base-specifier

base-specifier:

:: opt nested-name-specifier opt class-name

virtual access-specifier opt :: opt nested-name-specifier opt class-name

access-specifier virtual opt :: opt nested-name-specifier opt class-name

access-specifier:

private

protected

public

A.8.2 Special Member Functions

See §11.4 (conversion operators), §10.4.6 (class member initialization), and §12.2.2 (base initialization).

A.8.3 Overloading

operator: one of

+ - * / % ^ & | ~ ! = < > += -= *= /= %= ^= &= |= << >> >>= <<= ==

class identifier opt

class identifier opt = type-id

typename identifier opt

typename identifier opt = type-id

template < template-parameter-list > class identifier opt

template < template-parameter-list > class identifier opt = template-name

f<a>b>(0);     // syntax error
f<(a>b)>(0);   // ok

A similar lexical ambiguity can occur when terminating >s get too close. For example:

list<vector<int>> lv1;     // syntax error: unexpected >> (right shift)
list< vector<int> > lv2;   // correct: list of vectors

Note the space between the two >s; >> is the right-shift operator. That can be a real nuisance.

A.10 Exception Handling

See §8.3 and Chapter 14


A.11 Preprocessing Directives

The preprocessor is a relatively unsophisticated macro processor that works primarily on lexical tokens rather than individual characters. In addition to the ability to define and use macros (§7.8), the preprocessor provides mechanisms for including text files and standard headers (§9.2.1) and conditional compilation based on macros (§9.3.3). For example:
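The original example is not present in this extract. A small sketch of the three mechanisms (inclusion, macro definition, and conditional compilation) could look like the following; PLATFORM_HAS_FAST_PATH, SQUARE, and selected_path are hypothetical names chosen for illustration only:

```cpp
#include <string>   // inclusion of a standard header

// PLATFORM_HAS_FAST_PATH is a hypothetical configuration macro; a build
// system would normally define it on the compiler command line.
#ifndef PLATFORM_HAS_FAST_PATH
#define PLATFORM_HAS_FAST_PATH 0
#endif

// Function-like macro; the parentheses keep operator precedence intact.
#define SQUARE(x) ((x) * (x))

// Only one of the two return statements is ever seen by the compiler proper.
std::string selected_path()
{
#if PLATFORM_HAS_FAST_PATH
    return "fast";
#else
    return "portable";
#endif
}
```

Because the macro is given a default of 0 here, the portable branch is the one compiled unless the build overrides it.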

# if constant-expression new-line group opt

# ifdef identifier new-line group opt

# ifndef identifier new-line group opt


# include pp-tokens new-line

# define identifier replacement-list new-line

# define identifier lparen identifier-list opt ) replacement-list new-line

# undef identifier new-line

# line pp-tokens new-line

# error pp-tokens opt new-line

# pragma pp-tokens opt new-line


You go ahead and follow your customs,

and I’ll follow mine.

– C Napier

C/C++ compatibility; silent differences between C and C++; C code that is not C++; deprecated features; C++ code that is not C; coping with older C++ implementations; headers; the standard library; namespaces; allocation errors; templates; for-statement initializers; advice; exercises.

B.1 Introduction

This appendix discusses the incompatibilities between C and C++ and between Standard C++ (as defined by ISO/IEC 14882) and earlier versions of C++. The purpose is to document differences that can cause problems for the programmer and point to ways of dealing with such problems. Most compatibility problems surface when people try to upgrade a C program to a C++ program, try to port a C++ program from one pre-standard version of C++ to another, or try to compile C++ using modern features with an older compiler. The aim here is not to drown you in the details of every compatibility problem that ever surfaced in an implementation, but rather to list the most frequently occurring problems and present their standard solutions.

When you look at compatibility issues, a key question to consider is the range of implementations under which a program needs to work. For learning C++, it makes sense to use the most complete and helpful implementation. For delivering a product, a more conservative strategy might be in order to maximize the number of systems on which the product can run. In the past, this has been a reason (and sometimes just an excuse) to avoid C++ features deemed novel. However, implementations are converging, so the need for portability across platforms is less cause for extreme caution than it was a couple of years ago.

B.2 C/C++ Compatibility

With minor exceptions, C++ is a superset of C (meaning C89, defined by ISO/IEC 9899:1990). Most differences stem from C++’s greater emphasis on type checking. Well-written C programs tend to be C++ programs as well. A compiler can diagnose every difference between C++ and C.

C++ provides the // comments; C does not (although many C implementations provide them as an extension). This difference can be used to construct programs that behave differently in the two languages. For example:
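The example is missing from this extract. One well-known construction of this kind, offered here as a sketch rather than as the text's own listing, places /* immediately after a division sign:

```cpp
// In C++ (and C99) the "//" starts a comment, so the function returns a.
// In C89 the text "/* pretty unlikely */" is the comment, leaving "a / b".
int f(int a, int b)
{
    return a //* pretty unlikely */ b
        ;   // the semicolon sits on its own line so both readings parse
}
```

Compiled as C++, f(8, 2) yields 8; a C89 compiler without the // extension would compute 8 / 2 instead.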

C99 (meaning C as defined by ISO/IEC 9899:1999(E)) also provides //.

A structure name declared in an inner scope can hide the name of an object, function, enumerator, or type in an outer scope. For example:
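The example is missing from this extract. A sketch of the effect, using an illustrative function name size_seen_inside that is not from the text, is:

```cpp
#include <cstddef>

int x[99];   // outer-scope object named x

std::size_t size_seen_inside()
{
    struct x { int a; };   // inner-scope structure named x hides the array in C++
    return sizeof(x);      // C++: size of the struct; C: size of the array
}
```

In C, structure tags live in their own name space, so the same sizeof expression would still refer to the array.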

B.2.2 C Code That Is Not C++

The C/C++ incompatibilities that cause most real problems are not subtle. Most are easily caught by compilers. This section gives examples of C code that is not C++. Most are deemed poor style or even obsolete in modern C.

In C, most functions can be called without a previous declaration. For example:

main()   /* poor style C. Not C++ */

[...] options to enforce it, C code conforms to the C++ rule. Where undeclared functions are called, you have to know the functions and the rules for C pretty well to know whether you have made a mistake or introduced a portability problem. For example, the previous main() contains at least two errors as a C program.

In C, a function declared without specifying any argument types can take any number of arguments of any type at all. Such use is deemed obsolescent in Standard C, but it is not uncommon:

void f();   /* argument types not mentioned */
void f(a,p,c) char *p; char c; { /* ... */ }   /* C. Not C++ */

Such definitions must be rewritten:

void f(int a, char* p, char c) { /* ... */ }

In C and in pre-standard versions of C++, the type specifier defaults to int. For example:

const a = 7;   /* In C, type int assumed. Not C++ */

C99 disallows "implicit int," just as in C++.

C allows the definition of structs in return type and argument type declarations. For example:

struct S { int x,y; } f();   /* C. Not C++ */
void g(struct S { int x,y; } y);   /* C. Not C++ */

The C++ rules for defining types make such declarations useless, and they are not allowed.

In C, integers can be assigned to variables of enumeration type:

enum Direction { up, down };
enum Direction d = 1;   /* error: int assigned to Direction; ok in C */

C++ provides many more keywords than C does. If one of these appears as an identifier in a C program, that program must be modified to make it a C++ program:

In C, some of the C++ keywords are macros defined in standard headers:

This implies that in C they can be tested using #ifdef, redefined, etc.

In C, a global data object may be declared several times in a single translation unit without using the extern specifier. As long as at most one such declaration provides an initializer, the object is considered defined only once. For example:

int i; int i;   /* defines or declares a single integer 'i'; not C++ */

In C++, an entity must be defined exactly once; §9.2.3.

In C++, a class may not have the same name as a typedef declared to refer to a different type in the same scope; §5.7.

In C, a void* may be used as the right-hand operand of an assignment to or initialization of a variable of any pointer type; in C++ it may not (§5.6). For example:
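The example is missing from this extract. A minimal sketch of the difference, with make_ints as a hypothetical helper name, is:

```cpp
#include <cstdlib>

// Allocates room for n ints with the C allocator; the caller must free() it.
int* make_ints(std::size_t n)
{
    void* raw = std::malloc(n * sizeof(int));
    // int* p = raw;                  // ok in C; error in C++: no implicit void* to int*
    int* p = static_cast<int*>(raw);  // C++ needs the conversion spelled out
    return p;
}
```

The explicit cast is the smallest change that makes such C code acceptable to a C++ compiler; replacing malloc() by new avoids the cast altogether.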

C allows transfer of control to a labeled-statement (§A.6) to bypass an initialization; C++ does not.

In C, a global const by default has external linkage; in C++ it does not and must be initialized, unless explicitly declared extern (§5.4).

In C, names of nested structures are placed in the same scope as the structure in which they are nested. For example:

B.2.3 Deprecated Features

The keyword static, which usually means "statically allocated," can be used to indicate that a function or an object is local to a translation unit. For example:

This program genuinely has two integers called glob. Each glob is used exclusively by functions defined in its translation unit.

The use of static to indicate "local to translation unit" is deprecated in C++. Use unnamed namespaces instead (§8.2.5.1).
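A minimal sketch of the recommended replacement, showing one translation unit only (next_id is a hypothetical function added for illustration):

```cpp
// A second source file could define its own glob in exactly the same way
// without any clash at link time.
namespace {
    int glob = 0;   // local to this translation unit, like "static int glob"
}

int next_id()
{
    return ++glob;   // members of the unnamed namespace are used unqualified
}
```

The unnamed namespace also works for types, which static cannot make local to a translation unit.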

The implicit conversion of a string literal to a (non-const) char* is deprecated. Use named arrays of char or avoid assignment of string literals to char*s (§5.2.2).

C-style casts should have been deprecated when the new-style casts were introduced. Programmers should seriously consider banning C-style casts from their own programs. Where explicit type conversion is necessary, static_cast, reinterpret_cast, const_cast, or a combination of these can do what a C-style cast can. The new-style casts should be preferred because they are more explicit and more visible (§6.2.7).
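As a sketch of how the three named casts divide the work of a C-style cast (the function names are hypothetical and chosen only for this illustration):

```cpp
#include <cstring>

// static_cast: an ordinary value conversion between related types.
int truncate_to_int(double d)
{
    return static_cast<int>(d);
}

// const_cast: removes const only; safe here because nothing is written.
std::size_t length_of(const char* s)
{
    char* writable_view = const_cast<char*>(s);
    return std::strlen(writable_view);
}

// reinterpret_cast: views the object's storage as bytes; which byte comes
// first depends on the machine's byte order.
unsigned char first_byte(const int& value)
{
    return *reinterpret_cast<const unsigned char*>(&value);
}
```

Each cast names its intent, so a search for reinterpret_cast or const_cast finds the risky conversions, which a search for parentheses cannot do.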

B.2.4 C++ Code That Is Not C

This section lists facilities offered by C++ but not by C. The features are sorted by purpose. However, many classifications are possible and most features serve multiple purposes, so this classification should not be taken too seriously.

– Features primarily for notational convenience:

[1] // comments (§2.3); added to C99

[2] Support for restricted character sets (§C.3.1); partially added to C99

[3] Support for extended character sets (§C.3.3); added to C99

[4] Non-constant initializers for objects in static storage (§9.4.1)

[5] const in constant expressions (§5.4, §C.5)

[6] Declarations as statements (§6.3.1); added to C99

[7] Declarations in for-statement initializers (§6.3.3); added to C99

[8] Declarations in conditions (§6.3.2.1)

[9] Structure names need not be prefixed by struct (§5.7)

– Features primarily for strengthening the type system:

[1] Function argument type checking (§7.1); later added to C (§B.2.2)

[2] Type-safe linkage (§9.2, §9.2.3)

[3] Free store management using new and delete (§6.2.6, §10.4.5, §15.6)

[4] const (§5.4, §5.4.1); later added to C

[5] The Boolean type bool (§4.2); partially added to C99

[6] New cast syntax (§6.2.7)

– Facilities for user-defined types:

[1] Classes (Chapter 10)

[2] Member functions (§10.2.1) and member classes (§11.12)

[3] Constructors and destructors (§10.2.3, §10.4.1)

[4] Derived classes (Chapter 12, Chapter 15)

[5] virtual functions and abstract classes (§12.2.6, §12.3)

[6] Public/protected/private access control (§10.2.2, §15.3, §C.11)

[6] Explicit scope qualification (operator ::; §4.9.4)

[7] Exception handling (§8.3, Chapter 14)

[8] Run-time Type Identification (§15.4)

The keywords added by C++ (§B.2.2) can be used to spot most C++-specific facilities. However, some facilities, such as function overloading and consts in constant expressions, are not identified by a keyword. In addition to the features listed, the C++ library (§16.1.2) is mostly C++ specific. The __cplusplus macro can be used to determine whether a program is being processed by a C or a C++ compiler (§9.2.4).
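A common use of __cplusplus is a header shared between C and C++ sources. The sketch below assumes a hypothetical function legacy_add; only the macro and the extern "C" linkage specification come from the language itself:

```cpp
// A header fragment meant to be included from both C and C++ sources.
#ifdef __cplusplus
extern "C" {        // seen only by a C++ compiler: request C linkage
#endif

int legacy_add(int a, int b);   // hypothetical function shared by both languages

#ifdef __cplusplus
}
#endif

// Definition, compiled here as C++ but callable from C because of the
// linkage given by the declaration above.
int legacy_add(int a, int b) { return a + b; }
```

A C compiler never sees the extern "C" lines, so the same header serves both languages.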

B.3 Coping with Older C++ Implementations

C++ has been in constant use since 1983 (§1.4). Since then, several versions have been defined and many separately developed implementations have emerged. The fundamental aim of the standards effort was to ensure that implementers and users would have a single definition of C++ to work from. Until that definition becomes pervasive in the C++ community, however, we have to deal with the fact that not every implementation provides every feature described in this book.

It is unfortunately not uncommon for people to take their first serious look at C++ using a five-year-old implementation. The typical reason is that such implementations are widely available and free. Given a choice, no self-respecting professional would touch such an antique. For a novice, older implementations come with serious hidden costs. The lack of language features and library support means that the novice must struggle with problems that have been eliminated in newer implementations. Using a feature-poor older implementation also warps the novice’s programming style and gives a biased view of what C++ is. The best subset of C++ to initially learn is not the set of low-level facilities (and not the common C and C++ subset; §1.2). In particular, I recommend relying on the standard library and on templates to ease learning and to get a good initial impression of what C++ programming can be.

The first commercial release of C++ was in late 1985. The language was defined by the first edition of this book. At that point, C++ did not offer multiple inheritance, templates, run-time type information, exceptions, or namespaces. Today, I see no reason to use an implementation that doesn’t provide at least some of these features. I added multiple inheritance, templates, and exceptions to the definition of C++ in 1989. However, early support for templates and exceptions was uneven and often poor. If you find problems with templates or exceptions in an older implementation, consider an immediate upgrade.

In general, it is wise to use an implementation that conforms to the standard wherever possible and to minimize the reliance on implementation-defined and undefined aspects of the language. Design as if the full language were available and then use whatever workarounds are needed. This leads to better organized and more maintainable programs than designing for the lowest-common-denominator subset of C++. Also, be careful to use implementation-specific language extensions only when absolutely necessary.

B.3.1 Headers

Traditionally, every header file had a .h suffix. Thus, C++ implementations provided headers such as <map.h> and <iostream.h>. For compatibility, most still do.

When the standards committee needed headers for redefined versions of standard libraries and for newly added library facilities, naming those headers became a problem. Using the old .h names would have caused compatibility problems. The solution was to drop the .h suffix in standard header names. The suffix is redundant anyway because the < > notation indicates that a standard header is being named.

Thus, the standard library provides non-suffixed headers, such as <iostream> and <map>. The declarations in those files are placed in namespace std. Older headers place their declarations in the global namespace and use a .h suffix. Consider:

There are no fully-satisfactory approaches to dealing with portability in the face of inconsistent headers. A general idea is to avoid direct dependencies on inconsistent headers and localize the remaining dependencies. That is, we try to achieve portability through indirection and localization. For example, if declarations that we need are provided in different headers in different systems, we may choose to #include an application specific header that in turn #includes the appropriate header(s) for each system. Similarly, if some functionality is provided in slightly different forms on different systems, we may choose to access that functionality through application-specific interface classes and functions.

B.3.2 The Standard Library

Naturally, pre-standard-C++ implementations may lack parts of the standard library. Most will have iostreams, non-templated complex, a different string class, and the C standard library. However, some may lack map, list, valarray, etc. In such cases, use the – typically proprietary – libraries available in a way that will allow conversion when your implementation gets upgraded to the standard. It is usually better to use a non-standard string, list, and map than to revert to C-style programming in the absence of these standard library classes. Also, good implementations of the STL part of the standard library (Chapter 16, Chapter 17, Chapter 18, Chapter 19) are available free for downloading.

Early implementations of the standard library were incomplete. For example, some had containers that didn’t support allocators and others required allocators to be explicitly specified for each class. Similar problems occurred for other "policy arguments," such as comparison criteria. For example:

list<int> li;   // ok, but some implementations require an allocator
list<int,allocator<int> > li2;   // ok, but some implementations don't implement allocators
map<string,Record> m1;   // ok, but some implementations require a less-operation
map<string,Record,less<string> > m2;

Use whichever version an implementation accepts. Eventually, the implementations will accept all.

Early C++ implementations provided istrstream and ostrstream defined in <strstream.h> instead of istringstream and ostringstream defined in <sstream>. The strstreams operated directly on a char[] (see §21.10[26]).
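For comparison, a minimal sketch of the standard string streams; describe_area is a hypothetical function used only to show both directions:

```cpp
#include <sstream>
#include <string>

// Parses "width height" from text and formats the area. The streams own
// their buffers, so no char[] has to be sized or terminated by hand.
std::string describe_area(const std::string& text)
{
    std::istringstream in(text);
    int width = 0;
    int height = 0;
    in >> width >> height;

    std::ostringstream out;
    out << "area=" << width * height;
    return out.str();
}
```

Code written against strstream usually has to manage the character array itself, which is the main thing the conversion to <sstream> removes.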

The streams in pre-standard-C++ implementations were not parameterized. In particular, the templates with the basic_ prefix are new in the standard, and the basic_ios class used to be called [...]

In the absence of namespaces, use static to compensate for the lack of unnamed namespaces. Also use an identifying prefix to global names to distinguish your names from those of other parts of the code. For example:

// for use on pre-namespace implementations:
class bs_string { /* ... */ };   // Bjarne's string
typedef int bs_bool;   // Bjarne's Boolean type

[...] bad_alloc rather than test for 0. In either case, coping with memory exhaustion beyond giving an error message is hard on many systems.

However, when converting from testing 0 to catching bad_alloc is impractical, you can sometimes modify the program to revert to the pre-exception-handling behavior. If no _new_handler is installed, using the nothrow allocator will cause a 0 to be returned in case of allocation failure:

X* p1 = new X;   // throws bad_alloc if no memory
X* p2 = new(nothrow) X;   // returns 0 if no memory
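The two styles can be put side by side in a self-contained sketch. X is given a single member here, and the function names make_or_null and make_or_report are hypothetical:

```cpp
#include <new>

struct X { int value; };

// Pre-exception style: failure is reported by a null pointer.
X* make_or_null()
{
    X* p = new(std::nothrow) X;   // returns a null pointer if allocation fails
    if (p == 0) return 0;
    p->value = 1;
    return p;
}

// Standard style: failure is reported by throwing std::bad_alloc.
X* make_or_report(bool& failed)
{
    failed = false;
    try {
        X* p = new X;   // throws std::bad_alloc if no memory
        p->value = 2;
        return p;
    }
    catch (const std::bad_alloc&) {
        failed = true;
        return 0;
    }
}
```

Older code that tests the result of a plain new against 0 silently loses its error handling under the standard rule, because the test can never succeed.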

B.3.5 Templates

The standard introduced new template features and clarified the rules for several existing ones

If your implementation doesn’t support partial specialization, use a separate name for the template that would otherwise have been a specialization. For example:

template<class T> class plist : private list<void*> {   // should have been list<T*>
// ...
};

If your implementation doesn’t support member templates, some techniques become infeasible. In particular, member templates allow the programmer to specify construction and conversion with a flexibility that cannot be matched without them (§13.6.2). Sometimes, providing a nonmember function that constructs an object is an alternative. Consider:
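The listing that followed is not part of this extract. The sketch below is not the text's example; it uses a hypothetical class Wrapper to contrast a member template constructor with a nonmember function template that does the same conversion:

```cpp
// Wrapper is a hypothetical class used only for illustration.
template<class T> class Wrapper {
public:
    explicit Wrapper(const T& v) : value(v) {}

    // Member template: construct from any Wrapper<A> whose value converts to T.
    template<class A> Wrapper(const Wrapper<A>& other) : value(other.get()) {}

    const T& get() const { return value; }
private:
    T value;
};

// Nonmember alternative for implementations without member templates:
// the conversion is written as an ordinary function template.
template<class T, class A> Wrapper<T> convert_wrapper(const Wrapper<A>& other)
{
    return Wrapper<T>(static_cast<T>(other.get()));
}
```

With the nonmember form the target type has to be named at the call, as in convert_wrapper<double>(wi), whereas the member template lets the conversion happen as part of ordinary construction.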

[...] functions (§C.13.9.1). The solution is to place the definition of the member functions after the class declaration. For example, rather than

Container<Glob> cg;   // no problem as long as cg.sort() isn't called

Early implementations of C++ did not handle the use of members defined later in a class. For example:

map<string,int> m;   // Oops: default template arguments not implemented
map< string,int,less<string> > m2;   // workaround: be explicit

B.3.6 For-Statement Initializers

Such code used to work because in the original definition of C++, the scope of the controlled variable extended to the end of the scope in which the for-statement appears. If you find such code, simply declare the controlled variable before the for-statement:
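The rewritten form is missing from this extract. A sketch of the fix, using a hypothetical search function first_negative, is:

```cpp
// Returns the index of the first negative element, or n if there is none.
int first_negative(const int* v, int n)
{
    int i;                      // declared before the loop so it outlives it
    for (i = 0; i < n; ++i)
        if (v[i] < 0) break;
    return i;                   // valid under both the old and the standard scope rule
}
```

Declaring i inside the for-statement initializer would make the final return an error under the standard rule, because i would go out of scope at the end of the loop.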

[4] Avoid deprecated features such as global statics; also avoid C-style casts; §6.2.7, §B.2.3.

[5] "implicit int" has been banned, so explicitly specify the type of every function, variable, const, etc.; §B.2.2.

[6] When converting a C program to C++, first make sure that function declarations (prototypes) and standard headers are used consistently; §B.2.2.

[7] When converting a C program to C++, rename variables that are C++ keywords; §B.2.2.

[8] When converting a C program to C++, cast the result of malloc() to the proper type or change all uses of malloc() to uses of new; §B.2.2.

[9] When converting from malloc() and free() to new and delete, consider using vector, [...]

[11] A facility defined in namespace std is defined in a header without a suffix (e.g. std::cout is declared in <iostream>). Older implementations have standard library facilities in the global namespace and declared in headers with a .h suffix (e.g. ::cout declared in <iostream.h>); §9.2.2, §B.3.1.

[12] If older code tests the result of new against 0, it must be modified to catch bad_alloc or to use new(nothrow); §B.3.4.

[13] If your implementation doesn’t support default template arguments, provide arguments explicitly; typedefs can often be used to avoid repetition of template arguments (similar to the way the typedef string saves you from saying basic_string< char, char_traits<char>, allocator<char> >); §B.3.5.

[14] Use <string> to get std::string (<string.h> holds the C-style string functions); §9.2.2, §B.3.1.

[15] For each standard C header <X.h> that places names in the global namespace, the header <cX> places the names in namespace std; §B.3.1.

[16] Many systems have a "String.h" header defining a string type. Note that such strings differ from the standard library string.

[17] Prefer standard facilities to non-standard ones; §20.1, §B.3, §C.2.

[18] Use extern "C" when declaring C functions; §9.2.4.

B.5 Exercises

1. (∗2.5) Take a C program and convert it to a C++ program; list the kinds of non-C++ constructs used and determine if they are valid ANSI C constructs. First convert the program to strict ANSI C (adding prototypes, etc.), then to C++. Estimate the time it would take to convert a 100,000 line C program to C++.

2. (∗2.5) Write a program to help convert C programs to C++ by renaming variables that are C++ keywords, replacing calls of malloc() by uses of new, etc. Hint: don’t try to do a perfect job.

3. (∗2) Replace all uses of malloc() in a C-style C++ program (maybe a recently converted C program) to uses of new. Hint: §B.4[8-9].

4. (∗2.5) Minimize the use of macros, global variables, uninitialized variables, and casts in a C-style C++ program (maybe a recently converted C program).

5. (∗3) Take a C++ program that is the result of a crude conversion from C and critique it as a C++ program considering locality of information, abstraction, readability, extensibility, and potential for reuse of parts. Make one significant change to the program based on that critique.

6. (∗2) Take a small (say, 500 line) C++ program and convert it to C. Compare the original with the result for size and probable maintainability.

7. (∗3) Write a small set of test programs to determine whether a C++ implementation has "the latest" standard features. For example, what is the scope of a variable defined in a for-statement initializer? (§B.3.6), are default template arguments supported? (§B.3.5), are member templates supported? (§13.6.2), and is argument-based lookup supported? (§8.2.6) Hint: §B.2.4.

8. (∗2.5) Take a C++ program that uses <X.h> headers and convert it to using <X> and <cX> headers. Minimize the use of using-directives.

What the standard promises; character sets; integer literals; constant expressions; promotions and conversions; multidimensional arrays; fields and unions; memory management; garbage collection; namespaces; access control; pointers to data members; templates; static members; friends; templates as template parameters; template argument deduction; typename and template qualification; instantiation; name binding; templates and namespaces; explicit instantiation; advice.

C.1 Introduction and Overview

This chapter presents technical details and examples that do not fit neatly into my presentation of the main C++ language features and their uses. The details presented here can be important when you are writing a program and essential when reading code written using them. However, I consider them technical details that should not be allowed to distract from the student’s primary task of learning to use C++ well or the programmer’s primary task of expressing ideas as clearly and as directly as possible in C++.

C.2 The Standard

Contrary to common belief, strictly adhering to the C++ language and library standard doesn’t guarantee good code or even portable code. The standard doesn’t say whether a piece of code is good or bad; it simply says what a programmer can and cannot rely on from an implementation. One can write perfectly awful standard-conforming programs, and most real-world programs rely on features not covered by the standard.

Many important things are deemed implementation-defined by the standard. This means that each implementation must provide a specific, well-defined behavior for a construct and that behavior must be documented. For example:

unsigned char c1 = 64;     // well-defined: a char has at least 8 bits and can always hold 64
unsigned char c2 = 1256;   // implementation-defined: truncation if a char has only 8 bits

The initialization of c1 is well-defined because a char must be at least 8 bits. However, the behavior of the initialization of c2 is implementation-defined because the number of bits in a char is implementation-defined. If the char has only 8 bits, the value 1256 will be truncated to 232 (§C.6.2.1). Most implementation-defined features relate to differences in the hardware used to run a program.

When writing real-world programs, it is usually necessary to rely on implementation-defined behavior. Such behavior is the price we pay for the ability to operate effectively on a large range of systems. For example, the language would have been much simpler if all characters had been 8 bits and all integers 32 bits. However, 16-bit and 32-bit character sets are not uncommon – nor are integers too large to fit in 32 bits. For example, many computers now have disks that hold more than 32G bytes, so 48-bit or 64-bit integers can be useful for representing disk addresses.

To maximize portability, it is wise to be explicit about what implementation-defined features we rely on and to isolate the more subtle examples in clearly marked sections of a program. A typical example of this practice is to present all dependencies on hardware sizes in the form of constants and type definitions in some header file. To support such techniques, the standard library provides numeric_limits (§22.2).
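A sketch of such a header's contents follows. The names bits_per_char, char_is_signed, disk_address, and fits_in_unsigned_char are hypothetical, and the use of unsigned long long assumes a compiler newer than the 1998 standard described in the text:

```cpp
#include <limits>

// Hardware-size dependencies gathered in one place.
const int bits_per_char = std::numeric_limits<unsigned char>::digits;   // at least 8
const bool char_is_signed = std::numeric_limits<char>::is_signed;

// Assumed requirement: disk addresses need more than 32 bits.
typedef unsigned long long disk_address;   // note: long long postdates the 1998 standard

// True if value can be stored in an unsigned char without truncation.
bool fits_in_unsigned_char(long value)
{
    return 0 <= value && value <= std::numeric_limits<unsigned char>::max();
}
```

With checks like fits_in_unsigned_char, an initialization such as the one for c2 can be rejected explicitly instead of being truncated silently.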

Undefined behavior is nastier. A construct is deemed undefined by the standard if no reasonable behavior is required by an implementation. Typically, some obvious implementation technique will cause a program using an undefined feature to behave very badly. For example:

[...] It is worth spending considerable time and effort to ensure that a program does not use something deemed undefined by the standard. In many cases, tools exist to help do this.

C.3 Character Sets

The examples in this book are written using the U.S. variant of the international 7-bit character set ISO 646-1983 called ASCII (ANSI3.4-1968). This can cause three problems for people who use C++ in an environment with a different character set:

[1] ASCII contains punctuation characters and operator symbols – such as ], {, and ! – that are not available in some character sets.

[2] We need a notation for characters that do not have a convenient character representation (e.g., newline and "the character with value 17").

[3] ASCII doesn’t contain characters – such as ζ, æ, and Π – that are used for writing languages other than English.

C.3.1 Restricted Character Sets

The ASCII special characters [, ], {, }, |, and \ occupy character set positions designated as alphabetic by ISO. In most European national ISO-646 character sets, these positions are occupied by letters not found in the English alphabet. For example, the Danish national character set uses them for the vowels Æ, æ, Ø, ø, Å, and å. No significant amount of text can be written in Danish without them.

A set of trigraphs is provided to allow national characters to be expressed in a portable way using a truly standard minimal character set. This can be useful for interchange of programs, but it doesn’t make it easier for people to read programs. Naturally, the long-term solution to this problem is for C++ programmers to get equipment that supports both their native language and C++ well. Unfortunately, this appears to be infeasible for some, and the introduction of new equipment can be a frustratingly slow process. To help programmers stuck with incomplete character sets, C++ provides alternatives:

'??<'

Some people prefer the keywords such as and to their traditional operator notation.
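The table of alternatives is not part of this extract. As a sketch of what they look like in use, the same test is written below twice, once with operator symbols and once with the keywords and, not and the digraphs <% %> for braces; the function names are hypothetical:

```cpp
// Traditional notation.
bool in_range_symbols(int x, int lo, int hi)
{
    return !(x < lo) && !(x > hi);
}

// Alternative notation: no braces, '!' or '&' characters are needed.
bool in_range_keywords(int x, int lo, int hi)
<%
    return not (x < lo) and not (x > hi);
%>
```

Both functions compile to the same thing; the alternative spellings are purely lexical. Some compilers accept the keyword forms only in their standard-conforming mode.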

C.3.2 Escape Characters

Despite their appearance, these are single characters.

It is possible to represent a character as a one-, two-, or three-digit octal number (\ followed by octal digits) or as a hexadecimal number (\x followed by hexadecimal digits). There is no limit to the number of hexadecimal digits in the sequence. A sequence of octal or hexadecimal digits is terminated by the first character that is not an octal digit or a hexadecimal digit, respectively. For example:

This makes it possible to represent every character in the machine's character set and, in particular, to embed such characters in character strings (see §5.2.2). Using any numeric notation for characters makes a program nonportable across machines with different character sets.

It is possible to enclose more than one character in a character literal, for example 'ab'. Such uses are archaic, implementation-dependent, and best avoided.

When embedding a numeric constant in a string using the octal notation, it is wise always to use three digits for the number. The notation is hard enough to read without having to worry about whether or not the character after a constant is a digit. For hexadecimal constants, use two digits. Consider these examples:

char v1[] = "a\xah\129";   // 6 chars: 'a' '\xa' 'h' '\12' '9' '\0'
char v2[] = "a\xah\127";   // 5 chars: 'a' '\xa' 'h' '\127' '\0'
char v3[] = "a\xad\127";   // 4 chars: 'a' '\xad' '\127' '\0'
char v4[] = "a\xad\0127";  // 5 chars: 'a' '\xad' '\012' '7' '\0'
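The character counts in those comments can be checked by the compiler. This sketch assumes C++11 and declares the arrays const; sizeof includes the terminating '\0'.

```cpp
const char v1[] = "a\xah\129";
const char v2[] = "a\xah\127";
const char v3[] = "a\xad\127";
const char v4[] = "a\xad\0127";

// The escape sequence ends at the first character that cannot extend it.
static_assert(sizeof(v1) == 6, "'a' '\\xa' 'h' '\\12' '9' '\\0'");
static_assert(sizeof(v2) == 5, "'a' '\\xa' 'h' '\\127' '\\0'");
static_assert(sizeof(v3) == 4, "'a' '\\xad' '\\127' '\\0'");
static_assert(sizeof(v4) == 5, "'a' '\\xad' '\\012' '7' '\\0'");
```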


C.3.3 Large Character Sets

A C++ program may be written and presented to the user in character sets that are much richer than the 127 character ASCII set. Where an implementation supports larger character sets, identifiers, comments, character constants, and strings may contain characters such as å, β, and Γ. However, to be portable the implementation must map these characters into an encoding using only characters available to every C++ user. In principle, this translation into the C++ basic source character set (the set used in this book) occurs before the compiler does any other processing. Therefore, it does not affect the semantics of the program.

The standard encoding of characters from large character sets into the smaller set supported directly by C++ is presented as sequences of four or eight hexadecimal digits:

A programmer can use these character encodings directly. However, they are primarily meant as a way for an implementation that internally uses a small character set to handle characters from a large character set seen by the programmer.
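As an illustration of using such an encoding directly, here is a sketch assuming C++11 (for char32_t and constexpr). The four-digit form is written \u and the eight-digit form \U; the names are hypothetical.

```cpp
// U+00E5 is the letter a with a ring above (å), written with four and with
// eight hexadecimal digits.
constexpr char32_t a_ring_short = U'\u00E5';
constexpr char32_t a_ring_long  = U'\U000000E5';

static_assert(a_ring_short == a_ring_long, "both spellings name one character");

// Hypothetical helper: the numeric code of the character.
unsigned long code_of_a_ring() { return a_ring_short; }
```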

If you rely on special environments to provide an extended character set for use in identifiers, the program becomes less portable. A program is hard to read unless you understand the natural language used for identifiers and comments. Consequently, for programs used internationally it is usually best to stick to English and ASCII.

C.3.4 Signed and Unsigned Characters

It is implementation-defined whether a plain char is considered signed or unsigned. This opens the possibility for some nasty surprises and implementation dependencies. For example:

char c = 255;   // 255 is "all ones," hexadecimal 0xFF
int i = c;

What will be the value of i? Unfortunately, the answer is undefined. On all implementations I know of, the answer depends on the meaning of the "all ones" char bit pattern when extended into an int. On a SGI Challenge machine, a char is unsigned, so the answer is 255. On a Sun SPARC or an IBM PC, where a char is signed, the answer is -1. In this case, the compiler might warn about the conversion of the literal 255 to the char value -1. However, C++ does not offer a general mechanism for detecting this kind of problem. One solution is to avoid plain char and use the specific char types only. Unfortunately, some standard library functions, such as strcmp(), take plain chars only (§20.4.1).
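A program can at least find out which choice its implementation made. The sketch below assumes C++11 and uses std::numeric_limits; the helper names are hypothetical. Converting through unsigned char gives the same result on both kinds of machine.

```cpp
#include <limits>

// True where plain char is a signed type on this implementation.
constexpr bool plain_char_is_signed = std::numeric_limits<char>::is_signed;

// The implementation-dependent widening discussed above.
int as_plain_value(char c) { return c; }

// A portable alternative: always yields a value in 0..UCHAR_MAX.
int as_unsigned_value(char c) { return static_cast<unsigned char>(c); }
```

On a typical machine with 8-bit chars, as_plain_value gives -1 or 255 for the "all ones" pattern depending on plain_char_is_signed, while as_unsigned_value gives 255 either way.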

A char must behave identically to either a signed char or an unsigned char. However, the three char types are distinct, so you can't mix pointers to different char types. For example:
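That the three types are distinct can be seen from overloading. This is a minimal sketch, assuming C++11 for std::is_same; the function name which is hypothetical.

```cpp
#include <type_traits>

static_assert(!std::is_same<char, signed char>::value,          "distinct types");
static_assert(!std::is_same<char, unsigned char>::value,        "distinct types");
static_assert(!std::is_same<signed char, unsigned char>::value, "distinct types");

// Three separate overloads; a pointer to one char type does not convert
// implicitly to a pointer to another.
int which(char*)          { return 0; }
int which(signed char*)   { return 1; }
int which(unsigned char*) { return 2; }
```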

None of these potential problems occurs if you use plain char throughout.

C.4 Types of Integer Literals

In general, the type of an integer literal depends on its form, value, and suffix:

– If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int, unsigned long int.

– If it is octal or hexadecimal and has no suffix, it has the first of these types in which its value can be represented: int, unsigned int, long int, unsigned long int.

– If it is suffixed by u or U, its type is the first of these types in which its value can be represented: unsigned int, unsigned long int.

– If it is suffixed by l or L, its type is the first of these types in which its value can be represented: long int, unsigned long int.

– If it is suffixed by ul, lu, uL, Lu, Ul, lU, UL, or LU, its type is unsigned long int.

For example, 100000 is of type int on a machine with 32-bit ints but of type long int on a machine with 16-bit ints and 32-bit longs. Similarly, 0XA000 is of type int on a machine with 32-bit ints but of type unsigned int on a machine with 16-bit ints. These implementation dependencies can be avoided by using suffixes: 100000L is of type long int on all machines and 0XA000U is of type unsigned int on all machines.
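The effect of the suffixes can be checked mechanically. This sketch assumes C++11 (decltype, std::is_same); only the suffixed cases are asserted, since the type of an unsuffixed literal depends on the implementation.

```cpp
#include <type_traits>

static_assert(std::is_same<decltype(1), int>::value,                  "small decimal literal");
static_assert(std::is_same<decltype(100000L), long>::value,           "L forces long");
static_assert(std::is_same<decltype(0XA000U), unsigned int>::value,   "U forces unsigned");
static_assert(std::is_same<decltype(1ul), unsigned long>::value,      "ul gives unsigned long");
static_assert(std::is_same<decltype(1LU), unsigned long>::value,      "LU gives unsigned long");

// Implementation-dependent: true where an unsuffixed 100000 fits in an int.
constexpr bool hundred_thousand_is_int =
    std::is_same<decltype(100000), int>::value;
```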


C.5 Constant Expressions

In places such as array bounds (§5.2), case labels (§6.3.2), and initializers for enumerators (§4.8), C++ requires a constant expression. A constant expression evaluates to an integral or enumeration constant. Such an expression is composed of literals (§4.3.1, §4.4.1, §4.5.1), enumerators (§4.8), and consts initialized by constant expressions. In a template, an integer template parameter can also be used (§C.13.3). Floating literals (§4.5.1) can be used only if explicitly converted to an integral type. Functions, class objects, pointers, and references can be used as operands to the sizeof operator (§6.2) only.

Intuitively, constant expressions are simple expressions that can be evaluated by the compiler before the program is linked (§9.1) and starts to run.
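The places listed above can be illustrated together. The following is a sketch only; every name in it (rows, cols, half, grid, Buffer, classify) is hypothetical.

```cpp
const int rows = 3;                      // const initialized by a constant expression
enum { cols = rows + 2 };                // enumerator initializer
const int half = static_cast<int>(2.5);  // floating literal converted to an integral type

int grid[rows][cols];                    // array bounds
int pair_of[half];                       // bound of 2

// An integer template parameter used as an array bound.
template<int N> struct Buffer { char data[N]; };

// Case labels must be constant expressions.
int classify(int x)
{
    switch (x) {
    case rows: return 1;
    case cols: return 2;
    default:   return 0;
    }
}
```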

C.6 Implicit Type Conversion

Integral and floating-point types (§4.1.1) can be mixed freely in assignments and expressions. Wherever possible, values are converted so as not to lose information. Unfortunately, value-destroying conversions are also performed implicitly. This section provides a description of conversion rules, conversion problems, and their resolution.

C.6.1 Promotions

The implicit conversions that preserve values are commonly referred to as promotions. Before an arithmetic operation is performed, integral promotion is used to create ints out of shorter integer types. Note that these promotions will not promote to long (unless the operand is a wchar_t or an enumeration that is already larger than an int). This reflects the original purpose of these promotions in C: to bring operands to the "natural" size for arithmetic operations.

The integral promotions are:

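– A bit-field is converted to an int if int can represent all the values of the bit-field; otherwise, it is converted to unsigned int if unsigned int can represent all the values of the bit-field. Otherwise, no integral promotion applies to it.

– A bool is converted to an int; false becomes 0 and true becomes 1.

Promotions are used as part of the usual arithmetic conversions (§C.6.3).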
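The result type of an arithmetic expression shows promotion at work. A minimal sketch, assuming C++11; count_true is a hypothetical helper.

```cpp
#include <type_traits>

// Operands shorter than int are promoted before the operation is performed.
static_assert(std::is_same<decltype('a' + 'b'), int>::value,           "char + char is int");
static_assert(std::is_same<decltype(short(1) + short(2)), int>::value, "short + short is int");
static_assert(std::is_same<decltype(+true), int>::value,               "bool promotes to int");

// Each bool becomes 0 or 1, so adding them counts the true ones.
int count_true(bool a, bool b, bool c) { return a + b + c; }
```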

C.6.2 Conversions

The fundamental types can be converted into each other in a bewildering number of ways. In my opinion, too many conversions are allowed. For example:

C.6.2.1 Integral Conversions

An integer can be converted to another integer type. An enumeration value can be converted to an integer type.

If the destination type is unsigned, the resulting value is simply as many bits from the source as will fit in the destination (high-order bits are thrown away if necessary). More precisely, the result is the least unsigned integer congruent to the source integer modulo 2 to the nth, where n is the number of bits used to represent the unsigned type. For example:

unsigned char uc = 1023;   // binary 1111111111: uc becomes binary 11111111; that is, 255

If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined:

signed char sc = 1023;   // implementation-defined

Plausible results are 255 and -1 (§C.3.4).

A Boolean or enumeration value can be implicitly converted to its integer equivalent (§4.2, §4.8).
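The modulo rule for unsigned destinations can be written out explicitly. This is a sketch with hypothetical helper names; it mirrors the conversion for an n-bit unsigned type by masking off the high-order bits.

```cpp
#include <climits>

// The value an n-bit unsigned destination receives: value modulo 2 to the nth.
unsigned long reduce_mod_2n(unsigned long value, unsigned bits)
{
    if (bits >= sizeof(unsigned long) * CHAR_BIT) return value;  // nothing to drop
    return value & ((1UL << bits) - 1);                          // keep the low bits
}

// The built-in conversion, for comparison.
unsigned char to_uchar(int i) { return static_cast<unsigned char>(i); }
```

For example, to_uchar(1023) and reduce_mod_2n(1023, CHAR_BIT) agree; with 8-bit chars both give 255.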

C.6.2.2 Floating-Point Conversions

A floating-point value can be converted to another floating-point type. If the source value can be exactly represented in the destination type, the result is the original numeric value. If the source value is between two adjacent destination values, the result is one of those values. Otherwise, the behavior is undefined. For example:

float f = FLT_MAX;     // largest float value
double d = f;          // ok: d == f
float f2 = d;          // ok: f2 == f
double d3 = DBL_MAX;   // largest double value
float f3 = d3;         // undefined if FLT_MAX<DBL_MAX

C.6.2.3 Pointer and Reference Conversions

Any pointer to an object type can be implicitly converted to a void* (§5.6). A pointer (reference) to a derived class can be implicitly converted to a pointer (reference) to an accessible and unambiguous base (§12.2). Note that a pointer to function or a pointer to member cannot be implicitly converted to a void*.


A constant expression (§C.5) that evaluates to 0 can be implicitly converted to any pointer or pointer to member type (§5.1.1). For example:

When a floating-point value is converted to an integer type, the fractional part is discarded; for example, the value of int(1.6) is 1. The behavior is undefined if the truncated value cannot be represented in the destination type. For example:

int i = 2.7;       // i becomes 2
char b = 2000.7;   // undefined for 8-bit chars: 2000 cannot be represented as an 8-bit char

Conversions from integer to floating types are as mathematically correct as the hardware allows. Loss of precision occurs if an integral value cannot be represented exactly as a value of the floating type. For example,

Compilers can detect and warn against some obviously dangerous conversions, such as long int to char. However, general compile-time detection is impractical, so the programmer must be careful. When "being careful" isn't enough, the programmer can insert explicit checks. For example:


To truncate in a way that is guaranteed to be portable requires the use of numeric_limits (§22.2).
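One way to write such checks is sketched below, assuming only std::numeric_limits and std::out_of_range from the standard library. The function names are hypothetical, and this is one possible policy (reject or saturate), not the only one.

```cpp
#include <limits>
#include <stdexcept>

// True if i can be represented exactly as a plain char.
bool fits_in_char(int i)
{
    return std::numeric_limits<char>::min() <= i
        && i <= std::numeric_limits<char>::max();
}

// Explicit check: refuse a value-destroying conversion.
char checked_narrow(int i)
{
    if (!fits_in_char(i)) throw std::out_of_range("int-to-char check failed");
    return static_cast<char>(i);
}

// Portable truncation: clamp to the nearest representable char value.
char saturating_narrow(int i)
{
    if (i < std::numeric_limits<char>::min()) return std::numeric_limits<char>::min();
    if (i > std::numeric_limits<char>::max()) return std::numeric_limits<char>::max();
    return static_cast<char>(i);
}
```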

C.6.3 Usual Arithmetic Conversions

These conversions are performed on the operands of a binary operator to bring them to a common type, which is then used as the type of the result:

[1] If either operand is of type long double, the other is converted to long double.

– Otherwise, if either operand is double, the other is converted to double.

– Otherwise, if either operand is float, the other is converted to float.

– Otherwise, integral promotions (§C.6.1) are performed on both operands.

[2] Then, if either operand is unsigned long, the other is converted to unsigned long.

– Otherwise, if one operand is a long int and the other is an unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int is converted to a long int; otherwise, both operands are converted to unsigned long int.

– Otherwise, if either operand is long, the other is converted to long.

– Otherwise, if either operand is unsigned, the other is converted to unsigned.

– Otherwise, both operands are int.
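Several of these rules can be confirmed by asking the compiler for the result type. The sketch assumes C++11; mixed_sum is a hypothetical helper showing the int-to-unsigned case.

```cpp
#include <climits>
#include <type_traits>

static_assert(std::is_same<decltype(1L + 2.0L), long double>::value,  "long double wins");
static_assert(std::is_same<decltype(1 + 2.0), double>::value,         "then double");
static_assert(std::is_same<decltype(1.0f + 2), float>::value,         "then float");
static_assert(std::is_same<decltype(1UL + 2L), unsigned long>::value, "then unsigned long");
static_assert(std::is_same<decltype(1L + 2), long>::value,            "then long");
static_assert(std::is_same<decltype(1U + 2), unsigned int>::value,    "then unsigned");
static_assert(std::is_same<decltype(1 + 2), int>::value,              "otherwise int");

// The int operand is converted to unsigned before the addition.
unsigned mixed_sum(int i, unsigned u) { return i + u; }
```

A consequence worth remembering: mixed_sum(-1, 0u) yields the largest unsigned value, not -1.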

C.7 Multidimensional Arrays

It is not uncommon to need a vector of vectors, a vector of vector of vectors, etc. The issue is how to represent these multidimensional vectors in C++. Here, I first show how to use the standard library vector class. Next, I present multidimensional arrays as they appear in C and C++ programs using only built-in facilities.

C.7.1 Vectors

The standard vector (§16.3) provides a very general solution:

vector< vector<int> > m;

This creates a vector of vectors of integers that initially contains no elements. We could initialize it to a three-by-five matrix like this:


It is not necessary for the vector<int>s in the vector<vector<int> > to have the same size.

Accessing an element is done by indexing twice. For example, m[i][j] is the jth element of the ith vector. We can print m like this:

C.7.2 Arrays

int ma[3][5];   // 3 arrays with 5 ints each

For arrays, the dimensions must be given as part of the definition. We can initialize ma like this:

The array ma is simply 15 ints that we access as if it were 3 arrays of 5 ints. In particular, there is no single object in memory that is the matrix ma; only the elements are stored. The dimensions 3 and 5 exist in the compiler source only. When we write code, it is our job to remember them somehow and supply the dimensions where needed. For example, we might print ma like this:

int bad[3,5];           // error: comma not allowed in constant expression
int good[3][5];         // 3 arrays with 5 ints each
int ouch = good[1,4];   // error: int initialized by int* (good[1,4] means good[4], which is an int*)
int nice = good[1][4];

C.7.3 Passing Multidimensional Arrays

Consider defining a function to manipulate a two-dimensional matrix. If the dimensions are known at compile time, there is no problem:


A matrix represented as a multidimensional array is passed as a pointer (rather than copied; §5.3). The first dimension of an array is irrelevant to the problem of finding the location of an element; it simply states how many elements (here 3) of the appropriate type (here int[5]) are present. For example, look at the previous representation of ma and note that by our knowing only that the second dimension is 5, we can locate ma[i][5] for any i. The first dimension can therefore be passed as an argument.


Note the use of &v[0][0] for the last call; v[0] would do because it is equivalent, but v would be a type error. This kind of subtle and messy code is best hidden. If you must deal directly with multidimensional arrays, consider encapsulating the code relying on it. In that way, you might ease the task of the next programmer to touch the code. Providing a multidimensional array type with a proper subscripting operator saves most users from having to worry about the layout of the data in the array (§22.4.6).

The standard vector (§16.3) doesn't suffer from these problems.
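The first two ways of passing a matrix discussed in this section, plus one way of hiding the dimensions, can be sketched as follows. The function names are hypothetical, and the template version is an assumption of this sketch rather than something taken from the text.

```cpp
#include <cstddef>

// Both dimensions fixed at compile time (the 3 is in fact ignored by the compiler).
int sum_m35(int m[3][5])
{
    int s = 0;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 5; j++) s += m[i][j];
    return s;
}

// Second dimension fixed; the first is passed as an argument.
int sum_mi5(int m[][5], int dim1)
{
    int s = 0;
    for (int i = 0; i < dim1; i++)
        for (int j = 0; j < 5; j++) s += m[i][j];
    return s;
}

// Encapsulation: both dimensions are deduced from the argument's type.
template<std::size_t R, std::size_t C>
int sum_any(int (&m)[R][C])
{
    int s = 0;
    for (std::size_t i = 0; i < R; i++)
        for (std::size_t j = 0; j < C; j++) s += m[i][j];
    return s;
}
```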

C.8 Saving Space

When programming nontrivial applications, there often comes a time when you want more memory space than is available or affordable. There are two ways of squeezing more space out of what is available:

[1] Put more than one small object into a byte.

[2] Use the same space to hold different objects at different times.

The former can be achieved by using fields, and the latter by using unions. These constructs are described in the following sections. Many uses of fields and unions are pure optimizations, and these optimizations are often based on nonportable assumptions about memory layouts. Consequently, the programmer should think twice before using them. Often, a better approach is to change the way data is managed, for example, to rely more on dynamically allocated store (§6.2.6) and less on preallocated (static) storage.

C.8.1 Fields

It seems extravagant to use a whole byte (a char or a bool) to represent a binary variable (for example, an on/off switch), but a char is the smallest object that can be independently allocated and addressed in C++ (§5.1). It is possible, however, to bundle several such tiny variables together as fields in a struct. A member is defined to be a field by specifying the number of bits it is to occupy. Unnamed fields are allowed. They do not affect the meaning of the named fields, but they can be used to make the layout better in some machine-dependent way:

struct PPN {   // R6000 Physical Page Number

A bool field really can be represented by a single bit. In an operating system kernel or in a debugger, the type PPN might be used like this:


Using fields to pack several variables into a single byte does not necessarily save space: the size of the code needed to manipulate these variables increases on most machines. Programs have been known to shrink significantly when binary variables were converted from bit fields to characters! Furthermore, it is typically much faster to access a char or an int than to access a field. Fields are simply a convenient shorthand for using bitwise logical operators (§6.2.4) to extract information from and insert information into part of a word.
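To make the "shorthand" concrete, here is a sketch that does the same job twice: once with fields and once with explicit bitwise operators. The struct Flags, its members, and the mask names are all hypothetical, and the bit positions chosen for the manual version need not match the layout a compiler picks for the fields.

```cpp
// With fields: the compiler generates the masking and shifting.
struct Flags {
    bool dirty : 1;
    bool valid : 1;
    unsigned int level : 3;   // holds 0..7
};

// By hand: the same information packed into an unsigned word.
const unsigned dirty_bit   = 1u << 0;
const unsigned valid_bit   = 1u << 1;
const unsigned level_shift = 2;
const unsigned level_mask  = 7u << level_shift;

// Inserts a 3-bit level into word, leaving the other bits alone.
unsigned set_level(unsigned word, unsigned level)
{
    return (word & ~level_mask) | ((level << level_shift) & level_mask);
}

// Extracts the 3-bit level from word.
unsigned get_level(unsigned word)
{
    return (word & level_mask) >> level_shift;
}
```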

C.8.2 Unions

This leaves all code using an Entry unchanged.

Using a union so that its value is always read using the member through which it was written is a pure optimization. However, it is not always easy to ensure that a union is used in this way only, and subtle errors can be introduced through misuse. To avoid errors, one can encapsulate a union so that the correspondence between a type field and access to the union members can be guaranteed.
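One possible shape for such an encapsulation is sketched below. The class Value and all of its members are hypothetical; the point is only that the type field is set together with the union member and checked on every read.

```cpp
#include <stdexcept>

class Value {
public:
    enum Kind { is_int, is_ptr };

    explicit Value(int i)         : kind_(is_int) { u_.i = i; }
    explicit Value(const char* p) : kind_(is_ptr) { u_.p = p; }

    Kind kind() const { return kind_; }

    // Each accessor checks the type field before touching the union.
    int as_int() const
    {
        if (kind_ != is_int) throw std::logic_error("Value: not an int");
        return u_.i;
    }
    const char* as_ptr() const
    {
        if (kind_ != is_ptr) throw std::logic_error("Value: not a pointer");
        return u_.p;
    }

private:
    Kind kind_;                               // records which member was written
    union { int i; const char* p; } u_;       // the shared storage
};
```

Because the union is private, users cannot read a member other than the one that was written without the error being detected.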


This is not really a conversion at all. On some machines, an int and an int* do not occupy the same amount of space, while on others, no integer can have an odd address. Such use of a union is dangerous and nonportable, and there is an explicit and portable way of specifying type conversion (§6.2.7).

Unions are occasionally used deliberately to avoid type conversion. One might, for example, use a Fudge to find the representation of the pointer 0:

C.8.3 Unions and Classes

Many nontrivial unions have some members that are much larger than the most frequently-used members. Because the size of a union is at least as large as its largest member, space is wasted. This waste can often be eliminated by using a set of derived classes instead of a union.

A class with a constructor, destructor, or copy operation cannot be the type of a union member (§10.4.12) because the compiler would not know which member to destroy.

C.9 Memory Management

There are three fundamental ways of using memory in C++:

Static memory, in which an object is allocated by the linker for the duration of the program. Global and namespace variables, static class members (§10.2.4), and static variables in functions (§7.1.2) are allocated in static memory. An object allocated in static memory is constructed once and persists to the end of the program. It always has the same address. Static objects can be a problem in programs using threads (shared-address space concurrency) because they are shared and require locking for proper access.

Automatic memory, in which function arguments and local variables are allocated. Each entry into a function or a block gets its own copy. This kind of memory is automatically created and destroyed; hence the name automatic memory. Automatic memory is also said "to be on the stack." If you absolutely must be explicit about this, C++ provides the redundant keyword auto.

Free store, from which memory for objects is explicitly requested by the program and where a program can free memory again once it is done with it (using new and delete). When a program needs more free store, new requests it from the operating system. Typically, the free store (also called dynamic memory or the heap) grows throughout the lifetime of a program because no memory is ever returned to the operating system for use by other programs.

As far as the programmer is concerned, automatic and static storage are used in simple, obvious, and implicit ways. The interesting question is how to manage the free store. Allocation (using new) is simple, but unless we have a consistent policy for giving memory back to the free store manager, memory will fill up, especially for long-running programs.

The simplest strategy is to use automatic objects to manage corresponding objects in free store. Consequently, many containers are implemented as handles to elements stored in the free store (§25.7). For example, an automatic String (§11.12) manages a sequence of characters on the free store and automatically frees that memory when it itself goes out of scope. All of the standard containers (§16.3, Chapter 17, Chapter 20, §22.4) can be conveniently implemented in this way.

C.9.1 Automatic Garbage Collection

When this regular approach isn't sufficient, the programmer might use a memory manager that finds unreferenced objects and reclaims their memory in which to store new objects. This is usually called automatic garbage collection, or simply garbage collection. Naturally, such a memory manager is called a garbage collector.

The fundamental idea of garbage collection is that an object that is no longer referred to in a program will not be accessed again, so its memory can be safely reused for some new object. For example:

The standard does not require that an implementation supply a garbage collector, but garbage collectors are increasingly used for C++ in areas where their costs compare favorably to those of manual management of free store. When comparing costs, consider the run time, memory usage, reliability, portability, monetary cost of programming, monetary cost of a garbage collector, and predictability of performance.

C.9.1.1 Disguised Pointers

// point #1: no pointer to the int exists here
p = reinterpret_cast<int*>(i1|i2);
// now the int is referenced again
}

Often, pointers stored as non-pointers in a program are called "disguised pointers." In particular, the pointer originally held in p is disguised in the integers i1 and i2. However, a garbage collector need not be concerned about disguised pointers. If the garbage collector runs at point #1, the memory holding the int can be reclaimed. In fact, such programs are not guaranteed to work even if a garbage collector is not used because the use of reinterpret_cast to convert between integers and pointers is at best implementation-defined.

A union that can hold both pointers and non-pointers presents a garbage collector with a special problem. In general, it is not possible to know whether such a union contains a pointer. Consider:

union U {   // union with both pointer and non-pointer members

The safe assumption is that any value that appears in such a union is a pointer value. A clever garbage collector can do somewhat better. For example, it may notice that (for a given implementation) ints are not allocated with odd addresses and that no objects are allocated with an address as low as 8. Noticing this will save the garbage collector from having to assume that objects containing locations 99999 and 8 are used by f().

The statement delete p; invokes the destructor for the object pointed to by p (if any). However, reuse of the memory can be postponed until it is collected. Recycling lots of objects at once can help limit fragmentation (§C.9.1.4). It also renders harmless the otherwise serious mistake of deleting an object twice in the important case where the destructor simply deletes memory.

As always, access to an object after it has been deleted is undefined.


C.9.1.3 Destructors

When an object is about to be recycled by a garbage collector, two alternatives exist:

[1] Call the destructor (if any) for the object.

[2] Treat the object as raw memory (don't call its destructor).

By default, a garbage collector should choose option (2) because objects created using new and never deleted are never destroyed. Thus, one can see a garbage collector as a mechanism for simulating an infinite memory.

It is possible to design a garbage collector to invoke the destructors for objects that have been specifically "registered" with the collector. However, there is no standard way of "registering" objects. Note that it is always important to destroy objects in an order that ensures that the destructor for one object doesn't refer to an object that has been previously destroyed. Such ordering isn't easily achieved by a garbage collector without help from the programmer.

C.9.1.4 Memory Fragmentation

When a lot of objects of varying sizes are allocated and freed, the memory fragments. That is, much of memory is consumed by pieces of memory that are too small to use effectively. The reason is that a general allocator cannot always find a piece of memory of the exact right size for an object. Using a slightly larger piece means that a smaller fragment of memory remains. After running a program for a while with a naive allocator, it is not uncommon to find half the available memory taken up with fragments too small ever to get reused.

Several techniques exist for coping with fragmentation. The simplest is to request only larger chunks of memory from the allocator and use each such chunk for objects of the same size (§15.3, §19.4.2). Because most allocations and deallocations are of small objects of types such as tree nodes, links, etc., this technique can be very effective. An allocator can sometimes apply similar techniques automatically. In either case, fragmentation is further reduced if all of the larger "chunks" are of the same size (say, the size of a page) so that they themselves can be allocated and reallocated without fragmentation.

There are two main styles of garbage collectors:

[1] A copying collector moves objects in memory to compact fragmented space.

[2] A conservative collector allocates objects to minimize fragmentation.

From a C++ point of view, conservative collectors are preferable because it is very hard (probably impossible in real programs) to move an object and modify all pointers to it correctly. A conservative collector also allows C++ code fragments to coexist with code written in languages such as C. Traditionally, copying collectors have been favored by people using languages (such as Lisp and Smalltalk) that deal with objects only indirectly through unique pointers or references. However, modern conservative collectors seem to be at least as efficient as copying collectors for larger programs, in which the amount of copying and the interaction between the allocator and a paging system become important. For smaller programs, the ideal of simply never invoking the collector is often achievable, especially in C++, where many objects are naturally automatic.

Date posted: 12/08/2014, 19:21
