The copy-constructor is essential for controlling how user-defined
types are passed and returned by value during function calls. It’s so
important, in fact, that the compiler will automatically synthesize a
copy-constructor if you don’t provide one yourself, as you will see.
Passing & returning by value
To understand the need for the copy-constructor, consider the way
C handles passing and returning variables by value during function
calls. If you declare a function and make a function call,
int f(int x, char c);
int g = f(a, b);
how does the compiler know how to pass and return those
variables? It just knows! The range of the types it must deal with is
so small (char, int, float, double, and their variations) that this
information is built into the compiler.
If you figure out how to generate assembly code with your
compiler and determine the statements generated by the function
call to f( ), you’ll get the equivalent of:
This code has been cleaned up significantly to make it generic; the
expressions for b and a will be different depending on whether the
variables are global (in which case they will be _b and _a) or local
(the compiler will index them off the stack pointer). This is also true
for the expression for g. The appearance of the call to f( ) will
depend on your name-decoration scheme, and “register a” depends
on how the CPU registers are named within your assembler. The
logic behind the code, however, will remain the same.
In C and C++, arguments are first pushed on the stack from right to
left, then the function call is made. The calling code is responsible
for cleaning the arguments off the stack (which accounts for the
add sp,4). But notice that to pass the arguments by value, the
compiler simply pushes copies on the stack; it knows how big they
are and that pushing those arguments makes accurate copies of them.
The return value of f( ) is placed in a register. Again, the compiler
knows everything there is to know about the return value type
because that type is built into the language, so the compiler can
return it by placing it in a register. With the primitive data types in
C, the simple act of copying the bits of the value is equivalent to
copying the object.
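The point about built-in types can be illustrated with a minimal sketch. The function names addOne, bitCopy, and demoPassByValue are hypothetical and do not come from the text; the sketch only shows that the callee works on a copy and that copying the raw bytes of an int reproduces the value.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical function: modifies only its own copy of x.
int addOne(int x) {
    x = x + 1;   // changes the local copy, not the caller's variable
    return x;    // the result is returned by value as well
}

// Copies an int by copying its raw bytes, to show that for a
// built-in type the bits are the whole object.
int bitCopy(int source) {
    int dest = 0;
    std::memcpy(&dest, &source, sizeof source);
    return dest;
}

void demoPassByValue() {
    int a = 41;
    int g = addOne(a);
    assert(a == 41);              // the caller's variable is untouched
    assert(g == 42);              // the callee's copy was incremented
    assert(bitCopy(a) == a);      // a byte copy is a faithful copy
}
```

Calling demoPassByValue( ) from main( ) exercises the sketch.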
Passing & returning large objects
But now consider user-defined types. If you create a class and you
want to pass an object of that class by value, how is the compiler
supposed to know what to do? This is not a type built into the
compiler; it’s a type you have created.
To investigate this, you can start with a simple structure that is
clearly too large to return in registers:
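A minimal sketch of such a structure follows. The names Big, bigfun, B, and B2 match the discussion in this section, but the member layout (a 100-byte buffer plus two integers) and the helper demoBig are assumptions made for illustration, not necessarily the listing in PassingBigStructures.cpp.

```cpp
#include <cassert>

// Assumed layout: large enough that it cannot travel in CPU registers.
struct Big {
    char buf[100];
    int i;
    long d;
};

Big B, B2;   // globals, so they start out zero-initialized

// Takes a Big by value and returns a Big by value.
Big bigfun(Big b) {
    b.i = 100;   // modifies the local copy only
    return b;
}

void demoBig() {
    B2 = bigfun(B);
    assert(B.i == 0);      // the argument was copied; the original is untouched
    assert(B2.i == 100);   // the result was copied back out
}
```

Compiling a file like this with your compiler’s assembly-output option is one way to inspect how the argument and the return value are handled.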
all functionality inline. In main( ), the call to bigfun( ) starts as you
might guess: the entire contents of B is pushed on the stack. (Here,
you might see some compilers load registers with the address of
the Big and its size, then call a helper function to push the Big onto
the stack.)
In the previous code fragment, pushing the arguments onto the
stack was all that was required before making the function call. In
PassingBigStructures.cpp, however, you’ll see an additional
action: the address of B2 is pushed before making the call, even
though it’s obviously not an argument. To comprehend what’s
going on here, you need to understand the constraints on the
compiler when it’s making a function call.
Function-call stack frame
When the compiler generates code for a function call, it first pushes
all the arguments on the stack, then makes the call. Inside the
function, code is generated to move the stack pointer down even
farther to provide storage for the function’s local variables.
(“Down” is relative here; your machine may increment or
decrement the stack pointer during a push.) But during the
assembly-language CALL, the CPU pushes the address in the
program code where the function call came from, so the
assembly-language RETURN can use that address to return to the calling
point. This address is of course sacred, because without it your
program will get completely lost. Here’s what the stack frame looks
like after the CALL and the allocation of local variable storage in
the function:
    Function arguments
    Return address
    Local variables
The code generated for the rest of the function expects the memory
to be laid out exactly this way, so that it can carefully pick from the
function arguments and local variables without touching the return
address. I shall call this block of memory, which is everything used
by a function in the process of the function call, the function frame.
You might think it reasonable to try to return values on the stack.
The compiler could simply push it, and the function could return
an offset to indicate how far down in the stack the return value begins.
Re-entrancy
The problem occurs because functions in C and C++ support
interrupts; that is, the languages are re-entrant. They also support
recursive function calls. This means that at any point in the
execution of a program an interrupt can occur without breaking the
program. Of course, the person who writes the interrupt service
routine (ISR) is responsible for saving and restoring all the registers
that are used in the ISR, but if the ISR needs to use any memory
further down on the stack, this must be a safe thing to do. (You can
think of an ISR as an ordinary function with no arguments and
void return value that saves and restores the CPU state. An ISR
function call is triggered by some hardware event instead of an
explicit call from within a program.)
Now imagine what would happen if an ordinary function tried to
return values on the stack. You can’t touch any part of the stack
that’s above the return address, so the function would have to push
the values below the return address. But when the assembly-language
RETURN is executed, the stack pointer must be pointing
to the return address (or right below it, depending on your
machine), so right before the RETURN, the function must move the
stack pointer up, thus clearing off all its local variables. If you’re
trying to return values on the stack below the return address, you
become vulnerable at that moment because an interrupt could
come along. The ISR would move the stack pointer down to hold
its return address and its local variables and overwrite your return
value.
To solve this problem, the caller could be responsible for allocating
the extra storage on the stack for the return values before calling
the function. However, C was not designed this way, and C++
must be compatible. As you’ll see shortly, the C++ compiler uses a
more efficient scheme.
Your next idea might be to return the value in some global data
area, but this doesn’t work either. Reentrancy means that any
function can be an interrupt routine for any other function,
including the same function you’re currently inside. Thus, if you put
the return value in a global area, you might return into the same
function, which would overwrite that return value. The same logic
applies to recursion.
The only safe place to return values is in the registers, so you’re
back to the problem of what to do when the registers aren’t large
enough to hold the return value. The answer is to push the address
of the return value’s destination on the stack as one of the function
arguments, and let the function copy the return information
directly into the destination. This not only solves all the problems,
it’s more efficient. It’s also the reason that, in
PassingBigStructures.cpp, the compiler pushes the address of B2
before the call to bigfun( ) in main( ). If you look at the assembly
output for bigfun( ), you can see it expects this hidden argument
and performs the copy to the destination inside the function.
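The hidden-argument scheme can be imitated by hand. The following is only a sketch under assumptions: the struct layout is invented, and bigfunExplicit is a hypothetical hand-written equivalent of what a compiler might arrange, not actual compiler output.

```cpp
#include <cassert>

// Assumed layout, chosen only to be larger than a register.
struct Big {
    char buf[100];
    int i;
    long d;
};

// What the programmer writes: return by value.
Big bigfun(Big b) {
    b.i = 100;
    return b;
}

// Roughly what happens behind the scenes (hypothetical equivalent):
// the caller supplies the address of the destination as an extra argument.
void bigfunExplicit(Big* dest, Big b) {
    b.i = 100;
    *dest = b;   // copy the result straight into the caller's storage
}

void demoHiddenDestination() {
    Big B = {};              // all members zero
    Big viaReturn = bigfun(B);
    Big viaPointer = {};
    bigfunExplicit(&viaPointer, B);
    assert(viaReturn.i == 100);
    assert(viaPointer.i == viaReturn.i);   // both routes deliver the same result
    assert(B.i == 0);                      // the argument itself was only copied
}
```

Because the destination belongs to the caller, nothing the function leaves behind on the stack has to survive the return, which is what makes the scheme safe in the presence of interrupts and recursion.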
Bitcopy versus initialization
So far, so good. There’s a workable process for passing and
returning large simple structures. But notice that all you have is a
way to copy the bits from one place to another, which certainly
works fine for the primitive way that C looks at variables. But in
C++ objects can be much more sophisticated than a patch of bits;
they have meaning. This meaning may not respond well to having
its bits copied.
Consider a simple example: a class that knows how many objects of
its type exist at any one time. From Chapter 10, you know the way
to do this is by including a static data member:
along with an optional message argument. The constructor
increments the count each time an object is created, and the
destructor decrements it.
The output, however, is not what you would expect:
after construction of h: objectCount = 1
x argument inside f(): objectCount = 1
~HowMany(): objectCount = 0
after call to f(): objectCount = 0
~HowMany(): objectCount = -1
~HowMany(): objectCount = -2
After h is created, the object count is one, which is fine. But after
the call to f( ) you would expect to have an object count of two,
because h2 is now in scope as well. Instead, the count is zero, which
indicates something has gone horribly wrong. This is confirmed by
the fact that the two destructors at the end make the object count go
negative, something that should never happen.
Look at the point inside f( ), which occurs after the argument is
passed by value. This means the original object h exists outside the
function frame, and there’s an additional object inside the function
frame, which is the copy that has been passed by value. However,
the argument has been passed using C’s primitive notion of
bitcopying, whereas the C++ HowMany class requires true
initialization to maintain its integrity, so the default bitcopy fails to
produce the desired effect.
When the local object goes out of scope at the end of the call to f( ),
the destructor is called, which decrements objectCount, so outside
the function, objectCount is zero. The creation of h2 is also
performed using a bitcopy, so the constructor isn’t called there
either, and when h and h2 go out of scope, their destructors cause
the negative values of objectCount.
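The miscount described above can be reproduced with a short sketch. It reuses the names HowMany, objectCount, and f from the text, but the class body, the count( ) accessor, and the helper functions are assumptions written for illustration rather than the original HowMany.cpp.

```cpp
#include <cassert>

class HowMany {
    static int objectCount;
public:
    HowMany() { ++objectCount; }
    ~HowMany() { --objectCount; }
    static int count() { return objectCount; }
    // No copy-constructor is declared, so copies are made by the
    // compiler-generated one, which never touches objectCount.
};

int HowMany::objectCount = 0;

// Pass and return by value.
HowMany f(HowMany x) { return x; }

// Runs the scenario and reports the count that is left over.
int countAfterScenario() {
    {
        HowMany h;           // the only counted construction
        HowMany h2 = f(h);   // copies are created but not counted
        (void)h2;
    }                        // every copy is still destroyed and decrements
    return HowMany::count();
}

void demoMiscount() {
    assert(countAfterScenario() < 0);   // more destructions than constructions
}
```

The exact negative value depends on how many copies your compiler elides, which is why the sketch only checks that the count has gone below zero.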
object outside the function frame. This is also often true when
returning an object from a function. In the expression
HowMany h2 = f(h);
h2, a previously unconstructed object, is created from the return
value of f( ), so again a new object is created from an existing one.
The compiler’s assumption is that you want to perform this
creation using a bitcopy, and in many cases this may work fine, but
in HowMany it doesn’t fly because the meaning of initialization
goes beyond simply copying. Another common example occurs if
the class contains pointers: what do they point to, and should you
copy them or should they be connected to some new piece of
memory?
Fortunately, you can intervene in this process and prevent the
compiler from doing a bitcopy. You do this by defining your own
function to be used whenever the compiler needs to make a new
object from an existing object. Logically enough, you’re making a
new object, so this function is a constructor, and also logically
enough, the single argument to this constructor has to do with the
object you’re constructing from. But that object can’t be passed into
the constructor by value because you’re trying to define the function
that handles passing by value, and syntactically it doesn’t make
sense to pass a pointer because, after all, you’re creating the new
object from an existing object. Here, references come to the rescue,
so you take the reference of the source object. This function is called
the copy-constructor and is often referred to as X(X&), which is its
appearance for a class called X.
If you create a copy-constructor, the compiler will not perform a
bitcopy when creating a new object from an existing one. It will
always call your constructor. So, if you don’t create a
copy-constructor, the compiler will do something sensible, but you have
the choice of taking over complete control of the process.
Now it’s possible to fix the problem in HowMany.cpp:
string name; // Object identifier
static int objectCount;
// Pass and return BY VALUE:
HowMany2 f(HowMany2 x) {
x.print("x argument inside f()");
out << "Returning from f()" << endl;
h2.print("h2 after call to f()");
out << "Call f(), no return value" << endl;
f(h);
out << "After call to f()" << endl;
} ///:~
There are a number of new twists thrown in here so you can get a
better idea of what’s happening. First, the string name acts as an
object identifier when information about that object is printed. In
the constructor, you can put an identifier string (usually the name
of the object) that is copied to name using the string constructor.
The default = "" creates an empty string. The constructor
increments the objectCount as before, and the destructor
decrements it.
Next is the copy-constructor, HowMany2(const HowMany2&).
The copy-constructor can create a new object only from an existing
one, so the existing object’s name is copied to name, followed by
the word “copy” so you can see where it came from. If you look
closely, you’ll see that the call name(h.name) in the constructor
initializer list is actually calling the string copy-constructor.
Inside the copy-constructor, the object count is incremented just as
it is inside the normal constructor. This means you’ll now get an
accurate object count when passing and returning by value.
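The effect of counting inside the copy-constructor can be shown in a compact sketch. The class name Counted and its accessors are hypothetical stand-ins, not the HowMany2 listing itself; only the idea (increment in both constructors, append “copy” to the name) is taken from the text.

```cpp
#include <cassert>
#include <string>

class Counted {
    std::string name;          // object identifier
    static int objectCount;
public:
    Counted(const std::string& id = "") : name(id) { ++objectCount; }
    // Copy-constructor: a new object really is created, so count it too.
    Counted(const Counted& other) : name(other.name + " copy") {
        ++objectCount;
    }
    ~Counted() { --objectCount; }
    static int count() { return objectCount; }
    const std::string& id() const { return name; }
};

int Counted::objectCount = 0;

// Pass and return by value.
Counted f(Counted x) { return x; }

void demoCounted() {
    {
        Counted h("h");
        assert(Counted::count() == 1);
        Counted h2 = f(h);
        assert(Counted::count() == 2);        // only h and h2 remain alive
        assert(h2.id().find("h copy") == 0);  // h2 descends from a copy of h
    }
    assert(Counted::count() == 0);            // constructions and destructions balance
}
```

With the copy-constructor in place every destruction is matched by a counted construction, so the count returns to zero instead of going negative.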
The print( ) function has been modified to print out a message, the
object identifier, and the object count. It must now access the name
data of a particular object, so it can no longer be a static member
function.
Inside main( ), you can see that a second call to f( ) has been added.
However, this call uses the common C approach of ignoring the
return value. But now that you know how the value is returned
(that is, code inside the function handles the return process, putting
the result in a destination whose address is passed as a hidden
argument), you might wonder what happens when the return
value is ignored. The output of the program will throw some
illumination on this.
Before showing the output, here’s a little program that uses
iostreams to add line numbers to any file:
int main(int argc, char* argv[]) {
requireArgs(argc, 1, "Usage: linenum file\n"
"Adds line numbers to file");
// Number of lines in file determines width:
const int width = int(log10(lines.size())) + 1;
for(int i = 0; i < lines.size(); i++) {
cout.setf(ios::right, ios::adjustfield);
cout.width(width);
    cout << ++num << ") " << lines[i] << endl;
}
} ///:~
The entire file is read into a vector<string>, using the same code
that you’ve seen earlier in the book. When printing the line
numbers, we’d like all the lines to be aligned with each other, and
this requires adjusting for the number of lines in the file so that the
width allowed for the line numbers is consistent. We can easily
determine the number of lines using vector::size( ), but what we
really need to know is whether there are more than 10 lines, 100
lines, 1,000 lines, etc. If you take the logarithm, base 10, of the
number of lines in the file, truncate it to an int and add one to the
value, you’ll find out the maximum width that your line count will
be.
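The width calculation can also be done without floating point. The sketch below swaps the log10 approach for plain integer division, which gives the same answer as int(log10(n)) + 1 for any n of at least 1 and additionally copes with an empty file. The function name lineNumberWidth is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Number of decimal digits needed to print lineCount.
// Returns 1 for 0, so an empty file does not need special handling.
int lineNumberWidth(std::size_t lineCount) {
    int width = 1;
    while (lineCount >= 10) {
        lineCount /= 10;   // drop the last decimal digit
        ++width;
    }
    return width;
}

void demoWidth() {
    assert(lineNumberWidth(0) == 1);
    assert(lineNumberWidth(9) == 1);
    assert(lineNumberWidth(10) == 2);
    assert(lineNumberWidth(999) == 3);
    assert(lineNumberWidth(1000) == 4);
}
```

The integer version avoids two hazards of the logarithm: log10(0) is not a finite number, and a floating-point result that lands just under a whole number would truncate one digit short.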
You’ll notice a couple of strange calls inside the for loop: setf( ) and
width( ). These are ostream calls that allow you to control, in this
case, the justification and width of the output. However, the width
applies only to the next formatted output and then resets, so it must
be set each time a line is output, and that is why the calls are inside
the for loop. Volume 2 of this book has an entire chapter explaining
iostreams that will tell you more about these calls as well as other
ways to control iostreams.
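The need to set the width before every output can be checked with a string stream. This is a small sketch with hypothetical function names; it writes to a std::ostringstream instead of cout so the result can be inspected.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Sets the width once, then inserts two numbers.
std::string widthSetOnce() {
    std::ostringstream out;
    out.setf(std::ios::right, std::ios::adjustfield);
    out.width(4);
    out << 1 << 2;   // only the first insertion is padded
    return out.str();
}

// Sets the width before each insertion.
std::string widthSetEachTime() {
    std::ostringstream out;
    out.setf(std::ios::right, std::ios::adjustfield);
    out.width(4);
    out << 1;
    out.width(4);    // the previous width was consumed by the first insertion
    out << 2;
    return out.str();
}

void demoWidthReset() {
    assert(widthSetOnce() == "   12");
    assert(widthSetEachTime() == "   1   2");
}
```

The justification flag set with setf( ) persists across insertions; it is the width that goes back to zero after each formatted output.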
When Linenum.cpp is applied to HowMany2.out, the result is
15) Call f(), no return value
As you would expect, the first thing that happens is that the normal
constructor is called for h, which increments the object count to
one. But then, as f( ) is entered, the copy-constructor is quietly
called by the compiler to perform the pass-by-value. A new object
is created, which is the copy of h (thus the name “h copy”) inside
the function frame of f( ), so the object count becomes two, courtesy
of the copy-constructor.
Line eight indicates the beginning of the return from f( ). But before
the local variable “h copy” can be destroyed (it goes out of scope at
the end of the function), it must be copied into the return value,
which happens to be h2. A previously unconstructed object (h2) is
created from an existing object (the local variable inside f( )), so of
course the copy-constructor is used again in line nine. Now the
name becomes “h copy copy” for h2’s identifier because it’s being
copied from the copy that is the local object inside f( ). After the
object is returned, but before the function ends, the object count
becomes temporarily three, but then the local object “h copy” is
destroyed. After the call to f( ) completes in line 13, there are only
two objects, h and h2, and you can see that h2 did indeed end up as
“h copy copy.”
Temporary objects
Line 15 begins the call to f(h), this time ignoring the return value.
You can see in line 16 that the copy-constructor is called just as
before to pass the argument in. And also, as before, line 21 shows
the copy-constructor is called for the return value. But the
copy-constructor must have an address to work on as its destination (a
this pointer). Where does this address come from?
It turns out the compiler can create a temporary object whenever it
needs one to properly evaluate an expression. In this case it creates
one you don’t even see to act as the destination for the ignored
return value of f( ). The lifetime of this temporary object is as short
as possible so the landscape doesn’t get cluttered up with
temporaries waiting to be destroyed and taking up valuable
resources. In some cases, the temporary might immediately be
passed to another function, but in this case it isn’t needed after the
function call, so as soon as the function call ends by calling the
destructor for the local object (lines 23 and 24), the temporary object
is destroyed (lines 25 and 26).
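The short life of the temporary can be observed by counting live objects. This is a sketch with a hypothetical class Tracked; it does not reproduce the numbered program output discussed above, only the fact that the ignored return value is gone by the next statement.

```cpp
#include <cassert>

class Tracked {
    static int live;   // number of objects currently alive
public:
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
    static int liveCount() { return live; }
};

int Tracked::live = 0;

// Pass and return by value.
Tracked f(Tracked x) { return x; }

void demoTemporary() {
    Tracked h;
    assert(Tracked::liveCount() == 1);
    f(h);   // return value ignored: it lands in an unnamed temporary
    // By the next statement the temporary and the argument copy are gone.
    assert(Tracked::liveCount() == 1);
}
```

After demoTemporary( ) returns, h is destroyed as well and the live count is back to zero.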
Finally, in lines 28-31, the h2 object is destroyed, followed by h, and
the object count goes correctly back to zero.
Default copy-constructor
Because the copy-constructor implements pass and return by value,
it’s important that the compiler creates one for you in the case of
simple structures, which is effectively the same thing it does in C.
However, all you’ve seen so far is the default primitive behavior: a
bitcopy.
When more complex types are involved, the C++ compiler will still
automatically create a copy-constructor if you don’t make one.
Again, however, a bitcopy doesn’t make sense, because it doesn’t
necessarily implement the proper meaning.
Here’s an example to show the more intelligent approach the
compiler takes. Suppose you create a new class composed of objects
of several existing classes. This is called, appropriately enough,
composition, and it’s one of the ways you can make new classes from
existing classes. Now take the role of a naive user who’s trying to
solve a problem quickly by creating a new class this way. You don’t
know about copy-constructors, so you don’t create one. The
example demonstrates what the compiler does while creating the
default copy-constructor for your new class:
WoCC(const string& ident = "") : id(ident) {}
void print(const string& msg = "") const {
The class WithCC contains a copy-constructor, which simply
announces that it has been called, and this brings up an interesting
issue. In the class Composite, an object of WithCC is created using
a default constructor. If there were no constructors at all in
WithCC, the compiler would automatically create a default
constructor, which would do nothing in this case. However, if you
add a copy-constructor, you’ve told the compiler you’re going to
handle constructor creation, so it no longer creates a default
constructor for you and will complain unless you explicitly create a
default constructor as was done for WithCC.
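The rule that declaring a copy-constructor suppresses the automatic default constructor can be checked at compile time. The sketch below assumes a C++11 or later compiler, because it relies on the <type_traits> header, which is newer than the text; the three class names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Declares only a copy-constructor: the compiler no longer supplies
// a default constructor.
class OnlyCopy {
public:
    OnlyCopy(const OnlyCopy&) {}
};

// Declares both, as the text describes for WithCC.
class CopyAndDefault {
public:
    CopyAndDefault() {}
    CopyAndDefault(const CopyAndDefault&) {}
};

// Declares no constructors at all: the compiler supplies a default one.
class NoConstructors {};

void demoDefaultConstructorRule() {
    assert(!std::is_default_constructible<OnlyCopy>::value);
    assert(std::is_default_constructible<CopyAndDefault>::value);
    assert(std::is_default_constructible<NoConstructors>::value);
}
```

Writing OnlyCopy x; in a program is the direct way to see the compiler complain; the trait simply makes the same fact testable without a compile error.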
The class WoCC has no copy-constructor, but its constructor will
store a message in an internal string that can be printed out using
print( ). This constructor is explicitly called in Composite’s
constructor initializer list (briefly introduced in Chapter 8 and
covered fully in Chapter 14). The reason for this becomes apparent
later.
The class Composite has member objects of both WithCC and
WoCC (note the embedded object wocc is initialized in the
constructor-initializer list, as it must be), and no explicitly defined
copy-constructor. However, in main( ) an object is created using the
copy-constructor in the definition:
Composite c2 = c;
The copy-constructor for Composite is created automatically by the
compiler, and the output of the program reveals the way that it is
created.
To create a copy-constructor for a class that uses composition (and
inheritance, which is introduced in Chapter 14), the compiler
recursively calls the copy-constructors for all the member objects
and base classes. That is, if the member object also contains another
object, its copy-constructor is also called. So in this case, the
compiler calls the copy-constructor for WithCC. The output shows
this constructor being called. Because WoCC has no
copy-constructor, the compiler creates one for it that copies each of
its members in turn (a plain bitcopy for members of built-in type),
and calls that inside the Composite copy-constructor. The
call to Composite::print( ) in main shows that this happens because
the contents of c2.wocc are identical to the contents of c.wocc. The
process the compiler goes through to synthesize a copy-constructor
is called memberwise initialization.
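Memberwise initialization can be observed by counting how often a member’s copy-constructor runs. The following is a sketch with hypothetical class names (WithCopy, WithoutCopy, Whole) that play the roles of WithCC, WoCC, and Composite; it is not the original listing.

```cpp
#include <cassert>
#include <string>

// Member type with a user-written copy-constructor that records calls.
class WithCopy {
public:
    static int copies;
    WithCopy() {}
    WithCopy(const WithCopy&) { ++copies; }
};

int WithCopy::copies = 0;

// Member type without one; the compiler generates its copy-constructor.
class WithoutCopy {
    std::string id;
public:
    WithoutCopy(const std::string& ident = "") : id(ident) {}
    const std::string& ident() const { return id; }
};

// No copy-constructor is declared here: the compiler synthesizes one
// that copy-constructs each member in turn.
class Whole {
public:
    WithCopy withCopy;
    WithoutCopy withoutCopy;
    Whole() : withoutCopy("Whole()") {}
};

void demoMemberwise() {
    Whole c;
    int before = WithCopy::copies;
    Whole c2 = c;   // calls the synthesized copy-constructor
    assert(WithCopy::copies == before + 1);   // the member's own copy-constructor ran
    assert(c2.withoutCopy.ident() == c.withoutCopy.ident());
}
```

The string inside WithoutCopy is copied through the string copy-constructor, so the synthesized copy is correct even though nobody wrote one by hand.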
It’s always best to create your own copy-constructor instead of
letting the compiler do it for you. This guarantees that it will be
under your control.
Alternatives to copy-construction
At this point your head may be swimming, and you might be
wondering how you could have possibly written a working class
without knowing about the copy-constructor. But remember: You
need a copy-constructor only if you’re going to pass an object of
your class by value. If that never happens, you don’t need a
copy-constructor.
Preventing pass-by-value
“But,” you say, “if I don’t make a copy-constructor, the compiler
will create one for me. So how do I know that an object will never
be passed by value?”
There’s a simple technique for preventing pass-by-value: declare a
private copy-constructor. You don’t even need to create a
definition, unless one of your member functions or a friend
function needs to perform a pass-by-value. If the user tries to pass
or return the object by value, the compiler will produce an error
message because the copy-constructor is private. It can no longer
create a default copy-constructor because you’ve explicitly stated
that you’re taking over that job.
//! f(n); // Error: copy-constructor called
//! NoCC n2 = n; // Error: c-c called
//! NoCC n3(n); // Error: c-c called
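The private copy-constructor technique can be sketched as follows. The class name NoCopy and the function read are hypothetical, and the check uses std::is_copy_constructible from <type_traits>, which assumes a C++11 or later compiler and is not part of the text.

```cpp
#include <cassert>
#include <type_traits>

class NoCopy {
    int i;
    NoCopy(const NoCopy&);   // private, and deliberately never defined
public:
    NoCopy(int ii = 0) : i(ii) {}
    int value() const { return i; }
};

// Passing by reference still works, because no copy is made.
int read(const NoCopy& n) { return n.value(); }

void demoNoCopy() {
    NoCopy n(7);
    assert(read(n) == 7);
    // The copy-constructor exists but is inaccessible from outside the class.
    assert(!std::is_copy_constructible<NoCopy>::value);
}
```

Uncommenting a line such as NoCopy n2 = n; inside demoNoCopy( ) would turn the runtime check into the compile-time error the text describes.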
Functions that modify outside objects
Reference syntax is nicer to use than pointer syntax, yet it clouds
the meaning for the reader. For example, in the iostreams library
one overloaded version of the get( ) function takes a char& as an
argument, and the whole point of the function is to modify its
argument by inserting the result of the get( ). However, when you
read code using this function it’s not immediately obvious to you
that the outside object is being modified:
char c;
cin.get(c);
Instead, the function call looks like a pass-by-value, which suggests
the outside object is not modified.
Because of this, it’s probably safer from a code maintenance
standpoint to use pointers when you’re passing the address of an
argument to modify. If you always pass addresses as const
references except when you intend to modify the outside object via
the address, where you pass by non-const pointer, then your code
is far easier for the reader to follow.
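The convention reads like this at the call site. The function names twice and increment are hypothetical and exist only to contrast the two styles in a minimal sketch.

```cpp
#include <cassert>

// Read-only access: pass by const reference.
int twice(const int& value) { return value * 2; }

// Modifies the caller's object: take a pointer, so the call site shows '&'.
void increment(int* value) { ++*value; }

void demoCallSites() {
    int n = 5;
    int doubled = twice(n);   // looks like pass-by-value, and n is not modified
    increment(&n);            // the '&' signals that n may change
    assert(doubled == 10);
    assert(n == 6);
}
```

A reader scanning demoCallSites( ) can tell which call may change n without looking up either declaration.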
Pointers to members
A pointer is a variable that holds the address of some location. You
can change what a pointer selects at runtime, and the destination of
the pointer can be either data or a function. The C++
pointer-to-member follows this same concept, except that what it
selects is a location inside a class. The dilemma here is that a
pointer needs an address, but there is no “address” inside a class;
selecting a member of a class means offsetting into that class. You
can’t produce an actual address until you combine that offset with
the starting address of a particular object. The syntax of pointers to
members requires that you select an object at the same time you’re
dereferencing the pointer to member.
To understand this syntax, consider a simple structure, with a
pointer sp and an object so for this structure. You can select
members with the syntax shown:
Finally, consider what happens if you have a pointer that happens
to point to something inside a class object, even if it does in fact
represent an offset into the object. To access what it’s pointing at,
you must dereference it with *. But it’s an offset into an object, so
you must also refer to that particular object. Thus, the * is combined
with the object dereference. So the new syntax becomes ->* for a
pointer to an object, and .* for the object or a reference, like this:
int ObjectClass::*pointerToMember = &ObjectClass::a;
There is actually no “address” of ObjectClass::a because you’re just
referring to the class and not an object of that class. Thus,
&ObjectClass::a can be used only as pointer-to-member syntax.
Here’s an example that shows how to create and use pointers to
data members:
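A minimal sketch of such an example is given here, assuming a hypothetical struct Data with three int members; it is not necessarily the listing the text refers to. It shows the definition syntax together with both dereference forms.

```cpp
#include <cassert>

struct Data {
    int a, b, c;
};

void demoDataMemberPointers() {
    Data d = {1, 2, 3};
    Data* dp = &d;

    int Data::*pm = &Data::a;   // selects member a of any Data object
    assert(d.*pm == 1);         // an object (or reference) uses .*
    assert(dp->*pm == 1);       // a pointer to an object uses ->*

    pm = &Data::b;              // reselect a different member at runtime
    dp->*pm = 48;
    assert(d.b == 48);

    pm = &Data::c;
    d.*pm = 49;
    assert(d.c == 49);
}
```

The pointer pm never holds a usable address by itself; it only becomes a real location when combined with d or dp.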
Obviously, these are too awkward to use anywhere except for
special cases (which is exactly what they were intended for).
Also, pointers to members are quite limited: they can be assigned
only to a specific location inside a class. You could not, for example,
increment them or compare them with relational operators as you
can with ordinary pointers.
Functions
A similar exercise produces the pointer-to-member syntax for
member functions. A pointer to a function (introduced at the end of
Chapter 3) is defined like this:
int (*fp)(float);
The parentheses around (*fp) are necessary to force the compiler to
evaluate the definition properly. Without them this would appear
to be a function that returns an int*.
Parentheses also play an important role when defining and using
pointers to member functions. If you have a function inside a class,
you define a pointer to that member function by inserting the class
name and scope resolution operator into an ordinary function
pointer definition:
int (Simple2::*fp)(float) const;
int (Simple2::*fp2)(float) const = &Simple2::f;
int main() {
fp = &Simple2::f;
} ///:~
In the definition for fp2 you can see that a pointer to member
function can also be initialized when it is created, or at any other
time. Unlike non-member functions, the & is not optional when
taking the address of a member function. However, you can give
the function identifier without an argument list, because overload
resolution can be determined by the type of the pointer to member.
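Defining, assigning, and calling through a pointer to member function can be sketched as below. The class Simple and its return values are hypothetical; only the declaration shape int (Class::*fp)(float) const follows the text.

```cpp
#include <cassert>

class Simple {
public:
    int f(float) const { return 1; }
    int g(float) const { return 2; }
};

// Pointer to any const member function of Simple that takes a float
// and returns an int, initialized at the point of definition.
int (Simple::*fp)(float) const = &Simple::f;   // the & is required

void demoMemberFunctionPointers() {
    Simple s;
    const Simple* sp = &s;
    assert((s.*fp)(1.0f) == 1);     // parentheses are required around s.*fp
    fp = &Simple::g;                // reassign at any later time
    assert((sp->*fp)(1.0f) == 2);   // ->* when starting from a pointer
}
```

Without the parentheses, the function-call operator would bind to fp first and the expression would not compile.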
behavior at runtime. A pointer-to-member is no different; it allows
you to choose a member at runtime. Typically, your classes will
only have member functions publicly visible (data members are
usually considered part of the underlying implementation), so the
following example selects member functions at runtime.
void f(int) const { cout << "Widget::f()\n"; }
void g(int) const { cout << "Widget::g()\n"; }
void h(int) const { cout << "Widget::h()\n"; }
void i(int) const { cout << "Widget::i()\n"; }
Of course, it isn’t particularly reasonable to expect the casual user
to create such complicated expressions. If the user must directly
manipulate a pointer-to-member, then a typedef is in order. To
really clean things up, you can use the pointer-to-member as part of
the internal implementation mechanism. Here’s the preceding
example using a pointer-to-member inside the class. All the user
needs to do is pass a number in to select a function.
class Widget {
void f(int) const { cout << "Widget::f()\n"; }
void g(int) const { cout << "Widget::g()\n"; }
void h(int) const { cout << "Widget::h()\n"; }
void i(int) const { cout << "Widget::i()\n"; }
In the class interface and in main( ), you can see that the entire
implementation, including the functions, has been hidden away.
The code must even ask for the count( ) of functions. This way, the
class implementer can change the quantity of functions in the
underlying implementation without affecting the code where the
class is used.
The initialization of the pointers-to-members in the constructor
may seem overspecified. Shouldn’t you be able to say
fptr[1] = &g;
because the name g occurs in the member function, which is
automatically in the scope of the class? The problem is this doesn’t
conform to the pointer-to-member syntax, which is required so
everyone, especially the compiler, can figure out what’s going on.
Similarly, when the pointer-to-member is dereferenced, it seems
like
(this->*fptr[i])(j);
is also over-specified; this looks redundant. Again, the syntax
requires that a pointer-to-member always be bound to an object
when it is dereferenced.
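The two syntax requirements just discussed (full qualification when taking the address, and binding to this when calling) appear together in the following sketch. The class Selector and its three private functions are hypothetical; the structure mirrors the idea of a table of pointers-to-members hidden inside a class, with select( ) and count( ) as the only public interface.

```cpp
#include <cassert>

class Selector {
    enum { cnt = 3 };
    int (Selector::*fptr[cnt])(int) const;   // table of pointers to members
    int doubleIt(int x) const { return x * 2; }
    int negate(int x) const { return -x; }
    int square(int x) const { return x * x; }
public:
    Selector() {
        fptr[0] = &Selector::doubleIt;   // full qualification is required
        fptr[1] = &Selector::negate;
        fptr[2] = &Selector::square;
    }
    // Calls the i-th function with argument j; returns 0 if i is out of range.
    int select(int i, int j) const {
        if (i < 0 || i >= cnt) return 0;
        return (this->*fptr[i])(j);      // must be bound to an object
    }
    int count() const { return cnt; }
};

void demoSelector() {
    Selector s;
    assert(s.count() == 3);
    assert(s.select(0, 4) == 8);
    assert(s.select(1, 4) == -4);
    assert(s.select(2, 4) == 16);
    assert(s.select(7, 4) == 0);   // out-of-range index is ignored
}
```

Client code sees only numbers, so functions can be added to or removed from the table without changing any caller.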
Summary
Pointers in C++ are almost identical to pointers in C, which is good.
Otherwise, a lot of C code wouldn’t compile properly under C++.
The only compile-time errors you will produce occur with
dangerous assignments. If these are in fact what are intended, the
compile-time errors can be removed with a simple (and explicit!)
cast.
C++ also adds the reference from Algol and Pascal, which is like a
constant pointer that is automatically dereferenced by the compiler.
A reference holds an address, but you treat it like an object.
References are essential for clean syntax with operator overloading
(the subject of the next chapter), but they also add syntactic
convenience for passing and returning objects for ordinary
functions.
The copy-constructor takes a reference to an existing object of the
same type as its argument, and it is used to create a new object
from an existing one. The compiler automatically calls the
copy-constructor when you pass or return an object by value. Although
the compiler will automatically create a copy-constructor for you, if
you think one will be needed for your class, you should always
define it yourself to ensure that the proper behavior occurs. If you
don’t want the object passed or returned by value, you should
create a private copy-constructor.