Pointers, References and Values by Michael D. Crawford Continued...

How to Pass Parameters to Functions

Prefer passing by const reference. Use a non-const reference
when the parameter is used to send a result back to the caller.

Prefer Passing Const Reference Parameters to Functions

Most of the time, but not always, you should pass const reference parameters to functions, like this:

class OneKind;
class AnotherKind;

class Example{
   public:
      Example( const OneKind &inParam );    // converting constructor
      Example( const Example &inOriginal ); // copy constructor

      // assignment operator
      const Example &operator=( const Example& inRhs );

      void MemberFunc( const AnotherKind &inParam );
};

The reasons to prefer passing references are performance, preserving polymorphism, and indicating that the object is guaranteed to exist. If the object might not exist, pass a pointer instead, with the understanding that it may be null.
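As a minimal sketch of that distinction, a reference parameter says the object must exist, while a pointer parameter admits that it might not. The names Widget, DescribeRequired and DescribeOptional are invented for this illustration, and nullptr assumes a C++11 or later compiler:

```cpp
#include <string>

// Hypothetical class used only for this illustration.
class Widget{
   public:
      explicit Widget( const std::string &inName ): mName( inName ) {}
      const std::string &Name() const { return mName; }

   private:
      std::string mName;
};

// The caller must supply a real object; there is nothing to check.
std::string DescribeRequired( const Widget &inWidget )
{
   return "widget " + inWidget.Name();
}

// The caller may have no object to pass, so test before dereferencing.
std::string DescribeOptional( const Widget *inWidget )
{
   if ( inWidget == nullptr )
      return "no widget";

   return "widget " + inWidget->Name();
}
```

The signature documents the contract: a reader of DescribeOptional knows to expect a null check, while a reader of DescribeRequired knows none is needed.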

Reference Parameters and Performance

Performance can be an issue because passing a parameter by value will call the parameter class' copy constructor to construct the parameter value on the runtime stack and its destructor when the called function exits. This affects performance in several ways:

First, the object actually takes up space on the stack. If the object is large or stack space is at a premium, this will be a problem. Even if your program has lots of stack space available, using it may hurt performance more than you think, because on modern processors you will cause cache evictions (that is, valuable cached data will be pushed out to make room for the new data, which will itself be evicted later). In extreme cases of stack usage, you will encourage virtual memory paging, which is very slow.

Stack usage is especially to be avoided in two cases: multithreaded code and recursive code. Multithreaded code is an issue because thread stacks are implemented differently from the stack of a single-threaded process.

In a typical single threaded process on a modern operating system, the stack is at the top of the allocated user memory (perhaps at the top of user virtual memory), and grows downward. If the stack space is exceeded and a page fault occurs, the virtual memory system will allocate more virtual memory pages for the stack and allow the process to continue. The stack can grow pretty much as much as it wants, unless some error condition like infinite recursion causes it to grow without bound and collide with the heap.

Even on more primitive operating systems like the classic MacOS, the user stack starts at the top of user program data memory and grows down and has the entire distance from the stack base to the heap top to grow.

But in a multithreaded system, each thread needs to have its own stack. The thread scheduler switches the stack as each thread takes control, so the stack needs to stay in a fixed location for the duration of a thread's lifetime. What is usually done, by default with some APIs and always with others, is that a fixed amount of virtual memory is allocated for each thread stack.

For example, on the BeOS, 256 KB of virtual memory is allocated per thread stack in a virtual memory region called an area. The thread stack is not resizable, and one doesn't have the option of asking for a different sized stack when creating a thread. Creating too many threads will run the process out of virtual memory, at which point it cannot create any more threads and the machine is brought to its knees.

(Because the thread stack maximum size is fixed once the thread is created, it is helpful to specify a smaller stack size than the default if your thread API gives you that option. That way you can remove one of the impediments to creating lots of threads. An unnecessarily large stack allocation may not be too big a problem because you won't cause paging if you don't actually make use of it, but you may still use up system resources and prevent allocation of virtual memory pages to other threads or processes that might take advantage of the pages you abstain from allocating.)

Stack size is also a big issue during recursion. Along the main call chain of the recursive routines, you should not pass large parameters or store large local variables by value, or have large numbers of them. Doing so limits the depth of recursion you can have without crashing, and will also slow down your recursion because of cache thrashing and calling constructors and destructors.
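Here is a small sketch of a recursive routine that keeps its stack frames small by taking a large object by const reference. Grid, kCellCount and CountNonZeroFrom are hypothetical names, and the size quoted in the comment assumes a 4-byte int:

```cpp
#include <cstddef>

const std::size_t kCellCount = 1024;

// Hypothetical large object: roughly 4 KB if int is 4 bytes.
struct Grid{
   int cells[ kCellCount ];
};

// Counts the non-zero cells from inIndex to the end, one cell per level
// of recursion. Each level receives only a reference and an index. Had
// inGrid been passed by value, every level would hold its own copy of
// the whole array on the stack.
std::size_t CountNonZeroFrom( const Grid &inGrid, std::size_t inIndex )
{
   if ( inIndex >= kCellCount )
      return 0;

   const std::size_t here = ( inGrid.cells[ inIndex ] != 0 ) ? 1 : 0;
   return here + CountNonZeroFrom( inGrid, inIndex + 1 );
}
```

A loop would of course do this particular job better; the recursion is only there to show the effect of the parameter type on the depth you can reach.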

Calling the constructor impacts performance not just because your program has to run the constructor's code, but because again the cache will be affected. Calling lots of unneeded constructors and destructors will remove useful code from the cache, and you may cause VM paging as the executable code containing the constructors and destructors is loaded.

Inlining the constructors will help performance for quite small objects whose members are contained by value (as whole objects), and recursively so; that is, the members' members are also contained by value. If you do need to pass a parameter by value, consider whether to inline the constructor and destructor of the parameter's class.
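A sketch of the kind of class meant here, using a hypothetical Point class and Offset function. Member functions defined inside the class body are implicitly inline, although whether the compiler actually expands them in place is still its own decision:

```cpp
// Hypothetical small value class: two ints held by value, nothing on the heap.
class Point{
   public:
      // Defined inside the class body, so these are implicitly inline.
      Point( int inX, int inY ): mX( inX ), mY( inY ) {}
      Point( const Point &inOriginal ): mX( inOriginal.mX ), mY( inOriginal.mY ) {}
      ~Point() {}

      int X() const { return mX; }
      int Y() const { return mY; }

   private:
      int mX;
      int mY;
};

// Passing Point by value is reasonable here: the copy is just two ints.
Point Offset( Point inPoint, int inDX, int inDY )
{
   return Point( inPoint.X() + inDX, inPoint.Y() + inDY );
}
```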

I discuss inlining in more detail later on, but I think that inlining is commonly poorly used. Either it is not used at all in a project, or it is overused. It is important to inline judiciously.

It is instructive to take some C++ code that passes by value a lot and run it in a source code debugger. Use the "step into" function of the debugger to walk down into a subroutine call. Watch the debugger step into the constructors for each of the parameters and then back out again.

You won't see it in the source code, but at the end of the called function the destructors for each of those parameters will be called. Your debugger might allow you to step into the invisible destructor calls if you step up to the last line of the function and then use the "step into" command. You can also try compiling to assembly code and reading it; you will see the calls to the destructors.
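If you don't have a debugger handy, you can make the hidden calls visible by counting them. This is a minimal sketch; the Tracked class and the two functions are invented for the purpose:

```cpp
// Hypothetical class that counts how often it is copied and destroyed.
class Tracked{
   public:
      Tracked() {}
      Tracked( const Tracked & ) { ++sCopies; }
      ~Tracked() { ++sDestroyed; }

      static int sCopies;
      static int sDestroyed;
};

int Tracked::sCopies = 0;
int Tracked::sDestroyed = 0;

// By value: the argument is copy-constructed on entry, and the copy is
// destroyed once the call has finished.
void TakeByValue( Tracked ) {}

// By const reference: no copy is made and nothing is destroyed.
void TakeByConstRef( const Tracked & ) {}
```

Calling TakeByConstRef with an existing Tracked object leaves both counters alone, while each call to TakeByValue with that object adds one to each.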

Reference Parameters Preserve Polymorphism

Passing parameters by reference avoids the slicing problem.

That is, when you pass an object of a derived class by value to a function that is declared to take a parameter of its base class, the base class' copy constructor is called with a reference to the derived class object as its parameter.

This is legal, because that's just what the concept of inheritance says you can do - you can use a derived class' object wherever a base class is expected. So the recipient function will wind up with a base class object for its local use.

This might not cause problems if the base class has no virtual functions. In that case, all the functions called on the parameter by the recipient function would always be in the base class.

The problem comes when polymorphism is being used, that is, when the base class has virtual functions and one or more of them have been overridden by the derived class. When the base class object is constructed, the derived class' behaviour - its virtual functions - is "sliced off". So it's not just that you get an object whose data members only include those from the base class, but the behaviour will be that of the base class and not of the derived class whose object you attempted to pass in. Often, this may not be what you want.

It is conceivable that you could construct a base class that is not valid when it is constructed from a derived class. This is something to be avoided, but is possible to do. We already know that you cannot construct actual objects whose type is an abstract base class (ah, but I have succeeded in actually doing so, by a combination of weird code and a compiler bug - interestingly, the pure virtual function had a nil vtbl entry and would crash when called at runtime).

Another possible way to construct an invalid base class object would be for a derived class to modify a data member in the base class, whether directly or via a member function in the base class, in such a way that the value of the data member causes proper behaviour when used by the derived class but not by the base class. An example of how one can get an actual object of a base class with one of these invalid data members is to use the base class' copy constructor, as when passing by value.

A reference parameter still has the polymorphism, and so the actual object seen in the recipient function will be the derived class even though the parameter is a base class reference. So usually you should prefer reference parameters.
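The difference is easy to see in a small sketch. Shape, Circle and the two functions are hypothetical names chosen for the illustration:

```cpp
#include <string>

// Hypothetical classes used only for this illustration.
class Shape{
   public:
      virtual ~Shape() {}
      virtual std::string Name() const { return "Shape"; }
};

class Circle: public Shape{
   public:
      virtual std::string Name() const { return "Circle"; }
};

// By value: Shape's copy constructor builds a plain Shape from the
// argument, so the override in Circle is sliced off.
std::string NameByValue( Shape inShape )
{
   return inShape.Name();
}

// By const reference: the parameter refers to the caller's actual object,
// so the virtual call reaches Circle::Name.
std::string NameByConstRef( const Shape &inShape )
{
   return inShape.Name();
}
```

Given a Circle, NameByValue returns "Shape" and NameByConstRef returns "Circle".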

But to be safe, always design and implement your classes so that each layer of an inheritance hierarchy is correct as it stands and doesn't screw with the other layers. Never design a class hierarchy in which base class objects will be invalid if slicing should occur.

Perhaps one way of helping to prevent that problem would be to make the copy constructor of the base class private. Some designs involve classes that really are always meant to be used as pointers, and if you want to forbid members of these class hierarchies from ever being passed by value, then declare the base class' copy constructor private and then just don't write an implementation for it:

class Base{
   public:
      Base();

   private:
      Base( const Base &inOriginal );   // don't write implementation
};
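With a compiler that supports C++11 or later, which is newer than the technique shown above, the same intent can be stated directly by deleting the copy operations. A minimal sketch, with NoCopyBase as an illustrative name:

```cpp
#include <type_traits>

class NoCopyBase{
   public:
      NoCopyBase() {}
      virtual ~NoCopyBase() {}

      // Deleted rather than private and unimplemented (requires C++11).
      NoCopyBase( const NoCopyBase &inOriginal ) = delete;
      NoCopyBase &operator=( const NoCopyBase &inRhs ) = delete;
};

// Compile-time check that the class cannot be copied, and therefore
// cannot be passed by value from an existing object.
static_assert( !std::is_copy_constructible< NoCopyBase >::value,
               "NoCopyBase must not be copyable" );
```

One practical difference: with a private, unimplemented copy constructor, an accidental copy made inside the class itself is only caught at link time, whereas a deleted one is rejected by the compiler wherever the copy is attempted.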

One case where it is OK to pass a parameter by value is where the class of the parameter is a concrete class that is not meant to be a base class. One advantage I do see in Java is that such classes can be declared final. In C++, such classes do not have virtual member functions. Also the size of the class should be small - it should contain few data members, and the data members it does contain should be small themselves.
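Compilers that support C++11 or later accept a final specifier on a class, which gives a similar guarantee to Java's. A sketch of a class that fits the description above, with Celsius and ToFahrenheit as hypothetical names:

```cpp
// Hypothetical small concrete class. 'final' requires C++11 or later
// and makes any attempt to derive from Celsius a compile error.
class Celsius final{
   public:
      explicit Celsius( double inDegrees ): mDegrees( inDegrees ) {}
      double Degrees() const { return mDegrees; }

   private:
      double mDegrees;
};

// Small, non-polymorphic and not derivable, so passing by value is fine:
// there is nothing to slice and the copy is a single double.
double ToFahrenheit( Celsius inTemp )
{
   return inTemp.Degrees() * 9.0 / 5.0 + 32.0;
}
```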

When to Pass Non-Const References to Functions

A non-const reference is passed to the function when the function is expected (or encouraged) to modify the original object, and the original object is guaranteed to exist. Pass a non-const pointer if you expect the function to modify an original object which sometimes may not exist. You can also pass null pointers (const or non-const) to notify a function that some object has ceased to exist or may no longer be accessible, for example to tell a GUI widget that it has been detached from its parent pane.
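A short sketch of both cases. AppendNewline and DivideBy are hypothetical functions, and nullptr assumes C++11 or later:

```cpp
#include <string>

// Non-const reference: the caller's string must exist and will be modified.
void AppendNewline( std::string &ioText )
{
   ioText += '\n';
}

// Non-const pointer: the caller may pass a null pointer when it does not
// want the remainder reported. Assumes inDenominator is not zero.
int DivideBy( int inNumerator, int inDenominator, int *outRemainder )
{
   if ( outRemainder != nullptr )
      *outRemainder = inNumerator % inDenominator;

   return inNumerator / inDenominator;
}
```

A caller that wants the remainder passes the address of an int; one that doesn't passes a null pointer and the function quietly skips the write.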

Pass a const reference to indicate it is not meant to modify the original - but note that even const references do not guarantee that the original will not be modified. If for some reason you may not trust the called function, as might be the case when security is a concern or you're writing a proprietary library and want to call a user-supplied callback function without fear they'll screw with your internals, then pass the parameter by value.
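As a sketch of the defensive case, here is a library routine that hands a user-supplied callback its own copy. Settings, SettingsCallback, NotifyUser and MeddlingCallback are all invented for the example:

```cpp
// Hypothetical settings object owned by a library.
struct Settings{
   int volume;
};

// Type of a user-supplied callback. It receives its own copy.
typedef void (*SettingsCallback)( Settings inCopy );

// The library hands the callback a copy, so the callback cannot reach the
// library's own object through its parameter, with or without const_cast.
void NotifyUser( const Settings &inSettings, SettingsCallback inCallback )
{
   inCallback( inSettings );
}

// An ill-behaved callback: it only manages to change its private copy.
void MeddlingCallback( Settings inCopy )
{
   inCopy.volume = 11;
}
```

After NotifyUser returns, the library's Settings object still holds whatever volume it had before the call.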

Do you want to pass by const reference but detect any use of const_cast to modify the parameter? The following can be used in debug builds. The #define constants used are from the ZooLib cross-platform application framework:

// Base.h
class Param;

class Base{
   public:
      void BaseFunction();
      virtual void VirtualFunction( const Param &inParam );
};

// Derived.h
#include "Base.h"

class Derived: public Base{
   public:
      virtual void VirtualFunction( const Param &inParam );
};

// Base.cpp
#include "Base.h"
#include "Param.h"

void Base::BaseFunction()
{
	Param myParam;

#if ZCONFIG_Debug > 0
	Param tempParam( myParam );
#endif

	VirtualFunction( myParam );  // takes const, but could be cast away

#if ZCONFIG_Debug > 0
	ZAssert( tempParam == myParam );   // fails if the callee modified myParam
#endif

	return;
}

Now if the derived class screws with it, we'll get an assertion failure:

// Derived.cpp

void Derived::VirtualFunction( const Param &inParam )
{
   Param &nonConstRef = const_cast< Param& >( inParam );

   // will cause an assertion later
   nonConstRef.SetMemberValue( 1 + inParam.GetMemberValue() );

   return;
}

This won't guard against security problems, though, because the check is compiled only into debug builds. If security matters, pass by value instead.

Upon some further examination, I tried using const_cast inside a function. Although the function appeared to have mutated its parameter, after the function returned, the value whose reference had been passed was still the original value. Don't conclude from this that const_cast is harmless: the output is just what my compiler happened to do, most likely because it substituted the constant 1 for a at compile time:

#include <iostream>

void Bar( const int &arg );

int main( int argc, char **argv )
{
   using namespace std;

   const int a = 1;

   Bar( a );

   cout << "after Bar a = " << a << "\n";   // printed 1 for me; not guaranteed

}

void Bar( const int &arg )
{
   using namespace std;
   int &argRef = const_cast< int & >( arg );

   cout << "before assignment arg = " << arg << "\n";   // prints 1

   argRef = 2;

   cout << "after assignment arg = " << arg << "\n";   // prints 2
   cout << "argRef = " << argRef << "\n";   // prints 2

   return;
}

The C++ ISO Standard specifies that the behaviour is undefined if you use const_cast to change an object that was declared const. Imagine, for example, that the value is burned into ROM, lives on a memory page that the CPU's memory management unit has made read-only, sits in a read-only status register on a hardware card, or is taken from the settings of a DIP switch.
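The undefined behaviour comes from modifying an object that was itself declared const, as a is in the program above. If the original is an ordinary non-const variable that merely happens to be reached through a const reference, the write is well defined and the caller will see it. A minimal sketch, shown to pin down the rule rather than to recommend the practice (ForceIncrement is a hypothetical name):

```cpp
// Casts away const and writes through the result. This is well defined
// only when the object referred to was not itself declared const.
void ForceIncrement( const int &inValue )
{
   int &mutableRef = const_cast< int & >( inValue );
   ++mutableRef;
}
```

Called with a plain int variable holding 1, this leaves the variable holding 2. Called with a variable declared const int, the behaviour is undefined, which is why const in a function signature should be read as a promise and not as protection.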

The only permissible use of const_cast is to pass a const reference or pointer to legacy code that you are sure does not mutate the parameter but does not declare it const, either for historical reasons (it was written before the const keyword was invented) or because its author just did not bother to declare the parameter const even though it was not meant to change.
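One way to keep such a cast contained is to wrap the legacy call in a const-correct function. In this sketch LegacyLength stands in for the old routine and SafeLength is the wrapper; both names are hypothetical:

```cpp
#include <cstddef>

// Stand-in for a legacy routine (hypothetical): it only reads the string,
// but its parameter was never declared const.
std::size_t LegacyLength( char *inText )
{
   std::size_t length = 0;
   while ( inText[ length ] != '\0' )
      ++length;
   return length;
}

// Const-correct wrapper. The cast is confined to one place and is safe only
// because LegacyLength is known not to write through its parameter.
std::size_t SafeLength( const char *inText )
{
   return LegacyLength( const_cast< char * >( inText ) );
}
```

The rest of the program calls SafeLength and never sees the cast, so there is a single spot to revisit if the legacy routine is ever replaced or found to write after all.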


Copyright © 2000, 2001, 2002, 2005 Michael D. Crawford. All Rights Reserved.
