Introduction
This article presents a method to eliminate redundant memory copying in C++ operators that return a new object instance.
Background
One thing that has always annoyed me about the operators in C++ classes is the propensity for generating redundant memory copying. This problem can become severe if the instance contains a lot of data, such as a big matrix.
Let's say you have a MyType operator+ ( const MyType& other ) in your class. The operator is required to return an instance of MyType, which is easy: you just declare one in the function body, initialize it, and return it.
But when you return that instance, it goes out of scope because it's local, so unless the compiler manages to elide the copy (it is allowed to, but not obliged to), C++ has to copy the data you already initialized by calling the copy constructor. Your object has to be constructed again!
This is the redundant memory copy I refer to.
Finding a Solution
To circumvent this copying, your operator can return an instance as it's being constructed, like so:
return MyType( SomeParameter );
That's sweet. Now you've got rid of the copy constructor call, and you're on your merry way, unless your class needs lots of data to initialize itself.
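To observe the difference between the two return styles described above, a copy-counting stand-in can be used. This is only a sketch: `Tracked`, `makeNamed` and `makeDirect` are hypothetical names, not part of the article's code, and how many copies you actually see depends on your compiler and its settings.

```cpp
#include <cassert>

// Hypothetical stand-in for MyType that counts copy-constructor calls.
struct Tracked {
    static int copies;   // number of copy constructions so far
    int value;
    explicit Tracked( int v ) : value( v ) {}
    Tracked( const Tracked& other ) : value( other.value ) { ++copies; }
};
int Tracked::copies = 0;

// Style 1: a named local, returned by value.
// The compiler may elide the copy here, but is not obliged to.
Tracked makeNamed( int v ) {
    Tracked result( v );
    return result;
}

// Style 2: construct the instance in the return statement itself.
Tracked makeDirect( int v ) {
    return Tracked( v );
}
```

Calling each function and then inspecting `Tracked::copies` shows what your particular compiler does with each style.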
Let's say your class is a matrix class (simplified here):
#include <string.h>   // memset, memcpy

class Mat8x8 {
    double M[ 8*8 ];
public:
    Mat8x8() { memset( M, 0, sizeof(M) ); }
    Mat8x8( const Mat8x8& other ) { memcpy( M, other.M, sizeof(M) ); }
    double& operator() ( int row, int col ) { return M[ row*8 + col ]; }
    Mat8x8 operator + ( const Mat8x8& other );
};
Your operator+ has to add each element of the right-hand instance to the corresponding element of the left-hand instance, and return an initialized instance with the result. You think about it for a bit, and figure out you can calculate the result in a temp array and use a constructor that takes your array as input.
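// The array constructor must also be declared in the class: Mat8x8( const double* data );
Mat8x8::Mat8x8( const double* data ) { memcpy( M, data, 64*sizeof(double) ); }

Mat8x8 Mat8x8::operator + ( const Mat8x8& other ) {
    double Sum[64];
    for( int i=0; i<64; ++i )
        Sum[i] = M[i] + other.M[i];
    return Mat8x8( Sum );   // Sum is copied into the new instance
}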
Well, at least you don't have to construct your instance twice, but you're not past the copying of initialized data. You build your result in a memory block that has to be copied to the memory C++ decides to use for the object instance. What you really want is a way to build your result straight into the memory the compiler decides to use for your instance. But how can you do that? You can't compute the result because you don't know where to put it, and you can't get somewhere to put it because you don't have anything to pass to the c'tor.
Some people try to get around this by defining a placement 'new' for the class, so they can get the c'tor to build in the memory they want, but that only helps for instances created with a new-expression, not for the instance an operator returns by value. There has to be some other way. Which brings us to 'My way'.
A good old function pointer!
The following code listing embodies the method, and should speak for itself. A special constructor takes a function pointer and pointers to the left hand and right hand operands. The callback function does the actual work of the operator, once C++ has decided where to put your object instance.
class Mat8x8 {
    // Callback that fills in Mat from the two operands.
    typedef void (*PFnInitMat)( Mat8x8& Mat, void* pLHS, void* pRHS );
    // Special (private) constructor: runs Init on the instance under construction.
    Mat8x8( PFnInitMat Init, void* pLHS, void* pRHS );
public:
    union {                 // two views of the same 64 doubles
        double M[8][8];
        double A[ 64 ];
    };
    Mat8x8() { memset( A, 0, 64*sizeof(double) ); }
    Mat8x8( const Mat8x8& other ) { memcpy( A, other.A, 64*sizeof(double) ); }
    Mat8x8 operator + ( const Mat8x8& rhs );
    Mat8x8 operator / ( double rhs );
};
typedef Mat8x8* PMat8x8;
typedef double* pdouble;
Mat8x8::Mat8x8( PFnInitMat Init, void* pLHS, void* pRHS )
{
    // *this already lives in the memory the compiler chose for the result.
    Init( *this, pLHS, pRHS );
}

// Matrix + matrix: both operands point to Mat8x8 instances.
void AddMat88( Mat8x8& Mat, void* pLHS, void* pRHS )
{
    Mat8x8& lhs = *PMat8x8( pLHS );
    Mat8x8& rhs = *PMat8x8( pRHS );
    for( int r=0; r < 8; ++r )
        for( int c=0; c < 8; ++c )
            Mat.M[r][c] = lhs.M[r][c] + rhs.M[r][c];
}

// Matrix / scalar: pLHS points to a Mat8x8, pRHS points to a double.
void DivMat88scalar( Mat8x8& Mat, void* pLHS, void* pRHS )
{
    for( int i=0; i < 64; ++i )
        Mat.A[i] = PMat8x8( pLHS )->A[i] / *pdouble( pRHS );
}
Mat8x8 Mat8x8::operator + ( const Mat8x8& rhs )
{
    // The cast drops the const from rhs; AddMat88 only reads from it.
    return Mat8x8( AddMat88, this, (void*)&rhs );
}

Mat8x8 Mat8x8::operator / ( double rhs )
{
    return Mat8x8( DivMat88scalar, this, &rhs );
}
Eureka!
No more moving of initialized data. All you pass are a few pointers. ;)
In your operator, you just call the special constructor, and pass it a function pointer and pointers to the left hand and right hand instances. C++ finds a block for your instance and calls your initializer, which now can build the result straight into the memory the compiler chose.
All the operators can use the same technique. You just write suitable operation functions that you can pass to the constructor.
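As an illustration of reusing the technique for other operators, here is a minimal sketch with binary and unary minus. `Mat2x2`, `Sub` and `Neg` are hypothetical names chosen for this example; the 2x2 size only keeps the result easy to check by hand, and the callbacks are made private static members here, which is a variation on the free functions used in the listing above.

```cpp
#include <cassert>

// Hypothetical 2x2 stand-in for Mat8x8.
class Mat2x2 {
    typedef void (*PFnInit)( Mat2x2& Mat, const void* pLHS, const void* pRHS );
    // Special constructor: the callback fills in *this directly.
    Mat2x2( PFnInit Init, const void* pLHS, const void* pRHS ) { Init( *this, pLHS, pRHS ); }
    static void Sub( Mat2x2& Mat, const void* pLHS, const void* pRHS );
    static void Neg( Mat2x2& Mat, const void* pLHS, const void* pRHS );
public:
    double A[4];
    Mat2x2() { for( int i = 0; i < 4; ++i ) A[i] = 0.0; }
    Mat2x2 operator - ( const Mat2x2& rhs ) const { return Mat2x2( Sub, this, &rhs ); }
    Mat2x2 operator - () const { return Mat2x2( Neg, this, 0 ); }
};

// Element-wise difference of two matrices.
void Mat2x2::Sub( Mat2x2& Mat, const void* pLHS, const void* pRHS )
{
    const Mat2x2& lhs = *static_cast<const Mat2x2*>( pLHS );
    const Mat2x2& rhs = *static_cast<const Mat2x2*>( pRHS );
    for( int i = 0; i < 4; ++i )
        Mat.A[i] = lhs.A[i] - rhs.A[i];
}

// Element-wise negation; the right-hand pointer is unused.
void Mat2x2::Neg( Mat2x2& Mat, const void* pLHS, const void* )
{
    const Mat2x2& lhs = *static_cast<const Mat2x2*>( pLHS );
    for( int i = 0; i < 4; ++i )
        Mat.A[i] = -lhs.A[i];
}
```

With `a` and `b` default-constructed and then filled in, `a - b` and `-a` both go through the private constructor, so each new operator only needs one small operation function.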
Afterthoughts
You may think this is a lot of work to get rid of a little data transfer, but what if you're dealing with big matrices of tens, or hundreds, of kilobytes? Or with massive numbers of instances? Little things add up. I hope you will find this technique useful, and that your programs will get faster for it.
I will certainly keep using 'My way'.
Using the Code
To make this scheme typesafe, you can change the callback function to take typed pointers or references instead of void pointers, and use several different callback function types to handle different operand types. I just used void* for this example - they make programming flexible (under great responsibility).
On the other hand, you could have a common callback function signature that you use for many classes, and cast them as you want. The choice is yours. The special constructor should definitely not be public, though.
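A typesafe variant could look roughly like the sketch below. It assumes one callback type per operand combination and one private constructor overload for each; `SafeMat`, `PFnMatMat`, `PFnMatScalar`, `Add` and `Div` are hypothetical names, and the matrix is shrunk to four elements for brevity.

```cpp
#include <cassert>

// Hypothetical typed version: no void pointers, no casts.
class SafeMat {
    typedef void (*PFnMatMat)( SafeMat& out, const SafeMat& lhs, const SafeMat& rhs );
    typedef void (*PFnMatScalar)( SafeMat& out, const SafeMat& lhs, double rhs );

    // One private constructor per callback type; overload resolution picks
    // the right one from the type of the function that is passed in.
    SafeMat( PFnMatMat Init, const SafeMat& lhs, const SafeMat& rhs ) { Init( *this, lhs, rhs ); }
    SafeMat( PFnMatScalar Init, const SafeMat& lhs, double rhs ) { Init( *this, lhs, rhs ); }

    static void Add( SafeMat& out, const SafeMat& lhs, const SafeMat& rhs )
    {
        for( int i = 0; i < 4; ++i )
            out.A[i] = lhs.A[i] + rhs.A[i];
    }
    static void Div( SafeMat& out, const SafeMat& lhs, double rhs )
    {
        for( int i = 0; i < 4; ++i )
            out.A[i] = lhs.A[i] / rhs;
    }
public:
    double A[4];
    SafeMat() { for( int i = 0; i < 4; ++i ) A[i] = 0.0; }
    SafeMat operator + ( const SafeMat& rhs ) const { return SafeMat( Add, *this, rhs ); }
    SafeMat operator / ( double rhs ) const { return SafeMat( Div, *this, rhs ); }
};
```

The trade-off is one typedef and one constructor per operand combination, in exchange for the compiler rejecting a callback that is paired with the wrong operands.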
Points of Interest
It's a well-known, and somewhat sad, fact that virtual calls made from a C++ constructor do not reach the overrides in derived classes, because the derived part of the object has not been constructed yet. You could, however, use this same technique to solve some cases where your constructor cannot know in advance how to deal with some initializations.
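One way this might look in practice is sketched below. `Shape`, `Triangle`, `PFnInit` and `InitTriangle` are hypothetical names invented for the illustration. The base constructor receives the initializer as a function pointer, so the derived class can supply its own setup even though its virtual overrides are not yet reachable.

```cpp
#include <cassert>
#include <string>

class Shape {
public:
    typedef void (*PFnInit)( Shape& s );
    std::string nameSeenInCtor;   // what the virtual call returned during construction
    int sides;

    explicit Shape( PFnInit Init ) : sides( 0 )
    {
        nameSeenInCtor = Name();      // resolves to Shape::Name here, not the override
        if( Init ) Init( *this );     // derived-specific setup supplied by the caller
    }
    virtual ~Shape() {}
    virtual std::string Name() const { return "shape"; }
};

class Triangle : public Shape {
    // Must only touch the Shape part: the Triangle part does not exist yet
    // when this runs.
    static void InitTriangle( Shape& s ) { s.sides = 3; }
public:
    Triangle() : Shape( InitTriangle ) {}
    virtual std::string Name() const { return "triangle"; }
};
```

After constructing a `Triangle`, `sides` has been set by the callback, while `nameSeenInCtor` still holds the base-class name.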
History
- Original publication, April 2010
Finally retired after a multi-faceted career as a software developer, systems architect, and IT consultant. I've moved away from Europe, and now I live a quiet life in Thailand.
I've developed code for all levels of firm/software on IBM PC compatible machines. BIOS code, device drivers, hardware test routines, application framework libraries, relational databases, windows multimedia applications, and more.
Now I can pursue my private pet projects, some of which will likely show up here on The Code Project.