I have a class called HeavyData that holds a large amount of data. The class follows the rule of three: it defines the copy constructor, the copy-assignment operator, and the destructor, so that the member variable someBigAmountOfData is copied correctly when an object is copied and is freed without memory leaks when an object is destroyed.

The DataManager class has two member variables of type HeavyData (see below).

C++
class HeavyData
{
public:
    HeavyData();

    HeavyData(const HeavyData& that);
    HeavyData& operator=(const HeavyData& that);
    ~HeavyData();

private:
    void* someBigAmountOfData; //maybe a few hundred bytes (on the heap, of course)
    size_t sizeOfData;
};


class DataManager
{
public:
    DataManager();

    //method 1
    DataManager(HeavyData one, HeavyData two):
        one(one),
        two(two)
    {
    }

    //method 2 (which I think is more efficient than method 1)
    DataManager(const HeavyData& one, const HeavyData& two):
        one(one),
        two(two)
    {
    }

private:
    HeavyData one;
    HeavyData two;
};


THE PROBLEM :

The DataManager class has two constructors as follows:

1. DataManager(HeavyData one, HeavyData two); //method 1
2. DataManager(const HeavyData& one, const HeavyData& two); //method 2

The problem is choosing between the two constructors above. Which one do you think is more efficient, and why?

I think that the 2nd constructor (method 2) is more efficient.

In fact, neither of the two methods is very efficient. In both cases the data held by your HeavyData objects must be copied, which means allocating memory and copying the bytes. (By the way, a few hundred bytes are not considered a huge amount of data these days.) The HeavyData object itself, a pointer plus a size, is rather small in comparison. And yes, the second constructor is somewhat faster: passing by value invokes the copy constructor once more for each argument before the members are initialized, and passing by const reference avoids that extra copy.
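
To make the difference concrete, here is a minimal sketch that counts deep copies for both constructor styles when two existing (named) objects are passed in. The `copies` counter, the `unsigned char*` buffer, the split into `ManagerByValue` and `ManagerByRef`, and the helpers `countCopies` and `checkCopyCounts` are assumptions added for illustration; they are not part of the code in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for the question's HeavyData; the static counter is illustrative only.
class HeavyData
{
public:
    static int copies; // number of deep copies made so far

    explicit HeavyData(std::size_t size = 256):
        someBigAmountOfData(new unsigned char[size]()),
        sizeOfData(size)
    {
    }

    HeavyData(const HeavyData& that):
        someBigAmountOfData(new unsigned char[that.sizeOfData]),
        sizeOfData(that.sizeOfData)
    {
        std::memcpy(someBigAmountOfData, that.someBigAmountOfData, sizeOfData);
        ++copies;
    }

    HeavyData& operator=(const HeavyData& that)
    {
        if (this != &that)
        {
            // Allocate first so *this stays intact if allocation throws.
            unsigned char* fresh = new unsigned char[that.sizeOfData];
            std::memcpy(fresh, that.someBigAmountOfData, that.sizeOfData);
            delete[] someBigAmountOfData;
            someBigAmountOfData = fresh;
            sizeOfData = that.sizeOfData;
            ++copies;
        }
        return *this;
    }

    ~HeavyData() { delete[] someBigAmountOfData; }

private:
    unsigned char* someBigAmountOfData;
    std::size_t sizeOfData;
};

int HeavyData::copies = 0;

// Method 1: parameters taken by value.
class ManagerByValue
{
public:
    ManagerByValue(HeavyData one, HeavyData two): one(one), two(two) {}
private:
    HeavyData one;
    HeavyData two;
};

// Method 2: parameters taken by const reference.
class ManagerByRef
{
public:
    ManagerByRef(const HeavyData& one, const HeavyData& two): one(one), two(two) {}
private:
    HeavyData one;
    HeavyData two;
};

// Returns how many deep copies it takes to build a Manager from two named objects.
template <class Manager>
int countCopies()
{
    HeavyData a, b;
    HeavyData::copies = 0;
    Manager m(a, b);
    return HeavyData::copies;
}

// Usage: by value copies each argument twice, by const reference once.
inline void checkCopyCounts()
{
    assert(countCopies<ManagerByValue>() == 4);
    assert(countCopies<ManagerByRef>() == 2);
}
```

Either way, one deep copy per member remains; method 2 only removes the additional copy into the parameter.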

BUT: There is one exception to what I said in the first sentence. If HeavyData implements a mechanism called "lazy copy" (also known as copy-on-write), most of the copy work for the big data chunks is deferred until the moment the data is actually changed. Many string classes have used that trick to avoid unnecessary copy operations. It is, however, not visible from your code whether you intend to use this technique for the HeavyData class. It would require reference counting in your class.
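
As a rough sketch of what such a lazy copy could look like, the class below lets `std::shared_ptr` do the reference counting and duplicates the buffer only on the first write to a shared object. It assumes C++11; the name `LazyHeavyData` and the members `read`, `write`, `sharesBufferWith`, and `detach` are hypothetical, and the `use_count()` check is not safe if several threads touch the same objects.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write variant of HeavyData (illustration only).
class LazyHeavyData
{
public:
    explicit LazyHeavyData(std::size_t size = 256):
        buffer(std::make_shared<std::vector<unsigned char> >(size))
    {
    }

    // The compiler-generated copy operations copy only the shared_ptr,
    // so copying an object just increments the reference count.

    unsigned char read(std::size_t index) const
    {
        assert(index < buffer->size());
        return (*buffer)[index];
    }

    void write(std::size_t index, unsigned char value)
    {
        assert(index < buffer->size());
        detach(); // make the buffer private before modifying it
        (*buffer)[index] = value;
    }

    // True if both objects still refer to the same underlying buffer.
    bool sharesBufferWith(const LazyHeavyData& other) const
    {
        return buffer == other.buffer;
    }

private:
    // Performs the deferred deep copy if the buffer is currently shared.
    void detach()
    {
        if (buffer.use_count() > 1)
            buffer = std::make_shared<std::vector<unsigned char> >(*buffer);
    }

    std::shared_ptr<std::vector<unsigned char> > buffer;
};
```

With this design, `LazyHeavyData b = a;` is cheap, and the expensive copy happens only if `a` or `b` is later written to.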

Back to the main issue: Why is your overall class layout not very efficient? Because you intend to hold the big-data objects not only inside your DataManager class, but also outside of it. Normally, one would try to hold all instances of HeavyData inside the DataManager class and give access to them via member functions. That makes more sense than constructing a HeavyData object outside the DataManager class (including allocating and filling the big data chunks) and then transferring that object to the DataManager via a relatively expensive copy operation.
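
One way to read that suggestion is sketched below: DataManager builds both HeavyData members in place and hands out references through member functions, so no HeavyData is ever copied. The size-taking constructors, the accessors `first()` and `second()`, and the vector-backed stand-in for HeavyData are assumptions for illustration. Copying is disabled with C++11 `= delete` purely to show that the design does not need it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for HeavyData; copying is disabled on purpose.
class HeavyData
{
public:
    explicit HeavyData(std::size_t size): bytes(size) {}
    HeavyData(const HeavyData&) = delete;
    HeavyData& operator=(const HeavyData&) = delete;

    std::size_t size() const { return bytes.size(); }

    unsigned char& at(std::size_t index)
    {
        assert(index < bytes.size());
        return bytes[index];
    }

    const unsigned char& at(std::size_t index) const
    {
        assert(index < bytes.size());
        return bytes[index];
    }

private:
    std::vector<unsigned char> bytes;
};

class DataManager
{
public:
    // The manager constructs both objects itself from the requested sizes.
    DataManager(std::size_t sizeOne, std::size_t sizeTwo):
        one(sizeOne),
        two(sizeTwo)
    {
    }

    // Callers fill and read the data through references; nothing is copied.
    HeavyData& first() { return one; }
    const HeavyData& first() const { return one; }
    HeavyData& second() { return two; }
    const HeavyData& second() const { return two; }

private:
    HeavyData one;
    HeavyData two;
};
```

A caller would write something like `DataManager manager(100, 200); manager.first().at(0) = 7;` and never hold a separate HeavyData of its own.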

Another viable approach would be to pass pointers to the HeavyData objects around and hold only pointers to such objects in the DataManager class. That, however, makes it necessary to think about a clear ownership philosophy: Who is the owner of a HeavyData object, the creator or your DataManager? (Smart pointers might help you in dealing with that problem. I would not recommend going down that path if you are a novice in C++ programming.) Hence, in most cases it is easier to let the DataManager construct and destroy all the HeavyData objects.
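
For readers who do want to try the pointer route, the following sketch uses `std::unique_ptr` (C++11) so that the constructor signature itself answers the ownership question: the creator hands the objects over and the DataManager destroys them. The accessors and the vector-backed stand-in for HeavyData are hypothetical; only the pointers are moved, the data is never copied.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-in for HeavyData.
class HeavyData
{
public:
    explicit HeavyData(std::size_t size): bytes(size) {}
    std::size_t size() const { return bytes.size(); }
private:
    std::vector<unsigned char> bytes;
};

class DataManager
{
public:
    // Taking unique_ptr by value states that ownership passes to the manager.
    DataManager(std::unique_ptr<HeavyData> one, std::unique_ptr<HeavyData> two):
        one(std::move(one)),
        two(std::move(two))
    {
        // this-> selects the members; the parameters are empty after the move.
        assert(this->one && this->two);
    }

    const HeavyData& first() const { return *one; }
    const HeavyData& second() const { return *two; }

private:
    // Destroyed automatically together with the DataManager.
    std::unique_ptr<HeavyData> one;
    std::unique_ptr<HeavyData> two;
};
```

The caller creates the objects, for example with `std::unique_ptr<HeavyData> a(new HeavyData(100));`, and passes them with `std::move(a)`; afterwards `a` is empty and the manager is the sole owner.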
 
For function arguments of any non-POD type you should always prefer passing them by const reference (i.e., option 2). Passing them by value creates copies of the arguments even before DataManager member initialization begins. These copies will then (possibly*) be copied again.

*: In the given example, the compiler may be able to optimize away the duplicate copy, but there is no guarantee for that, much less if you later change the code and add actual instructions inside the constructor body that reference the function arguments.

There may be no difference (now), but there is definitely no downside to option 2, so that is clearly the better option.

I do agree, though, with nv3's advice to rethink your strategy of copying heavy data.

Or maybe the cost of the avoidable copying is not so bad after all, and premature optimization in the sense you ask about is not even necessary. It may be worth checking first whether you actually need to optimize that code.
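
A quick way to check is to time the copy before redesigning anything. The sketch below is only a rough measurement under stated assumptions: HeavyData is replaced by a plain byte vector, the function name `averageCopyNanoseconds` is made up, and a micro-benchmark like this can be distorted by the optimizer and by allocator behaviour, so treat the number as an order of magnitude at best.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Stand-in for HeavyData: copying a byte vector allocates and copies the bytes.
typedef std::vector<unsigned char> HeavyData;

// Returns the average wall-clock time of one copy, in nanoseconds.
double averageCopyNanoseconds(const HeavyData& source, int repetitions)
{
    assert(repetitions > 0);
    typedef std::chrono::steady_clock Clock;

    // Writing to a volatile makes it harder for the optimizer to drop the copies.
    volatile unsigned char sink = 0;

    const Clock::time_point start = Clock::now();
    for (int i = 0; i < repetitions; ++i)
    {
        HeavyData copy(source); // the operation being measured
        sink = copy.empty() ? 0 : copy[0];
    }
    const Clock::time_point stop = Clock::now();
    (void)sink;

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / repetitions;
}
```

Calling it with a few hundred bytes, for example `averageCopyNanoseconds(HeavyData(300, 1), 100000)`, gives a figure to compare against the rest of the program's work before deciding whether the copies matter.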
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


