C++/CLI in Action - Instantiating CLI classes

Nish Nishant

Feb 26, 2007

CPOL

This is an excerpt from Chapter 1 that covers how CLI classes are instantiated, and discusses constructors and assignment operators.

Title	C++/CLI in Action
Author	Nishant Sivakumar
Publisher	Manning
Published	March 2007
ISBN-10	1-932394-81-8
ISBN-13	978-1932394818
Price	USD 44.99

This is a chapter excerpt from C++/CLI in Action authored by Nishant Sivakumar and published by Manning Publications. The content has been reformatted for CodeProject and may differ in layout from the printed book and the e-book.

1.5 Instantiating CLI classes

In this section, you'll see how CLI classes are instantiated using the gcnew operator. You'll also learn how constructors, copy constructors, and assignment operators work with managed types. Although the basic concepts remain the same, the nature of the CLI imposes some behavioral differences in the way constructors and assignment operators work; when you start writing managed classes and libraries, it's important that you understand those differences. Don't worry about it, though. Once you've seen how managed objects work with constructors and assignment operators, the differences between instantiating managed and native objects will become clear.

1.5.1 The gcnew operator

The gcnew operator is used to instantiate CLI objects. It returns a handle to the newly created object on the CLR heap. Although it's similar to the new operator, there are some important differences: gcnew has neither an array form nor a placement form, and it can't be overloaded either globally or specifically to a class. A placement form wouldn't make a lot of sense for a CLI type, when you consider that the memory is allocated by the Garbage Collector. It's for the same reason you aren't permitted to overload the gcnew operator. There is no array form for gcnew because CLI arrays use an entirely different syntax from native arrays, which we'll cover in detail in the next chapter. If the CLR can't allocate enough memory for creating the object, a System::OutOfMemoryException is thrown, although chances are slim that you'll ever run into that situation. (If you do get an OutOfMemoryException, and your system isn't running low on virtual memory, it's likely due to badly written code such as an infinite loop that keeps creating objects that are erroneously kept alive.) The following code listing shows a typical usage of the gcnew keyword to instantiate a managed object (in this case, the Student object):

ref class Student
{
    ...
};

...

Student^ student = gcnew Student();
student->SelectSubject("Math", 97);

The gcnew operator is compiled into the newobj MSIL instruction by the C++/CLI compiler. The newobj MSIL instruction creates a new CLI object—either a ref object on the CLR heap or a value object on the stack—although the C++/CLI compiler uses a different mechanism to handle the usage of the gcnew operator to create value type objects (which I'll describe later in this section). Because gcnew in C++ translates to newobj in the MSIL, the behavior of gcnew is pretty much dependent on, and therefore similar to, that of the newobj MSIL instruction. In fact, newobj throws System::OutOfMemoryException when it can't find enough memory to allocate the requested object. Once the object has been allocated on the CLR heap, the constructor is called on this object with zero or more arguments (depending on the constructor overload that was used). On successful completion of the call to the constructor, gcnew returns a handle to the instantiated object. It's important to note that if the constructor call doesn't successfully complete, as would be the case if an exception was raised inside the constructor, gcnew won't return a handle. This can be easily verified with the following code snippet:

ref class Student
{
public:
    Student()
    {
        throw gcnew Exception("hello world");
    }
};

//...

Student^ student = nullptr; //initialize the handle to nullptr

try
{
    student = gcnew Student(); //attempt to create object
}
catch(Exception^)
{
}

if(student == nullptr) //check to see if student is still nullptr
    Console::WriteLine("reference not allocated to handle");

Not surprisingly, student is still nullptr when execution reaches the if statement. Because the constructor didn't run to completion, the CLR concludes that the object hasn't been fully initialized, and it doesn't push the handle reference on the stack (as it would if the constructor had completed successfully).
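If you come from standard C++, this behavior should feel familiar. The following is a minimal sketch in native C++ (not C++/CLI) of the same experiment, using the new operator instead of gcnew. NativeStudent and PointerStaysNullAfterFailedConstruction are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>   // for the assert shown in the usage note below
#include <stdexcept>

// NativeStudent is a hypothetical native counterpart of the Student class.
class NativeStudent
{
public:
    NativeStudent()
    {
        throw std::runtime_error("hello world"); // construction never completes
    }
};

// Hypothetical helper: returns true if the pointer is still null
// after the failed construction attempt.
bool PointerStaysNullAfterFailedConstruction()
{
    NativeStudent* student = nullptr; // initialize the pointer to nullptr

    try
    {
        // The constructor throws, so the new-expression never yields a
        // pointer and the assignment to student never happens.
        student = new NativeStudent();
    }
    catch (const std::runtime_error&)
    {
    }

    return student == nullptr;
}
```

Calling assert(PointerStaysNullAfterFailedConstruction()); should pass: in both worlds, a constructor that throws leaves you without a usable object.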

NOTE C++/CLI introduces the concept of a universal null literal called nullptr. This lets you use the same literal (nullptr) to represent a null pointer and a null handle value. The nullptr implicitly converts to a pointer or handle type; for the pointer, it evaluates to 0, as dictated by standard C++; for the handle, it evaluates to a null reference. You can use the nullptr in relational, equality, and assignment expressions with both pointers and handles.

As I mentioned earlier, using gcnew to instantiate a value type object generates MSIL that is different from what is generated when you instantiate a ref type. For example, consider the following code, which uses gcnew to instantiate a value type:

value class Marks
{
public:
    int Math;
    int Physics;
    int Chemistry;
};

//...

Marks^ marks = gcnew Marks();

For this code, the C++/CLI compiler uses the initobj MSIL instruction to create a Marks object on the stack. This object is then boxed to a Marks^ object. We'll discuss boxing and unboxing in the next section; for now, note that unless it's imperative to the context of your code to gcnew a value type object, doing so is inefficient. A stack object has to be created, and this must be boxed to a reference object. Not only do you end up creating two objects (one on the managed stack, the other on the managed heap), but you also incur the cost of boxing. The more efficient way to create an object of type Marks (or any value type) is to declare it on the stack, as follows:

Marks marks;

You've seen how calling gcnew calls the constructor on the instance of the type being created. In the coming section, we'll take a more involved look at how constructors work with CLI types.

1.5.2 Constructors

If you have a ref class, and you haven't written a default constructor, the compiler generates one for you. In MSIL, the constructor is a specially named instance method called .ctor. The default constructor that is generated for you calls the constructor of the immediate base class for the current class. If you haven't specified a base class, it calls the System::Object constructor, because every ref object implicitly derives from System::Object. For example, consider the following two classes, neither of which has a user-defined constructor:

ref class StudentBase
{
};

ref class Student: StudentBase
{
};

Neither Student nor StudentBase has a user-provided default constructor, but the compiler generates constructors for them. You can use a tool such as ildasm.exe (the IL Disassembler that comes with the .NET Framework) to examine the generated MSIL. If you do that, you'll observe that the generated constructor for Student calls the constructor for the StudentBase object:

call instance void StudentBase::.ctor()

The generated constructor for StudentBase calls the System::Object constructor:

call instance void [mscorlib]System.Object::.ctor()

Just as with standard C++, if you have a constructor—either a default constructor or one that takes one or more arguments—the compiler won't generate a default constructor for you. In addition to instance constructors, ref classes also support static constructors (not available in standard C++). A static constructor, if present, initializes the static members of a class. Static constructors can't have parameters, must also be private, and are automatically called by the CLR. In MSIL, static constructors are represented by a specially named static method called .cctor. One possible reason both special methods have a . in their names is that this avoids name clashes, because none of the CLI languages allow a . in a function name. If you have at least one static field in your class, the compiler generates a default static constructor for you if you don't include one on your own. When you have a simple class, such as the following, the generated MSIL will have a static constructor even though you haven't specified one:

ref class StudentBase
{
    static int number;
};

Due to the compiler-generated constructors and the implicit derivation from System::Object, the generated class looks more like this:

ref class StudentBase : System::Object
{
    static int number;
    StudentBase() : System::Object()
    {
    }
    static StudentBase()
    {
    }
};

A value type can't declare a default constructor because the CLR can't guarantee that any default constructors on value types will be called appropriately, although members are 0-initialized automatically by the CLR. In any case, a value type should be a simple type that exhibits value semantics, and it shouldn't need the complexity of a default constructor, or even a destructor, for that matter. Note that in addition to not allowing default constructors, value types can't have user-defined destructors, copy constructors, or copy-assignment operators.

Before you end up concluding that value types are useless, you need to think of value types as the POD equivalents in the .NET world. Use value types just as you'd use primitive types, such as ints and chars, and you should be OK. When you need simple types, without the complexities of virtual functions, constructors and operators, value types are the more efficient option, because they're allocated on the stack. Stack access will be faster than accessing an object from the garbage-collected CLR heap. If you're wondering why this is so, the stack implementation is far simpler when compared to the CLR heap. When you consider that the CLR heap also intrinsically supports a complex garbage-collection algorithm, it becomes obvious that the stack object is more efficient.

It must be a tad confusing when I mention how value types behave differently from reference types in certain situations. But as a developer, you should be able to distinguish the conceptual differences between value types and reference types, especially when you design complex class hierarchies. As we progress through this book and see more examples, you should feel more comfortable with these differences.

Because we've already talked about constructors, we'll discuss copy constructors next.

1.5.3 Copy constructors

A copy constructor is one that instantiates an object by creating a copy of another object. The C++ compiler generates a copy constructor for your native classes, even if you haven't explicitly done so. This isn't the case for managed classes. Consider the following bit of code, which attempts to copy-construct a ref object:

ref class Student
{
};

int main(array<System::String^>^ args)
{
    Student^ s1 = gcnew Student();
    Student^ s2 = gcnew Student(s1); // (1)
}

If you run that through the compiler (1), you'll get compiler error C3673 (class does not have a copy-constructor). The reason for this error is that unlike in standard C++, the compiler won't generate a default copy constructor for your class. At least one reason is that all ref objects implicitly derive from System::Object, which doesn't have a copy constructor. Even if the compiler attempted to generate a copy constructor for a ref type, it would fail, because it wouldn't be able to access the base class copy constructor (it doesn't exist).

To make that clearer, think of a native C++ class Base with a private copy constructor, and a derived class Derived (that publicly inherits from Base). Attempting to copy-construct a Derived object will fail because the base class copy constructor is inaccessible. To demonstrate, let's write a class that is derived from a base class that has a private copy constructor:

class Base
{
public:
    Base(){}
private:
    Base(const Base&);
};

class Derived : public Base
{
};

int _tmain(int argc, _TCHAR* argv[])
{
    Derived d1;
    Derived d2(d1); // <-- won't compile
}

Because the base object's copy constructor is declared as private and therefore is inaccessible from the derived object, this code won't compile: the compiler is unable to copy-construct the derived object. What happens with a ref class is similar to this code. In addition, unlike native C++ objects, which aren't polymorphic unless you access them via a pointer or a reference, ref objects are implicitly polymorphic (because they're always accessed via reference handles to the CLR heap). This means a compiler-generated copy constructor may not always do what you expect it to do. When you consider that ref types may contain member ref types, there is the question of whether a copy constructor implements shallow copy or deep copy for those members. The VC++ team presumably decided that there were too many open questions to have the compiler automatically generate copy constructors for classes that don't define them.
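If you'd like to confirm the native C++ behavior without triggering a compiler error, one option is to ask the compiler through type traits. This is a minimal sketch that assumes a C++11 or later compiler; the two constant names are hypothetical and exist only for this illustration.

```cpp
#include <type_traits>

class Base
{
public:
    Base() {}
private:
    Base(const Base&); // private, so derived classes can't call it
};

class Derived : public Base
{
};

// Hypothetical constants for illustration; both are evaluated at compile time.
constexpr bool kDerivedIsDefaultConstructible =
    std::is_default_constructible<Derived>::value;
constexpr bool kDerivedIsCopyConstructible =
    std::is_copy_constructible<Derived>::value;

// "Derived d1;" is fine, but "Derived d2(d1);" is not.
static_assert(kDerivedIsDefaultConstructible, "Derived can be default-constructed");
static_assert(!kDerivedIsCopyConstructible, "Derived can't be copy-constructed");
```

Both static_assert declarations should hold, which mirrors the situation of a ref class whose base, System::Object, offers no copy constructor to call.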

If you want copy-construction support for your class, you must implement it explicitly, which fortunately isn't a difficult task. Let's add a copy constructor to the Student class:

ref class Student
{
public:
    Student(){}
    Student(const Student^)
    {
    }
};

That wasn't all that tough, was it? Notice how you have to explicitly add a default parameterless constructor to the class. This is because it won't be generated by the compiler when the compiler sees that there is another constructor present. One limitation with this copy constructor is that the parameter has to be a Student^, which is OK except that you may have a Student object that you want to pass to the copy constructor. If you're wondering how that's possible, C++/CLI supports stack semantics, which we'll cover in detail in chapter 3. Assume that you have a Student object s1 instead of a Student^, and you need to use that to invoke a copy constructor:

Student s1;
Student^ s2 = gcnew Student(s1); //error C3073

As you can see, that code won't compile. There are two ways to resolve the problem. One way is to use the unary % operator on the s1 object to get a handle to the Student object:

Student s1;
Student^ s2 = gcnew Student(%s1);

Although that compiles and solves the immediate problem, it isn't a complete solution when you consider that every caller of your code needs to do the same thing if they have a Student object instead of a Student^. An alternative solution is to have two overloads for the copy constructor, as shown in listing 1.2.

ref class Student
{
//...
public:
    Student(){}
    Student(String^ str):m_name(str){}
    Student(const Student^) // (1)
    {
    }
    Student(const Student%) // (2)
    {
    }
};

//...

Student s1;
Student^ s2 = gcnew Student(s1);

Listing 1.2 Declaring two overloads for the copy constructor

This solves the issue of a caller requiring the right form of the object, but it brings with it another problem: code duplication. You could wrap the common code in a private method and have both overloads of the copy constructor call this method, but then you couldn't take advantage of initialization lists.

Ultimately, it's a design choice you have to make. (1) If you only have the copy constructor overload taking a Student^, then you need to use the unary % operator when you have a Student object; and (2) if you only have the overload taking a Student%, then you need to dereference a Student^ using the * operator before using it in copy-construction. If you have both, you may end up with code duplication; and the only way to avoid code duplication (using a common function called by both overloads) deprives you of the ability to use initialization lists.

My recommendation is to use the overload that takes a handle (in the previous example, the one that takes a Student^), because this overload is visible to other CLI languages such as C# (unlike the other overload)—which is a good thing if you ever run into language interop situations. The unary % operator won't really slow down your code; it's just an extra character that you need to type. I also suggest that you stay away from using two overloads, unless it's a specific case of a library that will be exclusively used by C++ callers; even then, you must consider the issue of code duplication.

Now you know that if you need copy construction on your ref types, you must implement it yourself. So, it may not be surprising to see in the next section that the same holds true for copy-assignment operators.

1.5.4 Assignment operators

The copy-assignment operator is one that the compiler generates automatically for native classes in standard C++, but this isn't so for a ref class. The reasons are similar to those that dictate that a copy constructor isn't automatically generated. The following code (using the Student class defined earlier) won't compile:

Student s1("Nish");
Student s2;
s2 = s1; // error C2582: 'operator =' function
         // is unavailable in 'Student'

Defining an assignment operator is similar to what you do in standard C++, except that the types are managed:

Student% operator=(const Student% s)
{
    m_name = s.m_name;
    return *this;
}

Note that the copy-assignment operator can be used only by C++ callers, because it's invisible to other languages like C# and VB.NET. Also note that, for handle variables, you don't need to write a copy-assignment operator, because the handle value is copied over intrinsically.
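The point about handles has a close parallel in standard C++: assigning one pointer to another copies only the address, and no user-defined operator= is involved. Here is a minimal native C++ sketch of that idea, offered as an analogy rather than as C++/CLI code. NativeStudent and AssignmentCopiesAddressOnly are hypothetical names.

```cpp
#include <cassert>   // for the assert shown in the usage note below
#include <string>

// NativeStudent is a hypothetical native stand-in for the Student class.
struct NativeStudent
{
    std::string name;
};

// Hypothetical helper: returns true if pointer assignment only
// re-pointed p2 and left the second object untouched.
bool AssignmentCopiesAddressOnly()
{
    NativeStudent first{"Nish"};
    NativeStudent second{"Someone else"};

    NativeStudent* p1 = &first;
    NativeStudent* p2 = &second;

    p2 = p1;              // copies the address; no NativeStudent is copied
    p2->name = "Renamed"; // modifies first through the second pointer

    return p1 == p2
        && first.name == "Renamed"
        && second.name == "Someone else";
}
```

Calling assert(AssignmentCopiesAddressOnly()); should pass. Assigning one Student^ to another behaves in the same spirit: both handles end up referring to the same object on the CLR heap.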

You should try to bring many of the good C++ programming practices you followed into the CLI world, except where they aren't applicable. As an example, the assignment operator shown above doesn't handle self-assignment. Although it doesn't matter in our specific example, consider the case in listing 1.3.

ref class Grades // (1)
{
    //...
};

ref class Student
{
    String^ m_name;
    Grades^ m_grades;
public:
    Student(){}
    Student(String^ str):m_name(str){}
    Student% operator=(const Student% s)
    {
        m_name = s.m_name;
        if(m_grades)
            delete m_grades; // (2)
        m_grades = s.m_grades;
        return *this;
    }
    void SetGrades(Grades^ grades)
    {
        //...
    }
};

Listing 1.3 The self-assignment problem

In the preceding listing, (1) assume that Grades is a class with a nontrivial constructor and destructor; thus, in the Student class assignment operator, before the m_grades member is copied, (2) the existing Grades object is explicitly disposed by calling delete on it—all very efficient. Let's assume that a self-assignment occurs:

while(some_condition)
{
    // studarr is an array of Student objects
    studarr[i++] = studarr[j--]; // self-assignment occurs if i == j
    if(some_other_condition)
        break;
}

In the preceding code snippet, if i ever equals j, you end up with a corrupted Student object with an invalid m_grades member. Just as you would do in standard C++, you should check for self-assignment:

Student% operator=(const Student% s)
{
    if(%s == this) //<<== Check for self-assignment
    {
        return *this; //<<== If it is so, return immediately
    }
    m_name = s.m_name;
    if(m_grades)
        delete m_grades;
    m_grades = s.m_grades;
    return *this;
}
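For comparison, here is a minimal sketch of the same guard in standard C++. It's an analogy, not a translation: the native class owns its Grades object and deep-copies it, whereas the managed version above copies a handle. NativeStudent, the single math field, and MathAfterSelfAssignment are all hypothetical names chosen for this illustration.

```cpp
#include <cassert>

// Hypothetical native Grades type holding a single mark.
struct Grades
{
    int math;
    explicit Grades(int m) : math(m) {}
};

// Hypothetical native counterpart of Student that owns its Grades object.
class NativeStudent
{
    Grades* m_grades;
public:
    explicit NativeStudent(int math) : m_grades(new Grades(math)) {}
    NativeStudent(const NativeStudent& s) : m_grades(new Grades(*s.m_grades)) {}
    ~NativeStudent() { delete m_grades; }

    NativeStudent& operator=(const NativeStudent& s)
    {
        if (&s == this) // check for self-assignment
        {
            return *this; // nothing to copy, so return immediately
        }
        delete m_grades; // safe: s.m_grades belongs to a different object
        m_grades = new Grades(*s.m_grades);
        return *this;
    }

    int Math() const
    {
        assert(m_grades != nullptr); // the object always owns a Grades instance
        return m_grades->math;
    }
};

// Hypothetical helper: assigns an object to itself through an alias
// and returns the mark it holds afterward.
int MathAfterSelfAssignment(int math)
{
    NativeStudent s(math);
    NativeStudent& alias = s;
    s = alias; // self-assignment, caught by the guard
    return s.Math();
}
```

With the guard in place, MathAfterSelfAssignment(97) should return 97. Without it, the operator would delete m_grades and then read from the object it had just destroyed.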

We've covered some ground in this section—and if you feel that a lot of information has been presented too quickly, don't worry. Most of the things we've discussed so far will come up again throughout this book; eventually, it will all make complete sense to you. We'll now look at boxing and unboxing, which are concepts that I feel many .NET programmers don't properly understand—with not-so-good consequences.