Introduction
This article is not an extensive guide to mastering the C++/CLI programming language. Rather, it is quick-start learning material that offers an unmanaged C++ programmer an easier way to enter the world of managed programming while still sticking to C++. I hope it also proves useful to C#, VB.NET, or other purely managed programmers who want to program in C++/CLI, where the two programming worlds merge to offer the most powerful environment for programming.
Comparisons of C++/CLI with C#, VB.NET, or other .NET languages are rare in this article. Where they appear, they are not made to win arguments but to show differences and to help understand and appreciate gotchas and subtleties. There are absolutely no references in this article to the (obsolete) Managed C++. So, let us jump in!
Words Of Agreement
The word "unmanaged", in the broader sense, encompasses any and all technologies (Win32, COM, ...) and programming languages (C++, VB, Pascal, ...) that predate .NET. The word "managed" refers to the .NET technology itself and to only those programming languages that support programming on the .NET platform. The words 'object' and 'instance' are used interchangeably for a managed object.
.NET refers to the programming technology, platform, and standard. The CLR [Common Language Runtime] is the implementation of .NET: it is the runtime engine for which programming languages generate IL (intermediate language code), and it acts as a virtual processor that executes that IL. C++/CLI is one of the languages available for programming on the .NET platform [the superior one, in my opinion]. This article, in its entirety, is an attempt to help you start learning it.
In this article, C++ means ANSI/ISO C++ (originally conceived by Bjarne Stroustrup). It is for programming in the unmanaged world, and cannot by itself be used for programming on the .NET platform. C++/CLI is not the same language, and the article goes into more detail about that. It must be considered an entirely different language that contains the features and facilities of ANSI/ISO C++ as a subset. In this article, unmanaged programming means programming in C++, although the same points apply to any other unmanaged programming language.
Unmanaged Programming Brief
We reap what we sow. In C++ (the unmanaged world), if you allocate memory with new/malloc, then it is your responsibility to deallocate it with delete/free. Forgetting to deallocate memory once you are done with it results in memory leaks. The compiler is tightly bound to the underlying operating system/hardware, and programs use the APIs exposed by the underlying OS.
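As a small illustration of that responsibility, here is a sketch in standard C++. The Buffer type, its live counter, and the function names are hypothetical, introduced only for this example; the counter simply makes the effect of new and delete visible.

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct Buffer {
    static int live;
    Buffer()  { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Allocates a Buffer on the native heap and hands ownership to the caller.
Buffer* createBuffer() {
    return new Buffer();
}

// Releases a Buffer; every successful new needs exactly one matching delete.
void destroyBuffer(Buffer* b) {
    delete b;
}

// Returns the number of live Buffers observed while one is allocated.
int liveWhileAllocated() {
    Buffer* b = createBuffer();
    int during = Buffer::live;   // the object exists here
    destroyBuffer(b);            // omitting this call would leak the object
    assert(Buffer::live == during - 1);
    return during;
}
```

Calling liveWhileAllocated() leaves Buffer::live where it started. Without the destroyBuffer call the count would stay raised for the rest of the program, which is what a leak looks like.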
Managed Programming Brief
Programming in the managed world comprises the programming language used, the libraries [called the Base Class Library, or BCL], and the CLR itself. The BCL is the gateway to the platform on which the program will be executed. It provides all the APIs for programming, and is organized under various namespaces corresponding to the service intended: file system, memory, network, user interface, processes and threads, etc. One of the several facilities in managed programming is automatic memory management: allocation is our wish, and deallocation is automatically taken care of by the CLR through a process called Garbage Collection.
Types in the managed world are entities that bear information and on which operations are carried out by calling methods. Each type is unique. To use a type, we create instances of it and work with them. Types (and their associated operations) are packaged and deployed as assemblies. An assembly is the ultimate unit of deployment, and is the building block of a CLR-based application. An assembly is versioned, and the version is part of its identity. An assembly plays the role that the dynamic link library plays in the unmanaged world, and assemblies are themselves packaged as dynamic link libraries or executables. Types packaged in an assembly are accessible from outside based on the accessibility marked for the type. For instance, a class type marked public is accessible from outside, and so are its methods that are marked public.
What is C++/CLI?
I know that might sound like a boring start, but C++/CLI needs a formal introduction. ANSI/ISO C++ is a language for native programming on Windows. .NET is a new platform which, unlike what came before, is not tied to the hardware. It has its own execution engine, a virtual processor, which is the CLR. While C++ generates an executable for the target platform, the managed programming languages generate IL (Intermediate Language) code for the CLR. Programming languages are required to be compliant with the CLI [Common Language Infrastructure] and the CTS [Common Type System] to be used for programming in the managed world.
ANSI/ISO C++ cannot be used to program on the .NET platform since it is not compliant with the CLI/CTS. Hence, C++/CLI is a new language [as C++ was for C] that has been invented to program on the .NET platform. Though the syntax, grammar, and some of the rules are the same as in C++, it must not be considered just an extension of C++. Instead, C++ is a subset of C++/CLI, although being a superset of C++ is not the ultimate intent of the invention.
C++/CLI is an impartial programming language, which means it can be used for managed, unmanaged, or mixed programming. Hence, legacy code that cannot be rewritten for the .NET platform (in C# or any other .NET language of choice) in a short time span can be ported easily with C++/CLI. Also, any new code in such legacy C++ projects can be written as pure managed code. It also bridges the gap for the pure managed languages, which are otherwise handicapped in using unmanaged code. So, your C# project can now use your complex algorithms or the bunch of hi-fi utilities written in ANSI C++ with just a C++/CLI wrapper over them.
Types and Object Creation
There are three kinds of data types in C++/CLI: reference, value, and native types. Native types are those that already exist in C++, say int, float, class, struct, etc. An instance of these types is allocated on the stack when declared as a local variable. When created dynamically (using the new keyword), it is allocated on the heap, and it is the responsibility of the programmer to delete the allocated instance. As a C++ programmer, you are probably well aware of the consequences of failing to delete. So scary... memory leaks!
Value Types and Reference Types are part of the managed world. They behave as the CLI dictates, the prime doctrine being to have a common base type, System::Object. Following are the methods exposed by System::Object:-
Method Name | Return Type | Accessibility
Equals | bool | public
GetType | System::Type^ | public
ToString | System::String^ | public
GetHashCode | int | public
Finalize | - | protected
MemberwiseClone | System::Object^ | protected
ReferenceEquals | bool | public static
From a quick look, it can be seen that these methods provide the information required to abstract any type. The same holds for every type derived from System::Object.
Value Types are derived from System::ValueType, which is in turn derived from System::Object. Value types are normally allocated on the stack, although there are times when they are transported to the heap. We will see that later. But the very nature of a Value Type is to get allocated on the stack. All primitive types and value structs are Value Types. They bear certain similarities with the primitive native types, but they are not the same.
Primitive Types Mapping
[List not extensive]
Data Type Name | Type | Keyword
Integer | System::Int32 | int
Double | System::Double | double
Character [2 bytes] | System::Char | wchar_t
Character [1 byte] | System::Byte | unsigned char
Boolean | System::Boolean | bool
Following is the way primitive value types are declared and used:-
void InSomeMethod()
{
    System::Int32 oddNumber = 1;
    CallAnotherMethod(oddNumber);
    System::Char character = 'A';
    CallMethod3(character);
}
Following is the way user defined value types are declared and used:-
value struct DateTimeInfo
{
    private: System::Int32 Year;
    private: System::Byte Month;
    private: System::Byte Date;
    // Store the supplied values in the private fields.
    public: DateTimeInfo(int year, unsigned char mon, unsigned char date)
    {
        this->Year = year;
        this->Month = mon;
        this->Date = date;
    }
    public: int GetYear()
    {
        return this->Year;
    }
    public: int GetMonth()
    {
        return static_cast<int>(this->Month);
    }
    public: int GetDate()
    {
        return static_cast<int>(this->Date);
    }
};
Classes are Reference Types. Instances of Reference Types are never allocated on the stack. They are always allocated on a heap, but not the heap where your native types are allocated. It is a different area called the managed heap, and your native types and code have no direct reach into it. So then, how do we allocate on the managed heap? Is it by using the new keyword? If so, how would new know where to allocate? To answer these, there is a newer keyword called gcnew: new allocates on the native heap, and gcnew allocates on the managed heap. Full examples come later, but just consider this for now:-
reference_data_type^ objRef = gcnew reference_data_type(constructor arguments);
This is the conventional way of creating a managed object in C++/CLI. As said above, the instance is created on the managed heap. The accessor for that instance is called the object reference (objRef above) or handle, and it lives on the stack. The C++/CLI convention is to call them handles, but I am going to call them object references, which is the widely used term in the managed world. The term object reference must not in any way be confused with the C++ reference. So, the word reference in the rest of the article means the managed object reference only, unless explicitly distinguished.
The instance cannot be accessed without the object reference. In essence, object references are address holders, but they are not like native pointers. Object references are type aware, polymorphic, and exhibit the type's behavior. Unlike pointers, they cannot be cast to any arbitrary type, or moved by incrementing or decrementing the address. So, they are much more intelligent address holders than native pointers.
There can be more than one reference to an instance on the managed heap, so assigning one object reference to another is a shallow copy. This may alarm native C++ programmers: with native pointers, if two pointers refer to the same instance on the heap and the instance is deleted through one of them, the other is left dangling. That is a typical C++ programmer's nightmare, often learnt the hard way.
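The native hazard described here can be sketched in standard C++. Node, its live counter, and aliasThenDelete are hypothetical names used only for illustration, and the dangling pointer is deliberately never dereferenced, since doing so would be undefined behavior.

```cpp
#include <cassert>

// Hypothetical heap-allocated type with a counter of live instances.
struct Node {
    static int live;
    int value;
    explicit Node(int v) : value(v) { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Copies a native pointer, deletes through one copy, and returns the number
// of live Node objects afterwards.
int aliasThenDelete() {
    Node* a = new Node(42);
    Node* b = a;            // shallow copy: only the address is copied
    assert(a == b);         // both pointers refer to the same object
    delete a;               // the single object is destroyed here
    a = nullptr;
    // b still holds the old address: it is dangling and must not be used.
    b = nullptr;            // resetting it is the only safe thing to do
    return Node::live;
}
```

As far as memory is concerned, the same copy is safe with managed object references, because the instance is not reclaimed while any reference still reaches it.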
Turn of the century... GC!!! Yes, there can be more than one object reference referring to the object on the heap. The managed programming model does not expect the programmer to do memory reclamation. The programmer is not required to write code like delete objRef to return the memory he allocated. Spare our poor programmers. The CLR is very smart, and reclaims memory through a process called Garbage Collection. The Garbage Collector reclaims only those instances that are not reachable, viz., those for which you have lost all object references (like objRef above). If an object reference goes out of scope, or if it is assigned nullptr, the instance it was referring to can no longer be reached through that reference. So, for an instance's memory to be reclaimed by the GC, there must be no outstanding references. This is the most compelling feature of .NET. Programmers are now free of the burden of writing code to release the memory they allocate, which has been tough schooling over several years of programming.
But beware: too much freedom results in chaos. Even with the GC, memory has to be allocated wisely. Undisciplined allocation, on the grounds that you are not responsible for deallocation, will result in poor application performance. This is one of the fundamental differences between the native and managed worlds. All the other concepts and rules are based on reference types, object references, and garbage collection.
Besides that, there is a subtle thing to be aware of. The GC is responsible only for reclaiming memory, not other resources. For instance, if you have opened a connection to a database, the GC is not responsible for closing the connection; it is responsible only for reclaiming the memory allocated for the database object.
With the basics learnt in the previous sections, it is time to see stuff that works. Following is a snippet of a C++/CLI class:-
ref class Directory
{
    public: Directory();
    public: Directory(System::String^ filePath);
    public: File^ GetFile(System::String^ fileName);
    public: cli::array<System::String^, 1>^ GetFiles();
    public: cli::array<System::String^, 1>^ GetFiles(System::String^ filter);
    public: System::Void DeleteFile(System::String^ fileName);
    ~Directory();
    !Directory();
};
The above class shall be used for explanatory purposes - keywords, usage, or related concepts.
Declaring and Consuming A Managed Class
The Directory class above shows the typical way of declaring a managed class in C++/CLI. The ref keyword preceding the class keyword distinguishes it as a managed class, a candidate for allocation on the managed heap only. Let us see how to create an instance of the above class:-
Directory^ sysDir = gcnew Directory();
The caret [^] symbol specifies that the variable sysDir is a managed handle [sorry, a reference to a managed object]. sysDir is the object reference that you can now use to access the allocated object. You can call public methods, and you can copy the reference to another reference variable.
Directory^ sysDir2 = sysDir;
Now, sysDir and sysDir2 both refer to the same instance. You are not required to explicitly delete the object as you used to do with C++. The effect of calling delete on the instance (delete sysDir) is discussed later, under Object Destruction. Memory reclamation is now the responsibility of the .NET runtime (GC). This is really a big relief for the programmer. Following is the way you invoke methods on the Directory instance:-
File^ someFile = sysDir->GetFile("SomeFile.TXT");
Consider the following method:-
System::Void UseSysDir(Directory^ dirObjRef)
{
    cli::array<System::String^, 1>^ files = dirObjRef->GetFiles();
}
The object reference can now be passed to methods as a parameter, and can be used the same way inside those methods. All of the references refer to the same instance on the managed heap. There is no copy construction involved anywhere since a copy of the object is not created. It is similar to passing pointers in C++. In case you need to create a copy, your class can implement the System::ICloneable interface and its Clone() method. The actual depth of the copy depends on the specific object, and each inner object may or may not require a Clone method in turn. It might be very hard at first for a C++ programmer, who has learnt to implement copy constructors, to accept passing around references to the same object. But in due course you will learn that programming with objects on the managed heap, with memory reclaimed by the garbage collector, is a different model altogether.
Consider the following code:-
Directory^ CreateDirectory(System::String^ dirPath)
{
    Directory^ dirObj = gcnew Directory(dirPath);
    return dirObj;
}
An instance of the Directory class is created, and a reference to the allocated instance is returned. After the method returns, the local dirObj no longer exists, so it no longer refers to the object on the heap. It is the responsibility of the calling method to keep the returned reference so that the object is not spotted by the GC as orphaned or garbage. As long as there is at least one direct or indirect object reference to a particular object, the GC will not attempt to reclaim the memory consumed by that object.
Abstract Classes
In simple terms, the keyword abstract decorated on a managed class makes it abstract. This is a convenient way of making classes abstract without declaring abstract (pure virtual) methods. Also, methods can be decorated with the abstract keyword, in which case the containing class must also be decorated the same way. Following are explanatory code snippets:-
ref class AnAbstractClass abstract
{
};
or
ref class AnotherAbstractClass abstract
{
    public: virtual void SomeMethod(int x) abstract;
};
ref class DerivedFromAnotherAbstractClass : public AnotherAbstractClass
{
    public: virtual void SomeMethod(int x) override
    {
    }
};
nullptr
A managed object reference is similar to a C++ pointer in one way: it either refers to an object or refers to nothing. When a C++ pointer is NULL, it does not point to any location in memory. Similarly, when an object reference does not refer to any object, its value is nullptr. nullptr is a keyword in C++/CLI. It is not a type like int or float; it indicates that the object reference does not refer to any object. Since it is not a type, no type operations can be done on nullptr: sizeof(nullptr), throw nullptr, etc. will all result in compiler errors.
- A nullptr can be assigned to an object reference as part of the declaration or later.
    Directory^ dirObjRef = nullptr;
- A nullptr can be explicitly assigned even when the reference is referring to some other object.
    Directory^ dirObjRef = gcnew Directory(dirPath);  // dirPath: some directory path string
    dirObjRef->DeleteFile("SomeFile.TXT");
    dirObjRef = nullptr;
- A nullptr can be used for comparing with an object reference, but other operators (+, -, >, < etc.) are not allowed.
    if (dirObjRef == nullptr) { /* throw an exception, or handle it as you wish */ }
    if (dirObjRef != nullptr) { /* ... */ }
- A nullptr can be passed to methods as a parameter, and can be returned from methods too.
    dirObjRef->GetFile(nullptr);
and
    File^ GetFile(System::String^ fileName)
    {
        return nullptr;
    }
- A nullptr can be assigned to a managed reference, interior pointer (discussed later), or a native pointer.
Boxing/Unboxing
As we saw earlier, value types are allocated on the stack. But there are times when they are present on the managed heap. For instance, when a method takes a System::Object (mother of all managed types) as the parameter for, say, printing the contents, an object is allocated on the heap with the value of the Value Type copied to it. This process is called boxing. Sample code that shows boxing:-
int i = 100;
System::Object^ boxObj = safe_cast<System::Object^>(i);  // explicit boxing
void PrintContents(Object^ objRef)
{
    Console::WriteLine("Contents: {0}", objRef->ToString());
}
void InSomeMethod()
{
    int i = 100;
    PrintContents(i);  // i is boxed implicitly when passed as Object^
}
The opposite of boxing is called unboxing, and it is retrieving the value of the instance from the heap and loading it on the variable on the stack.
void Unbox()
{
    System::Object^ integerObj = safe_cast<System::Object^>(100);
    int i = safe_cast<int>(integerObj);
}
Boxing and unboxing are applicable only for value types.
Object Destruction
This is a very fuzzy but interesting area. In C++, a destructor exists to do cleanup operations on the object before its memory is reclaimed, and the destruction of the object is deterministic. The destructor for an object allocated on the stack is called when it goes out of scope. For an object allocated on the heap, it is called when delete is called. If you fail to call delete (once you are done with the object), the destructor is never called and the memory held by the object is not released: a memory leak. We know that entire story by now.
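The deterministic behavior of native destructors can be observed with a short standard C++ sketch. Tracer, events, and traceLifetimes are hypothetical names introduced for this illustration; the log records the order in which constructors and destructors run.

```cpp
#include <cassert>
#include <string>

// Records construction (+) and destruction (-) events in order.
static std::string events;

// Hypothetical type whose constructor and destructor log their calls.
struct Tracer {
    std::string name;
    explicit Tracer(const std::string& n) : name(n) { events += "+" + name + " "; }
    ~Tracer() { events += "-" + name + " "; }
};

// Creates one heap object and one stack object and returns the event log.
std::string traceLifetimes() {
    events.clear();
    Tracer* heapObj = new Tracer("heap");
    {
        Tracer stackObj("stack");
    }   // stackObj's destructor runs here, at the closing brace
    assert(events == "+heap +stack -stack ");
    delete heapObj;   // heapObj's destructor runs only because of this delete
    return events;
}
```

The stack object is destroyed exactly at the end of its scope, and the heap object exactly at the delete. If the delete were removed, the "-heap" entry would never appear in the log.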
Ideally, there are no destructors for managed objects since the destruction of such objects is not deterministic. The garbage collector reclaims the memory held by the object at an arbitrary point of time (and on an arbitrary thread). If that is the case, then what is the way to do cleanup operations on the object, even though you do not care about memory being reclaimed? There is a way.
When you are done using an object, there are two ways available for cleanup: explicit cleanup and finalization. If you know that you are done with the object and want to do cleanup explicitly, then the .NET advice is this: implement the System::IDisposable interface for your object. Call the Dispose method on the object to do the (explicit) cleanup, and leave the memory reclamation part to the GC.
The problem with the Dispose method is that it has to be called explicitly. If you forget, the cleanup will not be done. There is another point in the lifetime of an object when you have a last chance to do cleanup, even after you have given up all the references to the object. That is when the object gets finalized. When the garbage collector finds an orphaned or garbage object, it adds that object to a special queue (called the Finalization Queue), and another thread, called the 'Finalizer Thread', calls the Finalize() method on each of the queued objects. This process is called Finalization. What I have said about finalization is extremely succinct. There are finer and more intricate details which require a separate article (or probably a book), but what has been said is sufficient for now. Finalize() is the last method call on an object in its lifetime; after that the object vanishes. The Finalize method has got a pet name: the finalizer. You can do your cleanup in your finalizer. Again, there is a caveat, caveat, caveat!!!
The Finalize method is called at an arbitrary point in time and on an arbitrary thread. There is no order in which the finalizers are called. If object A contains object B, it is not guaranteed that the finalizer for object B is called first. Then, what good is a finalizer? Theoretically, it is for releasing unmanaged resources that the object may contain. Unmanaged objects are not collected by the GC; they exist until they are deleted. I say theoretically because you may be clever and disciplined enough to release them in your Dispose itself (assuming that you do call Dispose, and that such releasing is possible in your case).
So, what happens if we called Dispose and the finalizer is called as well? It might be disastrous (in your case) to clean up more than once. How do we avoid redundant cleanups? There is a (design) pattern suggested by .NET programming for overcoming this situation: the Dispose pattern. The idea is to prevent the finalizer from being invoked if you have already called Dispose. The garbage collector is exposed via the System::GC class, and you can call GC::SuppressFinalize, passing the object reference (naturally 'this' in the current case), to suppress the finalizer for that object. Following is the Dispose pattern implementation snippet:-
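ref class SomeClass : IDisposable
{
    // Explicit cleanup: release everything, then tell the GC that
    // finalization is no longer needed for this object.
    public: void Dispose()
    {
        Dispose(true);
        System::GC::SuppressFinalize(this);
    }
    // Last-chance cleanup, invoked by the finalizer thread.
    public: void Finalize()
    {
        Dispose(false);
    }
    protected: void Dispose(bool safe2FreeManaged)
    {
        if (safe2FreeManaged)
        {
            InternalDispose();
        }
    }
    private: void InternalDispose()
    {
    }
};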
The above theoretical implementation uses C++/CLI syntax, but in practice it is done a bit differently. That is what we discuss next. All that was discussed above is the general principle of .NET. It is the behavior exhibited by any managed object written in any language supported on the .NET platform, although each language can cloud and disguise the principle as fits the language. C++/CLI is clever and smart in this case. In C++/CLI, it is not required to explicitly derive from System::IDisposable and implement the Dispose method. Instead, the conventional destructor syntax is analogous to the InternalDispose method. When you implement a destructor using the conventional C++ destructor syntax (~ClassName), the compiler automatically derives the class from System::IDisposable, implements the Dispose pattern for you, and treats the destructor of the class as the cleanup code. If there is no destructor for a class, then it is not derived from System::IDisposable, and there is no Dispose pattern. C++ programmers do not have to feel the impact of the heavy discussion of the principle or pattern above; all of it was said to show what happens behind the scenes. In that respect, a managed class looks the same as an unmanaged class: the destructor is invoked when delete is called on the object reference or, for an object declared with stack semantics (shown shortly), when it goes out of scope. Following is the way C++/CLI takes care of implementing the Dispose pattern for you:-
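void Dispose(bool safe2FreeMgd)
{
    if (safe2FreeMgd)
    {
        try
        {
            // the body of the destructor (~ClassName) runs here
        }
        finally
        {
            GC::SuppressFinalize(this);
        }
    }
    else
    {
        // the body of the finalizer (!ClassName), if any, runs here
    }
}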
Now, take a look at this:-
System::Void SomeMethod(/* parameters, if you need them */)
{
    Directory dirObj;
    dirObj.DeleteFile("System.TXT");
}
That is another way of declaring/allocating an instance of the Directory class. It resembles the conventional stack-allocated object creation in C++. The syntax is the same, but the object (referred to by dirObj) is actually allocated on the managed heap. When dirObj goes out of scope, the compiler inserts a call to the Dispose method, which runs the destructor of the class. Notice that there is no caret (^) in the dirObj declaration. Also, notice that the members are accessed with the . (dot) operator instead of the -> operator. This gives the picture of an object allocated on the stack. But remember, no reference type object is allocated on the stack. This is one of the cool features that keep the syntax familiar, and it shows that the language designers have respect for the habits of C++ programmers.
Oh... we forgot the finalizer. The finalizer is declared with a ! (instead of a ~), followed by the class name. The following code shows a finalizer:-
ref class SomeClass
{
    !SomeClass()
    {
    }
};
And the finalizer is called only if the destructor is not called. Before I move on to the next topic, a piece of advice: do not rely on a finalizer unless it is going to save your life. Besides the heavy performance impact that a finalizer has, it brings some very ill effects, which are beyond the scope of this article.
Mixed Mode
Mixed mode programming is the absolute power of C++/CLI, and it is what makes C++/CLI, to my mind, the mightiest of the .NET programming languages. C++/CLI is to C++ as C++ is to C. You can do C programming in C++. In the same sense, you can do unmanaged C++ programming in C++/CLI without using any of the managed features, not even a managed class. (I could spend the rest of my life trying to imagine a reason to do that.) Also, you can do pure managed programming without using any of the unmanaged practices. And you can do mixed mode programming, which means you can write an application that has both managed and unmanaged classes interacting with each other. So, a managed object can contain or interact with an unmanaged object, and vice versa. Can you imagine the power of programming that you can unleash with C++/CLI?
The simplest application of this power is when you want to port your existing hi-fi image processing or math library written in C++ to work on the .NET platform. When you do not have enough budget or time to rewrite it in C# (or VB.NET, if you would try that), you can recompile your existing code with C++/CLI and write a (managed) wrapper so that it can be used by any .NET programming language. Writing a wrapper takes little effort compared with rewriting and retesting the library. Following is a managed class that interacts with an unmanaged object:-
class UnmanagedClass
{
    public: UnmanagedClass()
    {
    }
    public: void SomeUnmgdMethod()
    {
    }
};
ref class ManagedClass
{
    private: UnmanagedClass *unmgdPtr;
    public: ManagedClass(UnmanagedClass *unmgdClassPtr)
    {
        System::Diagnostics::Debug::Assert(unmgdClassPtr != nullptr);
        this->unmgdPtr = unmgdClassPtr;
    }
    // Illustrative method: forwards the call to the unmanaged object.
    public: void CallUnmanaged()
    {
        this->unmgdPtr->SomeUnmgdMethod();
    }
};
Likewise, an unmanaged class can bear a managed reference as its member and can invoke methods on it. But unlike a managed class, which can hold a pointer to an unmanaged object directly, an unmanaged class cannot hold the reference directly; instead, it is done the following way:-
#include <vcclr.h>  // provides gcroot
using namespace System::Diagnostics;
class UnmanagedClass
{
    private: gcroot<ManagedClass^> mgdRef;
    public: UnmanagedClass(ManagedClass^ mgdClassRef)
    {
        Debug::Assert(mgdClassRef != nullptr);
        this->mgdRef = mgdClassRef;
    }
};
gcroot is itself an unmanaged entity that has the ability to refer to managed entities. So, an instance of gcroot<ManagedClass^> can be a statically or dynamically allocated member inside the unmanaged class.
Equality and Identity
Two managed objects are said to be equal if their values are the same. System::Object's Equals method can be used to test equality. Equals is a virtual instance method, and can be overridden in a derived class/struct since equality of compound objects depends on the type. Two managed objects are said to be identical if their references point to the same object on the heap. System::Object's static ReferenceEquals method can be used to test identity.
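The same distinction exists in native C++, where it may be more familiar. In this sketch, equalValues plays the role of Equals and sameObject plays the role of ReferenceEquals; all names are hypothetical, and the code is standard C++ rather than managed code.

```cpp
#include <cassert>

// Hypothetical value-like type used only for this illustration.
struct Point {
    int x;
    int y;
};

// Equality: two objects hold the same values.
bool equalValues(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// Identity: two references denote the very same object in memory.
bool sameObject(const Point& a, const Point& b) {
    return &a == &b;
}

// Two distinct objects with the same values are equal but not identical.
bool equalButNotIdentical() {
    Point p = {1, 2};
    Point q = {1, 2};
    assert(sameObject(p, p));   // an object is always identical to itself
    return equalValues(p, q) && !sameObject(p, q);
}
```

Identity implies equality, but not the other way round, which is why the managed world offers both Equals and ReferenceEquals.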
The crux of the CLI is the importance of the type of an object. Unlike unmanaged objects, managed objects know who they are right from the moment they spring to life, either on the stack or on the heap. The type information of any type can be obtained by using the typeid operator (TypeName::typeid), and that of an instance by using System::Object's GetType method. The importance of the type can be realized if you call GetType in the constructor. You will be stunned to realize it returns the type of the instance being constructed. For instance, in the following case:-
ref class SomeClass
{
    public: int X;
    public: int Y;
    public: SomeClass(int x, int y)
    {
        Console::WriteLine("Type - {0}", this->GetType()->ToString());
        Method();
    }
    public: virtual void Method()
    {
        Console::WriteLine("SomeClass::Method");
    }
};
ref class SomeOtherClass : public SomeClass
{
    public: SomeOtherClass(int x, int y) : SomeClass(x, y)
    {
    }
    public: virtual void Method() override
    {
        Console::WriteLine("SomeOtherClass::Method");
    }
};
The Console::WriteLine call in the SomeClass constructor will output the type of the instance being created, which is not always SomeClass. That is, if an instance of SomeOtherClass is created, you will see SomeOtherClass in the output. Also, you will be thrilled to know that virtual calls in the constructor are directed to the appropriate overrides. This, of course, is not recommended usage, and is not good discipline. It is pointed out only to show the importance of a Type.
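For contrast, standard C++ behaves differently here: while a base-class constructor runs, a virtual call resolves to the base class's own version, because the derived part of the object does not exist yet. A minimal sketch, with hypothetical names Base and Derived:

```cpp
#include <cassert>
#include <string>

// Hypothetical base class that records which override its constructor sees.
struct Base {
    std::string seen;
    Base() { seen = name(); }   // virtual call made during construction
    virtual std::string name() const { return "Base"; }
    virtual ~Base() {}
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Returns the name that the Base constructor observed while a Derived
// object was being built.
std::string nameSeenInBaseConstructor() {
    Derived d;
    // Once construction has finished, dispatch reaches the override.
    assert(d.name() == "Derived");
    return d.seen;
}
```

In native C++ the constructor sees "Base", whereas the managed SomeClass constructor above already sees the most derived type and its overrides.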
Declaring Properties
There is an easier and very elegant way (for the user) in C++/CLI of writing get/set methods. A property is a getter and/or setter construct exposed on a class. The accessibility of the getter and setter of the property can be chosen as needed. For instance, it is possible to write a property that has a public getter but a private or protected setter.
Let us say we have a Status class, and it has a few parameters, some of which are writable, some only readable, and some both readable and writable. Following is the code snippet for the above assumption:-
public ref class Status
{
    private: float pressureValue;
    private: float temperatureValue;  // same type as the Temperature property
    private: DateTime recordDateTime;
    public: Status()
    {
        this->RecordTime = DateTime::Now;
    }
    // Readable and writable by everyone.
    public: property float Pressure
    {
        float get()
        {
            return this->pressureValue;
        }
        void set(float pval)
        {
            this->pressureValue = pval;
        }
    }
    // Readable by everyone, writable only by this class and derived classes.
    public: property float Temperature
    {
        float get()
        {
            return this->temperatureValue;
        }
        protected: void set(float tval)
        {
            this->temperatureValue = tval;
        }
    }
    // Readable by everyone, writable only inside this class.
    public: property DateTime RecordTime
    {
        DateTime get()
        {
            return this->recordDateTime;
        }
        private: void set(DateTime dtval)
        {
            this->recordDateTime = dtval;
        }
    }
};
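Users of the Status class write code as shown below:-
ref class UserClass
{
    // The Status instance being used (assumed to be supplied by the caller).
    private: Status^ statusObject;
    public: UserClass(Status^ status)
    {
        this->statusObject = status;
    }
    public: void LogPressure()
    {
        Console::WriteLine("Pressure: {0}", statusObject->Pressure);
    }
    public: void SetPressure(float pval)
    {
        statusObject->Pressure = pval;
    }
};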
Properties are an elegant way of reading and writing data members of a class. When using properties, the client code reads as if it were accessing the data members of the class directly. Properties can be declared on a class, struct, or interface. So, they can be virtual: either get, or set, or both. Properties can be static too, and static applies to the property as a whole.
Besides, there is something called an indexed property. It is essentially a property that provides an indexing operator for the class. The indexing can be multi-dimensional. For instance, consider a class named Manager that has an array of Reportees as a member:-
public ref class Reportee
{
private: String^ reporteeName;
public: Reportee(String^ name)
{
this->Name = name;
}
public: property String^ Name
{
String^ get()
{
return this->reporteeName;
}
private: void set(String^ name)
{
this->reporteeName = name;
}
}
};
public ref class Manager
{
private: cli::array<Reportee^>^ reporteeList;
public: property Reportee^ default[int]
{
Reportee^ get(int index)
{
if (index >= 0 && index < reporteeList->Length)
{
return this->reporteeList[index];
}
return nullptr;
}
protected: void set(int index, Reportee^ robj)
{
if (index >= 0 && index < reporteeList->Length && robj != nullptr)
{
this->reporteeList[index] = robj;
}
}
}
public: property int ReporteesCount
{
int get()
{
return this->reporteeList->Length;
}
}
};
Users of the Manager class write code as shown below:-
ref class SomeUserClass
{
public: void LogReporteeInfo(Manager^ mgr)
{
for (int i = 0; i < mgr->ReporteesCount; ++i)
{
Console::WriteLine("Reportee {0}: {1}", i + 1, mgr[i]->Name);
}
}
};
With the use of properties, methods like GetSomeValue() and SetSomeValue(Value) are replaced by the short, sweet, and elegant obj->PropertyName and obj->PropertyName = SomeValue syntax. It is strongly recommended that a property be used only for getting and setting the corresponding entity of the class, and that it avoid other unrelated operations.
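For comparison, here is a minimal sketch of the conventional accessor pair that a property replaces, written in plain ISO C++ so that it also compiles without /clr. The class NativeStatus and its member names are hypothetical and only mirror the Status class above; they are not part of any library.

```cpp
#include <cassert>

// Hypothetical native counterpart of the Status class, for illustration only.
class NativeStatus
{
public:
    NativeStatus() : pressureValue_(0.0f), temperatureValue_(0.0f) {}

    // Publicly readable and writable, like the Pressure property.
    float GetPressure() const { return pressureValue_; }
    void SetPressure(float pval) { pressureValue_ = pval; }

    // Publicly readable, like the getter of the Temperature property.
    float GetTemperature() const { return temperatureValue_; }

protected:
    // Writable only by the class and its derived classes,
    // like the protected setter of the Temperature property.
    void SetTemperature(float tval) { temperatureValue_ = tval; }

private:
    float pressureValue_;
    float temperatureValue_;
};
```

A caller writes status.SetPressure(1.5f) and status.GetPressure() here, where the property version reads statusObject->Pressure = 1.5f and statusObject->Pressure.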
enums
The following is a typical declaration of a managed enumeration:-
enum class Color
{
Black,
Red,
Blue,
Green
};
The first thing that differentiates managed enumerations from unmanaged enumerations is that managed enumerations must have names. Anonymous managed enumerations are not supported. The other important distinction is that managed enumerations are scoped, which means that a value must be accessed using the name of its enclosing enumeration, and two enumerations can have the same value name. The default underlying type of an enumeration is int, but of course, it can be chosen among the signed and unsigned integer types (int, short, long), char, or bool. Following is a sample showing an enumeration whose underlying type is bool:-
enum class Response : bool
{
Positive = true,
Negative = false,
OK = true,
Cancel = false,
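// Given explicit values: an implicit "previous value + 1" is not meaningful for bool.
Yes = true,
No = false
};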
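The scoping rule described above can be tried out in plain ISO C++ as well, since C++11 later adopted the same enum class syntax for native scoped enumerations. The sketch below assumes a compile without /clr, so these are native rather than managed enumerations. TrafficLight and the ToInt helpers are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

enum class Color { Black, Red, Blue, Green };

// Hypothetical second enumeration: it reuses the names Red and Green
// without clashing with Color, and picks short as its underlying type.
enum class TrafficLight : short { Red, Amber, Green };

// Values must be qualified with the enumeration name (Color::Red),
// and converting them to an integer requires an explicit cast.
inline int ToInt(Color c) { return static_cast<int>(c); }
inline int ToInt(TrafficLight t) { return static_cast<int>(t); }
```

Writing just Red instead of Color::Red or TrafficLight::Red does not compile, which is the point of the scoping.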
Strings
C++ has never had a string class as the type of its string literals. For instance, the type of 2 is int, and the type of 's' is char. But "Hello World" is merely an array of characters (const char[12], counting the terminating null) that is accessed as char * or const char *; there is no inherent string type behind it. In the later years of its evolution, the language gained the efficient and easy STL, which has a std::string class for creating and managing strings. Even then, std::string is not the type of the literal. So, when "Hello World" is passed as an argument to a method:
int StringTest(std::string);
it requires a conversion (using the ctor). All I am trying to say is that there is no inherent string type for a string literal in C++, unlike in C#, where System.String is the type of string literals and methods can be invoked on them directly: "Hello World".Length gives 11.
C++/CLI, being new and state-of-the-art, might give you hope for what was not available in the language for years. Your eager expectation, System::String as the type of string literals, has to be given up. I am sorry to disappoint you. There is still no string type for string literals. But since C++/CLI is a secular [managed/unmanaged] programming language, there are some interesting things to note.
String literals in C++/CLI have the flexibility of associating themselves with (the closest) managed or unmanaged type, based on the context, with managed types being given priority. So, "Hello World" can be treated as System::String^ or const char * or char *. Let us learn that with an example:-
int StringTest(const char *);
int StringTest(System::Object^ strObject);
int StringTest(System::String^ clrString);
int StringTest(std::string stdString);
Now, guess which of the above overloads the following call will bind to:
StringTest("Hello World");
The above call will bind to the System::String^ overload. As I said earlier, managed types are given higher priority in string contexts. In the absence of the System::String^ overload, the call will bind to the overload with System::Object^ as its argument. The unmanaged const char * overload is considered only in the absence of both managed types. Besides that, even among managed types, only those closest to a string are considered for the literal; when none is compatible, the const char * overload takes precedence. Types that require a conversion (using conversion operators or converting ctors) are the last option, which is the case with the std::string overload.
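The unmanaged half of this ranking can be checked with a standard compiler. The following is a minimal sketch in plain ISO C++, so it deliberately leaves out the System::String^ and System::Object^ overloads, which need /clr. The return values 1 and 2 are arbitrary markers chosen here to show which overload was selected.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// In ISO C++ the literal is an lvalue array of 12 const chars:
// the 11 characters of "Hello World" plus the terminating null.
static_assert(std::is_same<decltype("Hello World"), const char (&)[12]>::value,
              "a string literal is an array of const char");

// Needs only the built-in array-to-pointer conversion.
int StringTest(const char*) { return 1; }

// Needs a user-defined conversion through the std::string constructor.
int StringTest(std::string) { return 2; }
```

With both overloads visible, StringTest("Hello World") selects the const char * version, and the std::string version is reached only when the caller builds a std::string explicitly.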
So, what do you think will happen with the following line of code - compilation error, run-time error, runs fine?
int hc = "Hello World"->GetHashCode();
Guesses apart, the above line of code will result in a compilation error. No, don't try to replace the -> with . (dot). It is not that. The compiler finds no context, such as a method call, in which to match the string literal to a managed type, which should convince you that string literals have no inherent System::String type. Period. All the different flavors of type matching for string literals may help us build a C++ world where "Hello World"s are one day System::String. So, try (as much as possible) to write code that binds to System::String.
Arrays – Not [] But cli::array<T>^
A second relief for C++ programmers is the handling of arrays. In the past, an array was not a type in its own right and had to be managed by the programmers themselves. So, the programmer had to be aware of the array boundaries, range checks during access, and such things. With C++/CLI comes an array type. Any managed array is an instance of the cli::array class, which is itself a reference type. It can hold a fixed number of values of a value type or a reference type; fixed means that the size of the array is determined at creation time and cannot be changed afterwards. Following are the typical ways of allocating an array of integers:-
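// Alternative 1: ten elements, all default-initialized to 0.
cli::array<int>^ intArrayA = gcnew cli::array<int>(10);
// Alternative 2: ten elements, the first five given explicit values.
cli::array<int>^ intArrayB = gcnew cli::array<int>(10) { 0, 1, 2, 3, 4 };
// Alternative 3: allocate first, then fill in a loop.
cli::array<int>^ intArray = gcnew cli::array<int>(10);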
for (int i = 0; i < 10; ++i)
{
intArray[i] = i + 1;
}
cli::array<SomeRefType^>^ arrayOfRefs = gcnew cli::array<SomeRefType^>(10);
for (int i = 0; i < 10; ++i)
{
arrayOfRefs[i] = gcnew SomeRefType();
}
- The elements of an array of value types are stored inline in the array object; they are not boxed individually.
- Array indices are zero based.
- The array type has methods for accessing and manipulating the contents of the array.
- All operations on the array are bounds checked. Any access outside the bounds of the array results in an exception (System::IndexOutOfRangeException).
- Arrays are always allocated on the managed heap; hence the values of an array of value types live on the heap, inside the array object.
Note: cli::array in C++/CLI is the emissary of the System::Array type in the BCL. For dynamically growing arrays, use System::Collections::ArrayList or any of the generic collections in the BCL.
interior_ptr<type>
The GC allocates memory contiguously. Whenever the GC reclaims memory from garbage objects, it compacts the heap (just like a disk defragmenter). Doing so changes the addresses of the objects that survived the collection, but the GC updates the existing live references to point to the new locations. Well, such an update cannot be made on a native pointer. We require something pointer-like that is a superset of the native pointer. That means it must be able to point to a native or a managed object, with a seamless syntax. It must allow all operations, arithmetic too, if it points to a native object. The dream has come true. We have the interior_ptr.
- An interior_ptr can point to a member of a reference type, an element of a managed array, or any native object compatible with a native pointer.
- An interior pointer can only be declared on the stack. So, it cannot be declared as a member of a class. Interior pointers can be local variables or method parameters.
- A method that takes an interior_ptr, instead of the equivalent native pointer, has the advantage of a seamless syntax, and works the same way.
Following is an example of using an interior_ptr:-
ref class MgdClass
{
public: int dmNumber;
};
class UnmgdClass
{
public: int dmNumber;
};
void UserMethod()
{
MgdClass^ objRef = gcnew MgdClass();
interior_ptr<MgdClass^> ip1 = &objRef;
(*ip1)->dmNumber = 100;
interior_ptr<int> ip2 = &(objRef->dmNumber);
*ip2 = 200;
UnmgdClass *umObjRef = new UnmgdClass();
interior_ptr<UnmgdClass> ip3 = umObjRef;
ip3->dmNumber = 500;
interior_ptr<int> ip4 = &(umObjRef->dmNumber);
*ip4 = 600;
int num = 1000;
interior_ptr<int> ip5 = &num;
*ip5 = 200;
}
A method that takes an interior_ptr as a parameter instead of a raw pointer has the flexibility to accept any of the interior_ptrs declared above with a matching type. Did you experience the seamless syntax there?
Generics
Generics are to C++/CLI what templates are to C++. But generics are a feature of the CLR, and C++/CLI has its own syntax (as do C# and VB.NET) to expose them. The prime difference between templates and generics is that templates are a compile time feature and generics are a runtime feature. That means template classes and methods are converted, at compile time, into actual executable code (classes/methods) for the types they are instantiated on. So, a template is just a syntax for avoiding code proliferation. Template classes and methods are not identifiable at runtime as they were declared in code; instead, they have compiler generated names. Generics, though providing the facility of templates, are independent types themselves. A generic type is instantiated at runtime. Until then, it exists in the assembly as one among the several types. It can also be exposed to the outside to be used. That means, unlike templates, generic types exist even when no code in the assembly uses them at the time of compilation. Since this section is worth a book of its own, I shall conclude with an example class that has generic methods:-
public ref struct Stack
{
private: System::Collections::ArrayList^ stackElements;
public: Stack(int minSize)
{
this->stackElements = gcnew System::Collections::ArrayList(minSize);
}
public: generic<typename T> void Push(T item)
{
stackElements->Add(item);
}
public: generic<typename T> T Pop()
{
// Popping from an empty stack is a programming error.
Debug::Assert(stackElements->Count > 0);
T item = safe_cast<T>(stackElements[stackElements->Count - 1]);
stackElements->RemoveAt(stackElements->Count - 1);
return item;
}
public: property int Size
{
int get()
{
return this->stackElements->Count;
}
}
};
The code shown above uses generic methods. Generic classes are also possible, like template classes. Client code using the Stack class:-
int main()
{
Stack^ integerStack = gcnew Stack(10);
for (int i = 0; i < 10; ++i)
{
integerStack->Push(i + 1);
}
// T cannot be deduced from a return type, so it is named explicitly.
while (integerStack->Size > 0)
{
integerStack->Pop<int>();
}
return 0;
}
// The same client code, using stack semantics instead of a handle:
int main()
{
Stack integerStack(10);
for (int i = 0; i < 10; ++i)
{
integerStack.Push(i + 1);
}
while (integerStack.Size > 0)
{
integerStack.Pop<int>();
}
return 0;
}
What is worth mentioning is that templates and generics can co-exist. Isn't it cool? A template class can have generic classes and methods, but the other way round is not possible or allowed. Imagine why. I stop here on generics.
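To see the compile time nature of templates for yourself, here is a minimal sketch of a template counterpart of the stack, in plain ISO C++. NativeStack is a hypothetical name introduced only for this illustration. Each instantiation, such as NativeStack<int> and NativeStack<long>, is a separate type generated by the compiler, and the element type is fixed for the whole class rather than per method call.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical template counterpart of the Stack class, for illustration only.
template <typename T>
class NativeStack
{
public:
    void Push(const T& item) { elements_.push_back(item); }

    T Pop()
    {
        // Popping from an empty stack is a programming error.
        assert(!elements_.empty());
        T item = elements_.back();
        elements_.pop_back();
        return item;
    }

    std::size_t Size() const { return elements_.size(); }

private:
    std::vector<T> elements_;
};

// Two instantiations are two unrelated types, both created at compile time.
static_assert(!std::is_same<NativeStack<int>, NativeStack<long> >::value,
              "each template instantiation is a distinct type");
```

No cast is needed in Pop here, because the compiler generated a stack of exactly T; the generic methods above store Object^ in an ArrayList and therefore need safe_cast.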
The Beginning
Well, there is only one way to conclude the article. And let me put it this way. C++/CLI is not uglier, but mightier and superior. The syntax might be a bit wild, and the concepts may be unconventional for a C++ programmer. But on the whole, the real power is unleashed by the capacity of the programmer. What we saw in this article has brought you only to the doors of power programming on the .NET platform. There is a lot more, and it is endless. I hope whatever we discussed here has been useful, and has kindled your interest to delve further. And for that, MSDN is one of the best places that I would recommend.
History
- 6 Sep 2007 - Initial draft.