
Smart pointers for single owners and their aliases

14 Apr 2014 · CPOL · 28 min read
The missing link for complete memory and pointer safety in C++
C++
owner_ptr<T> apT = new T;    //apT is a safe exclusive owner
ref_ptr<T> rT = apT;        //rT is a safe observer

vector<owner_ptr<T, in_collection> > vT;    //a safe exclusive owner collection
ref_ptr<T> rT = vT[i];            //safe observer of an element 

Download source (xnr.zip) - 9.4KB - 3 files ( xnr_ptrs.h, xnr_lean_ptr_engine.h, win_refcount_pool.h)

[ 22/11/2012 - An oversight that left in some old code that prevented compilation has been fixed ]

This article has been superseded by:

A Sensible Smart Pointer Wrap for Most of Your Code

Please use the more mature version presented in the new article

Background

C++ is an excellent language that since its birth has carried with it a deficiency that has caused a great deal of trouble. The deficiency is a lack of proper closure in the language elements used to handle dynamically allocated objects. The new and delete keywords were introduced to encapsulate the creation and destruction of dynamically allocated objects but no new language elements were added to deal with referencing those objects. Instead, this role was filled by the existing C language pointers which were designed for pointing into C arrays.

The trouble caused manifests as memory leaks and dangling pointers. The concept of using pointers was blamed for this and Java was born with its garbage collection solution. The move away from C++ to garbage collection solutions gained further momentum with C# .NET, driven by a fear of C++ and its misbehaving pointers. This drift is to a large extent inappropriate, with negative effects on both the design and performance of the software produced.

The problem in C++ is not that it uses pointers; it is that it uses pointers that are insufficiently defined and constrained. For instance, it is not in the spirit of C++ that a pointer to a single object returned by the new operator should be capable of pointer arithmetic when it is abundantly clear at compile time that use of pointer arithmetic can only result in disaster.

Presented here are the missing language elements in the form of a set of smart pointers. They are not a tool-kit to be used according to judgement and preference; they are to be used as language elements that form closure around the new operator, and they have a formal grammar of interaction. Their use as richer pointer declarations allows completely safe management and referencing of objects returned by the 'new' operator.

The well formed shared_ptr/weak_ptr pair

The std::shared_ptr / weak_ptr pair provide perfect closure around the 'new' operator in the world of shared ownership and the way they operate together was the example that inspired the work presented here.

The two smart pointer types have clear roles: shared_ptr<T> is an owner and by holding a reference to an object it keeps it alive. weak_ptr<T> is an observer; its reference does not keep the object alive, instead it zeroes automatically when the object is destroyed. There is also some grammar of interaction: only the shared_ptr can receive a new object returned by the 'new' operator; weak_ptr<T> = new T will not compile.
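
The owner and observer roles described above can be seen with the standard library alone. The following is a minimal sketch; the Widget type and the function name are illustrative assumptions, not part of the article's code.

```cpp
#include <cassert>
#include <memory>

struct Widget { int value = 7; };   // illustrative type, not from the article

// Returns true if the observer reports the object gone once the owner lets go.
bool weak_observer_expires()
{
    std::shared_ptr<Widget> owner(new Widget);  // the owner receives the result of 'new'
    std::weak_ptr<Widget> observer = owner;     // observes without keeping the object alive

    assert(!observer.expired());                // object is alive while it is owned
    assert(observer.lock()->value == 7);        // lock() yields a temporary owner

    owner.reset();                              // last owner gone: object destroyed
    return observer.expired();                  // the observer now tests as empty
}
```

Calling weak_observer_expires() exercises both states of the observer: valid while the owner holds the object, and empty afterwards.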

The shared_ptr / weak_ptr pair also have the strength of supporting the sharing of references across threads. It is natural that it should do this because the usual need for shared ownership is to provide shared access to general use server type resources. This thread-safety requirement has conditioned their design with the specific consequence that they cannot comfortably or correctly embrace the popular and useful single ownership paradigm. Although single ownership could be seen as a special case of shared ownership, strict single ownership and the thread safety requirement of shared_ptr<T> are incompatible.

To the rescue of single ownership design

There is a popular practice of using shared_ptr for every reference and reserving weak_ptr for the breaking of 'cyclic references'. This is a bad practice and the fact that it produces nonsensical cyclic ownership patterns is an indication of this. Very often the function and design of the code doesn't really require any sharing of ownership, but there just wasn't such a smart set of pointers available for working with single ownership. This is what is provided here. There are some parallels with the shared_ptr / weak_ptr pair but shared and single ownership have different characteristics. One of those is that the single ownership smart pointers provide a much tighter grammar of interaction that helps greatly with keeping code correct, coherent and readable.

Owners and observers

There already exists a range of autonomous single owner smart pointers such as auto_ptr, scoped_ptr and unique_ptr which guarantee their own safety but not the safety of any references or aliases taken from them, which can be left dangling. The missing feature is a safe observer, the equivalent of weak_ptr which zeroes automatically when its pointee is destroyed. As shared_ptr and weak_ptr were designed to work together, the same is required for single ownership. These are provided here as:

owner_ptr<T> single owner of an object that can be observed by a ref_ptr<T>

ref_ptr<T> observer or alias to an owner_ptr<T>, supports -> operator

They work like this:

C++
owner_ptr<T> apT=new T;        // apT owns the object and guarantees to delete it.
if(apT)                    //is valid or tests as NULL
    apT->DoSomething();        //behaves just like a pointer
apT=NULL;                //deletes the object
                //deletes the object when it goes out of scope

ref_ptr<T> rT= apT;        //rT observes the same object while it lives
if(rT)                    //is valid or tests as NULL – knows when object has been deleted
    rT->DoSomething();        //behaves just like a pointer
rT=NULL;                //forgets about the object
                //does not delete the object when it goes out of scope  

There are compiler enforced rules of interaction between these smart pointers:

Only an owner_ptr<T> can receive the raw pointer of a newly created object

Only a ref_ptr<T> can point at an owner_ptr<T>

Only a ref_ptr<T> can point at a ref_ptr<T>

Most importantly an owner_ptr<T> is not allowed to point at another owner_ptr<T> and the assignment operator ‘=’ that suggests this action is expressly prohibited between two owner_ptr<T>s. Not only does this avoid the infamous and anti-intuitive interpretation of ‘=’ as transfer of ownership (as in auto_ptr<T>) it is also an essential key to enforcing coherent rules of grammar.

Ownership can be transferred from one owner_ptr<T> to another but only explicitly by using one of the following verbs:

adopt(owner_ptr<T> opT) ownership is transferred and existing observers or aliases remain valid. This most closely models raw pointers.

steal(owner_ptr<T> opT) ownership is transferred and all existing observers or aliases are reset to null. The only remaining reference to the object is its new owner.

These verbs are available as dot methods of the receiving owner_ptr<T> and also as global functions, so the following are equivalent.
C++
op2.adopt(op1);    //slightly more efficient
op2 = adopt(op1);    //sometimes more convenient    
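
The difference between the two verbs can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the xnr implementation: slot, toy_owner and toy_ref are hypothetical names, and the owner and its observers simply share a small slot that holds the object pointer.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical toy model of the two transfer verbs; not the xnr implementation.
template <class T>
struct slot { T* obj = nullptr; };          // shared between an owner and its observers

template <class T>
class toy_ref {
public:
    explicit toy_ref(std::shared_ptr<slot<T>> s) : s_(std::move(s)) {}
    T* get() const { return s_->obj; }      // null once the object is deleted or stolen
private:
    std::shared_ptr<slot<T>> s_;
};

template <class T>
class toy_owner {
public:
    explicit toy_owner(T* fresh = nullptr) : s_(std::make_shared<slot<T>>()) { s_->obj = fresh; }
    toy_owner(const toy_owner&) = delete;               // no '=' between owners
    toy_owner& operator=(const toy_owner&) = delete;
    ~toy_owner() { release(); }

    // adopt: take over the other owner's slot, so its observers stay valid.
    void adopt(toy_owner& other) {
        release();
        s_ = other.s_;
        other.s_ = std::make_shared<slot<T>>();         // other is left owning nothing
    }
    // steal: move the object into a fresh slot and null the old one,
    // so every existing observer now reads null.
    void steal(toy_owner& other) {
        release();
        s_ = std::make_shared<slot<T>>();
        s_->obj = other.s_->obj;
        other.s_->obj = nullptr;
    }
    toy_ref<T> observe() const { return toy_ref<T>(s_); }
private:
    void release() { delete s_->obj; s_->obj = nullptr; }
    std::shared_ptr<slot<T>> s_;
};

bool adopt_keeps_observers_steal_clears_them()
{
    toy_owner<int> a(new int(1)), b, c;
    toy_ref<int> r = a.observe();
    assert(r.get() != nullptr);
    b.adopt(a);                                         // r still sees the object, now owned by b
    bool valid_after_adopt = (r.get() != nullptr && *r.get() == 1);
    c.steal(b);                                         // r is reset; only c refers to the object
    bool null_after_steal = (r.get() == nullptr);
    return valid_after_adopt && null_after_steal;
}
```

The toy only offers the dot-method form of each verb; the global-function form shown above is not modelled.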

The grammatical power of owner and observer declarations

Code that deviates from these rules will not compile. The effect of this compiler enforced grammar of interaction is surprisingly powerful. Code becomes illuminated by clear declarations of owner or observer status. Even if your understanding of the roles of owner and observer is a bit vague, it will quickly show you the way.

It works like this:

When you use the 'new' operator you have to declare an owner_ptr<T> to receive its return:

C++
owner_ptr<T> apT=new T;     

If you try to use ref_ptr<T> instead, you will get a compiler error. A ref_ptr<T> can't own an object.

If you want to take an alias to this object to use elsewhere you have to declare it as a ref_ptr<T>

C++
ref_ptr<T> rT =   apT;      

If you try to use owner_ptr<T> instead, you will get a compiler error. You can't have two exclusive owners of the same object

You may take an alias from an alias

C++
ref_ptr<T> r2T =   rT;   

but if you point an owner_ptr<T> at an alias ref_ptr<T>, you will get a compiler error. The object, if it exists, already has an owner.
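
How a compiler can enforce this kind of grammar is sketched below with two hypothetical toy types (toy_owner and toy_ref are illustrative names, not the xnr classes). The toy models only which conversions exist; it does no lifetime tracking, and its owner uses an explicit constructor, so the `= new T` spelling is not reproduced.

```cpp
#include <cassert>
#include <type_traits>

template <class T> class toy_ref;

// Hypothetical toy owner: may receive a new object, may not be copied or assigned.
template <class T>
class toy_owner {
public:
    explicit toy_owner(T* fresh) : p_(fresh) {}
    toy_owner(const toy_owner&) = delete;             // no second exclusive owner
    toy_owner& operator=(const toy_owner&) = delete;  // no '=' between owners
    ~toy_owner() { delete p_; }
private:
    T* p_;
    friend class toy_ref<T>;
};

// Hypothetical toy observer: may point at an owner or at another observer.
template <class T>
class toy_ref {
public:
    toy_ref(const toy_owner<T>& o) : p_(o.p_) {}      // alias taken from an owner
    bool valid() const { return p_ != nullptr; }      // copyable: alias from an alias
private:
    T* p_;                                            // not auto-zeroing in this toy
};

// The rules, checked at compile time.
static_assert(std::is_constructible<toy_owner<int>, int*>::value, "owner receives new");
static_assert(!std::is_constructible<toy_ref<int>, int*>::value, "observer cannot own");
static_assert(std::is_constructible<toy_ref<int>, const toy_owner<int>&>::value, "alias of owner");
static_assert(std::is_constructible<toy_ref<int>, const toy_ref<int>&>::value, "alias of alias");
static_assert(!std::is_constructible<toy_owner<int>, const toy_owner<int>&>::value, "no owner copy");
static_assert(!std::is_assignable<toy_owner<int>&, const toy_owner<int>&>::value, "no owner '='");
static_assert(!std::is_constructible<toy_owner<int>, const toy_ref<int>&>::value, "no owner from alias");

bool grammar_demo()
{
    toy_owner<int> owner(new int(3));
    toy_ref<int> alias(owner);      // allowed
    toy_ref<int> alias2(alias);     // allowed
    assert(alias.valid());
    return alias.valid() && alias2.valid();
}
```

Each static_assert corresponds to one of the rules listed earlier; removing a deleted member or adding a forbidden constructor makes the matching assertion fail at compile time.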

Staying single in collections – the 'in_collection' modifier

Both owner_ptr<T> and ref_ptr<T> can be used as elements of collections, including those of the STL.

However owner_ptr<T> must be declared with the “in_collection” modifier as in the following declaration.

C++
vector< owner_ptr<T, in_collection> > v;    

If you use owner_ptr<T> without the “in_collection” modifier in a collection that performs any kind of element copying, as most do, you will get a compiler error. For a single owner, there are some important differences between life as an element of a collection and life as a named variable. The “in_collection” modifier accounts for these differences without compromising the deterministic destruction guarantee of single ownership. An owner_ptr<T, in_collection> interacts with ref_ptr<T> in exactly the same way as a normal owner_ptr<T>.

Warning: Do not use the = operator to transfer ownership between collections of owner_ptr<T, in_collection> as in:

C++
ArrayOfOwners1[i] = ArrayOfOwners2[i] ; // unexpected behaviour 

The result is not undefined, nor will it provoke memory leaks or dangling pointers, but it will not be what you are expecting and is unlikely to be useful. It was not possible to arrange for a compiler error to prevent this from being done.

Instead, follow the rules that the compiler is able to enforce with individual named owners and use the 'adopt' and 'steal' verbs, both of which are perfectly safe:

C++
ArrayOfOwners1[i].adopt(ArrayOfOwners2[i]); //perfectly safe
//and
ArrayOfOwners1[i].steal(ArrayOfOwners2[i]); //perfectly safe

Allowing the infamous destructive copy for special cases – the 'returnable' modifier

The tradition of interpreting ‘=’ as transfer of ownership, also known as destructive copy, was not without reason. It provided a way of transferring ownership and it allowed an owner to be returned from a function.

owner_ptr<T> deliberately forces transfer of ownership to be explicit and visible using the adopt and steal verbs but if you need to return an owner_ptr<T> from a function, then you will need the implicit ‘=’ as transfer of ownership interpretation that owner_ptr<T> specifically prohibits.

For special cases such as class factories and object initialisation routines where there is a need to pass ownership out as the return value, owner_ptr<T> can be declared with the “returnable” modifier as in the example below.

C++
owner_ptr<T, returnable> NewObject()
{
    owner_ptr<T, returnable> pObj= new T;
    //object initialisation
    return  pObj;
}  

The “returnable” modifier enables ‘=’ as transfer of ownership. When passing the return value out of a function, destructive copy has no negative consequences because everything within the function will be destroyed anyway as it goes out of scope.
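
For comparison, standard C++11 covers the same factory case with std::unique_ptr, whose move-only semantics let ownership leave a function through the return value while an ordinary copy is still rejected. This is an analogue from the standard library, not part of the pointer system presented here; Gadget and make_gadget are illustrative names.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Gadget { int id; };   // illustrative type

// Factory: the local owner is moved out with the return value.
std::unique_ptr<Gadget> make_gadget(int id)
{
    std::unique_ptr<Gadget> p(new Gadget{id});
    assert(p != nullptr);
    // object initialisation would go here
    return p;                // ownership leaves the function; nothing is left behind
}

// An ordinary '=' between two named owners is still refused by the compiler.
static_assert(!std::is_copy_assignable<std::unique_ptr<Gadget>>::value,
              "copy assignment between unique owners is disabled");
```

A caller writes `std::unique_ptr<Gadget> g = make_gadget(5);` and becomes the sole owner.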

owner_ptr<T, returnable> has a slightly different grammar of interaction:

It allows implicit transfer of ownership using the = operator:

between one owner_ptr<T, returnable> and another

and from owner_ptr<T, returnable> to any owner_ptr<T> without the “returnable” modifier.

It can only be declared as a local variable.

The 'this' pointer and safe references to objects declared by value

ref_ptr<T> is a safe reference to an object that is already owned, but it can be owned in ways other than by an owner_ptr<T>. An object may be declared by value and therefore owned by the scope of a function or a class, or you may want to take a reference from within the class using the 'this' pointer, in which case you may not know how it is owned.

A base class, gives_ref_ptr<T>, is provided that can be added to the inheritance list of any class. It modifies the class so that it gives safe access to the 'this' pointer through a ref_ptr<T> ref_ptr_to_this() method.
This is analogous to the class std::enable_shared_from_this<T> of the std::shared_ptr/weak_ptr library but the fact that it returns a ref_ptr<T>, rather than a shared_ptr<T>, is a significant difference.
There are no limitations on how a class that inherits from gives_ref_ptr<T> can be declared or owned (other than that it must not be vulnerable to the actions of another thread) and the public method ref_ptr_to_this() will always give you a safe reference to it.

Specifically this enables you to get a ref_ptr<T> to an object declared by value:

C++
class CMyObject : public gives_ref_ptr<CMyObject>
{
 
};
 
void MainFunc()
{
    CMyObject MyObject;
    ref_ptr<CMyObject> rMyObject = MyObject.ref_ptr_to_this();
 
    //or

    FunctionThatTakesRefPtr(MyObject.ref_ptr_to_this());
}

This deals with cases where you would otherwise use the address of operator & to get a pointer to the object declared by value or design the function to take a C++ reference instead.

Unlike pointers and C++ references, ref_ptr_to_this() gives a ref_ptr<T> that is completely safe under all circumstances.
In some cases you may want to take a safe reference of a variable declared by value and you cannot control the class definition. In these cases you can use the referencable_value<T> superclass:

if you have:

C++
CLibraryClass m_LibClass; 

and you want to be able to hold a ref_ptr to m_LibClass then you can declare it as:

C++
referencable_value<CLibraryClass> m_LibClass;  

This will allow

C++
ref_ptr<CLibraryClass> rLibClass= m_LibClass.ref_ptr_to_this(); 

This will not change the behaviour of m_LibClass in any other way so there is no problem with making this adjustment retrospectively.

Getting work done fast – the 'fast' modifier for ref_ptr<T>

There comes a point where (unless the object is controlled by another thread) you know that one test is enough before doing several dereferences – or you think you know.

So you will want to write:

C++
if(r)
{
    r->DoSomething();
    r->DoThis();
    r->DoThat();
}  

ref_ptr<T> is designed to not fall into undefined behaviour even if you decide to omit testing it, therefore it does its own test on each dereference so that it can issue a defined response if the dereference is going to fail. This of course defeats the intention of omitting unnecessary checks to get work done fast.

For this reason a variant of ref_ptr<T> is provided in the form of ref_ptr<T, fast>. It is provided for fast execution of small blocks of code. Rather than checking each dereference, it puts a lock on the object and then dereferences without any further checking.

C++
ref_ptr<T, fast> fr=r;
if(fr)
{
    fr->DoSomething();
    fr->DoThis();
    fr->DoThat();
}  

The effect of the lock depends on the ownership of the object. If it is shared then it simply shares it and keeps it alive. If it is singly owned then this is not possible. In the case of a singly owned pointer being deleted while it is locked by a fast pointer the response is:

In Debug builds: Throw an exception to alert the programmer to a serious error.

In Release builds: Avert imminent disaster by incorrectly keeping the object alive until the fast pointer has finished with it.

Programmers make the decision to test once and dereference several times every day. ref_ptr<T, fast> gives you the opportunity to declare that decision and receive a performance gain. It also ensures that if you got it wrong then you will receive an exception indicating where the error occurred rather than undefined behaviour. Should the error escape detection during development, the release version will do its best to cover it up by bending ownership rules, thus running the risk of disrupting destruction sequences rather than crashing immediately.
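
For the shared-ownership case, where the lock simply shares the object and keeps it alive, the standard library has a direct analogue in std::weak_ptr::lock(). The sketch below shows the "test once, dereference several times" pattern with it. The Job type and run_locked function are illustrative assumptions, and the single-owner behaviour of ref_ptr<T, fast> (exception in Debug, keep-alive in Release) is not modelled.

```cpp
#include <cassert>
#include <memory>

struct Job {                       // illustrative type
    int steps = 0;
    void DoSomething() { ++steps; }
    void DoThis()      { ++steps; }
    void DoThat()      { ++steps; }
};

// Tests the observer once, then dereferences several times under the lock.
// Returns the number of calls made, or -1 if the object was already gone.
int run_locked(const std::weak_ptr<Job>& observer)
{
    if (std::shared_ptr<Job> locked = observer.lock()) {   // the single test
        locked->DoSomething();     // no further checks: 'locked' keeps the Job alive
        locked->DoThis();
        locked->DoThat();
        assert(locked->steps >= 3);
        return locked->steps;
    }
    return -1;
}
```

The temporary shared_ptr plays the role of the lock: while it is in scope the object cannot be destroyed, so the individual dereferences need no checks of their own.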

It should be understood that if the object is not controlled by another thread then there is no possibility of an object being deleted between one line of code and another. The only danger is that an operation in the same code block as the fast pointer inadvertently and indirectly deletes the object. This is rare but it can happen:

A global owner_ptr<T> that owns the object is zeroed.

A back pointer to the parent of the passed in object is used to access its owner and zero it.

There is a call to PeekMessage which enables all sorts of code to execute which may result in destruction of the object.

ref_ptr<T, fast> has a grammar specific to its role.

It can only be declared as a local variable within a function or code block.

Like a const, it must be initialised to its value on construction and its value cannot be changed after that.

A ref_ptr<T, fast> can point at any other kind of pointer – a destination for all!

No other kind of pointer can point at a ref_ptr<T, fast> - the origin of none!

Functions that take smart pointer arguments

Rich pointer declarations introduce a new dimension to passing pointers into functions which is not apparent when you use raw pointers. The argument definition not only specifies the type of the pointed to object but also the role of the pointer. It is the role in the argument definition which determines what is passed into a function even if the pointer presented as parameter has a different role.

The effect is that we can declare a function to take a ref_ptr<T>

C++
bool CheckThisRef(ref_ptr<T> rT)
{
    return rT->CheckMe();
}  

And then pass it an owner_ptr<T>

C++
owner_ptr<T> opT=new T;
 
bool bRes=CheckThisRef(opT); 

The owner_ptr<T> is not passed into the function and its ownership is not affected. Instead a temporary ref_ptr<T> is automatically created to point at it and this is passed into the function.

A very important consideration is the need for a universal role declaration to use for the pointer arguments of general purpose functions.

If the function is for generic use in any situation then the perfect choice is ref_ptr<T, fast>. A ref_ptr<T, fast> can point at any other pointer type, including std::shared_ptr<T> and std::weak_ptr<T>, or even a raw pointer. This means that if ref_ptr<T, fast> is used as the argument type for a function then that function can receive any pointer role as a parameter. It will also provide the fastest possible execution within the function and maintain the safety guarantees of the passed in pointer.

ref_ptr<T, fast> can be used as the argument to a function as long as the function doesn’t try to delete the object it points at or try to store a reference to it for later use. If the function does either of these things then it is not truly generic and it will not compile with ref_ptr<T, fast>. It is a function that is linked to infrastructure, and ref_ptr<T> should be used as the argument instead.

Functions that take raw pointer arguments

If you have to pass an owner_ptr<T> or ref_ptr<T> into an API function that takes a raw pointer argument, it will not automatically convert to a raw pointer. You must use its get_pointer() dot method.

C++
void ApiFunc(T* pT);
 
owner_ptr<T> apT= new T;
 
ApiFunc(apT); //will not compile

ApiFunc(apT.get_pointer()); //ok  

The get_pointer() method is intentionally verbose because giving direct access to the raw pointer endangers the integrity of the pointer system. It keeps potentially harmful breaches of integrity visible.

Passing smart pointers by reference

These smart pointers may be passed into functions by reference and will have the same effects, advantages and limitations as passing anything else by reference. Passing by reference guarantees that you are dealing with the original smart pointer passed and not a copy or temporary reference to it. It also guarantees that no copying is performed and therefore no adjustment of reference counting objects needs to be done. This can be a considerable saving when passing a ref_ptr<T, fast> through a stack of calls, but at the beginning of the chain there has to be a function that receives a ref_ptr<T, fast> by value in order to create it by conversion from the original longer living owner_ptr<T> or ref_ptr<T> that it represents.
The most important consequence to be aware of in passing by reference is the effect on owner_ptr<T>.

e.g.:

C++
void DoStuffToThisOwner(owner_ptr<T>&  rT) //owner passed by reference
{
     rT=NULL;    
}  

And then pass it an owner_ptr<T>

C++
owner_ptr<T> opT=new T;
 
DoStuffToThisOwner(opT);  //opT has been reset to null on return

In this case DoStuffToThisOwner receives a reference to the original owner_ptr<T> and is capable of resetting it. In some cases this may be what you want but most of the time you don't and

C++
void DontDoStuffToThisOwner(ref_ptr<T>  rT) //must be by value to carry out conversion 

makes more sense.

There are plenty of other unintended consequences that may occur in passing by reference but mostly they will result in compiler errors that will allow things to be corrected.

As a general rule: pass smart pointers by reference through long chains of calls but begin those chains passing by value.

Casting smart pointers

These smart pointers implicitly carry out casting from derived to base class:

C++
class T
{
    //.......
};
class U : public T
{
    //.......
};
 
owner_ptr<U> apU=new U;
ref_ptr<T> rT = apU;    //ok, implicit cast 

and also on initialisation:

C++
owner_ptr<T> apT=new U;  //ok  

but NEVER put a cast between the return from 'new' and the assignment to an owner_ptr

C++
owner_ptr<T> apT=(T*)new U; //NEVER do this   

it will prevent the owner_ptr from knowing the size of the original object.
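
The standard library shows the same principle: std::shared_ptr records how to destroy the object from the pointer type it is first given, so it must see the original type. A minimal sketch follows, with illustrative names and using shared_ptr only as a stand-in for the idea; it is not a statement about how owner_ptr is implemented.

```cpp
#include <cassert>
#include <memory>

static int derived_destroyed = 0;

struct Base { /* deliberately no virtual destructor */ };
struct Derived : Base { ~Derived() { ++derived_destroyed; } };

// Returns how many times ~Derived() ran when the object was released
// through a handle typed on the base class.
int destroy_through_base_handle()
{
    derived_destroyed = 0;
    {
        // The constructor template sees a Derived* and records how to delete it.
        std::shared_ptr<Base> p(new Derived);
        assert(derived_destroyed == 0);
    }   // ~Derived() runs here even though Base has no virtual destructor
    return derived_destroyed;
}
```

Had the result of new been cast to Base* first, the handle would only ever have seen a Base*, which is exactly the situation the warning above is about.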

As with raw pointers, it will not implicitly cast from base to derived class, you must provide an explicit cast because only you know that it is the appropriate and correct thing to do. To do this you must use the ptr_cast<U>(any_ptr<T>) function.

C++
owner_ptr<T> apT = new U;
 
ref_ptr<U> rU = ptr_cast<U>(apT);

The null pointer – NULL or null_ptr

NULL, a numerical value of zero, is not the best expression of a null pointer. It can be mistaken for an integer value argument. For this reason a type safe null_ptr is provided that is not a number. You may use NULL or null_ptr and you may mix them but null_ptr is recommended.
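
The ambiguity is easy to reproduce with overloads. The sketch below uses the standard C++11 nullptr keyword as a stand-in for a type safe null; the article's own null_ptr is assumed to behave similarly in this respect, and the function name which is illustrative.

```cpp
#include <cassert>
#include <string>

// Two overloads: one for integers, one for pointers.
std::string which(int)         { return "integer"; }
std::string which(const char*) { return "pointer"; }

void null_overload_demo()
{
    assert(which(0) == "integer");        // a zero literal is taken for a number
    assert(which(nullptr) == "pointer");  // a typed null can only be a pointer
}
```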

Summary of the pointer system


Single owners

owner_ptr<T>

variant for collections

owner_ptr<T, in_collection>

variant for class factories

owner_ptr<T, returnable>

special verbs for transfer of ownership:

adopt(owner_ptr<T>)

existing observers remain valid

steal(owner_ptr<T>)

zeroes all existing references

Observers

ref_ptr<T>

variant for fast execution

ref_ptr<T, fast>

add on base class for object

class gives_ref_ptr<T>

providing ref_ptr<T> ref_ptr_to_this()

super class

referencable_value<T> var;

providing ref_ptr<T> var.ref_ptr_to_this()

All

any_ptr<T>.get_pointer() access to internal pointer – intentionally verbose

ptr_cast<T>(any_ptr<U>) explicit cast to derived class

null_ptr alternative to NULL with greater integrity



Interaction with std::shared_ptr and std::weak_ptr

These new smart pointers coexist with and complement the std::shared_ptr and weak_ptr, and conversions are provided where there is a correct and supportable interpretation.

A ref_ptr<T, fast> can point at a shared_ptr<T>

A ref_ptr<T, fast> can point at a weak_ptr<T>

A shared_ptr<T> can steal

an owner_ptr<T>

an owner_ptr<T, in_collection>

and an owner_ptr<T, returnable>

‘C’ style arrays

The smart pointers presented here provide closure around the 'new' operator as in 'new T'. They should never be used with array new as in new T[array_size]. It would be possible to design smart pointers for use with array new, but to be safe they would need to add bounds checking to their overhead. As 'C' style arrays are usually used to avoid overhead it would probably be a pointless exercise.

If you need an array then in general use a collection class such as std::vector<T> or CAtlArray<T>. It is more convenient, easier to read and safer.

If you really need the raw performance or simplicity of a 'C' style array using new T[array_size], then do so using raw pointers but encapsulate it and present the encapsulation as a single object to the rest of your code.
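
One possible shape for such an encapsulation is sketched below. IntBlock is a hypothetical name and the design (non-copyable, checked at(), debug-asserted operator[]) is only one reasonable choice.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical wrapper: owns one new[] allocation and presents it as a single object.
class IntBlock {
public:
    explicit IntBlock(std::size_t n) : data_(new int[n]()), size_(n) {}  // zero-initialised
    ~IntBlock() { delete[] data_; }
    IntBlock(const IntBlock&) = delete;             // one owner for the raw array
    IntBlock& operator=(const IntBlock&) = delete;

    std::size_t size() const { return size_; }

    // Checked access for code that is not performance critical.
    int& at(std::size_t i) {
        if (i >= size_) throw std::out_of_range("IntBlock::at");
        return data_[i];
    }
    // Raw-speed access; the bound is only asserted in debug builds.
    int& operator[](std::size_t i) { assert(i < size_); return data_[i]; }

private:
    int* data_;
    std::size_t size_;
};
```

The raw pointer never leaves the class, so the rest of the code deals with one object that can itself be held by a single owner.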

Bjarne Stroustrup, the author of C++, has some comments on 'C' style arrays: http://www2.research.att.com/~bs/bs_faq2.html#arrays

What do you need to use these smart pointers?

All you need to start using these smart pointers is the file

xnr_ptrs.h

and you enter into its use with:
C++
using namespace xnr;
 
owner_ptr<MyClass> apMyClass = new MyClass;      // apMyClass is now safe
ref_ptr<MyClass> rMyClass = apMyClass;        //rMyClass is also safe 

or in the case of collections with

C++
CAtlArray<owner_ptr<MyClass, in_collection> > MyClassArray;
MyClassArray.Add(new MyClass);
ref_ptr<MyClass> rMyClass = MyClassArray[0];

Once you are making use of it you may want to consider the enhancement of a reference counting object pool which will prevent fragmentation of memory by residual reference counting objects. This is available on MS Windows platforms by including the file:

win_refcount_pool.h //Not difficult to modify to work with other OSs

above xnr_ptrs.h

and place the macro XNR_REFCOUNT_POOL_INSTANCE at global scope and below these includes in just one of your .cpp files.


More experienced programmers may consider working without the expensive guarantee to delete the complete object that this system provides and instead ensure that all base classes used in ownership have their destructors marked virtual. This will reduce the memory used by the smart pointers, especially in polymorphic collections. To do this place

#define XNR_LEAN_PTR_ENGINE

above xnr_ptrs.h


If you are working on a system that does not support C++ exceptions or exit(EXIT_FAILURE) then you will need to override the in-built exception handling. To do this include a file with the following above xnr_ptrs.h:

C++
namespace xnr{
namespace eng{
 
void xnr_fatal_error(char * csReason)        //it is over – stop the program
{
    //Your choice of what action to take
}
void xnr_exception_error(char * csReason)    //catch and recovery may be feasible
{
    //Your choice of what action to take
}
 
} //namespace eng
} //namespace xnr

#define XNR_ERRORHANDLER_DEFINED    

Can they be misused?

Smart pointers are never idiot proof. They are intended to provide an automated safeguard to replace coding that requires unreasonable diligence to write and maintain correctly.

On the plus side, once you assign a new object directly to an owner_ptr<T>, that object will never be leaked and all references you can take from it are guaranteed to be valid or null. The only things that will break this safety are:

not assigning the new object directly to an owner_ptr<T> correctly

C++
T* pT = new T;        //pT can be used to delete the object independently
owner_ptr<T> apT = pT;    //apT may not be safe
owner_ptr<T> apT =(T*)new U;    // apT never sees the original class U

exposing the raw pointer with the get_pointer() method.

C++
owner_ptr<T> apT = new T;    //safe so far
T* pT =  apT.get_pointer();    //pT can now be used to delete the object independently

You may need to use the get_pointer() method to pass objects into legacy or API functions. When you do, you should ensure that the function does not delete the object, which will break your code, or store it for later use, which may break its code. The verbose get_pointer() method stands as a testament that you have exposed your smart pointer to this potential risk.

Most other misuses will be detected and reported as an error by the compiler but there are some that the compiler will not detect. None of these will violate memory or pointer safety and their effect is defined but probably not what the programmer might be expecting. In each case it is questionable what the programmer would have been intending. They are:

C++
vector<owner_ptr<T,  returnable> > v; //This will compile with disastrous results, like auto_ptr
owner_ptr<T, in_collection> apT = new T; // apT is not in a collection, it is a named variable
vector<owner_ptr<T,  in_collection> > Array1;
vector<owner_ptr<T,  in_collection> > Array2;
// Fill Array1...
Array2[i] = Array1[i];    //This will not transfer ownership, it is closer to sharing it.

The last one does unfortunately allow and undermine a popular (though now deprecated) programming paradigm, examples of which may be found in existing code.

How it works

It is a reference counted smart pointer system. Every smart pointer holds two pointer values on creation. A pointer to the object, initially null, and a pointer to a reference counting object, also initially null as no reference counting object yet exists.

The reference counting objects and their behaviour are different according to which pointer engine is used.

In the case of the default (or safe) pointer engine:

The reference counting objects consist of:

a strong count – a count of owners

a weak count - a count of observers

a pointer to the original object – to ensure its complete destruction

a virtual function – to allow operations on the original object according to its type.

And they are created:

the first time that an alias is taken

C++
owner_ptr<T> apT= new T;      //still no reference counting object
ref_ptr<T> rT =  apT;         //reference counting object created

and any time an owner is implicitly cast to base class

C++
owner_ptr<T> apT= new U;    //reference counting object created 

In the case of the lean pointer engine:

The reference counting objects consist of:

a strong count – a count of owners

a weak count - a count of observers

And they are created:

the first time that an alias is taken

C++
owner_ptr<T> apT= new T;      //still no reference counting object 
ref_ptr<T> rT =  apT;         //reference counting object created

The lean pointer engine has smaller reference counting objects and fewer circumstances create them, but it does not have the guarantee to destroy the complete object. So if you use it you must be diligent and correct in the use of virtual destructors.

In both cases, the pointed to object is destroyed when the strong count falls to zero and the reference counting object destroys itself when its weak count falls to zero.
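
The two counts and their separate lifetimes can be sketched with a toy control block. Everything here (control_block, release_owner, release_observer, blocks_alive) is a hypothetical illustration and not taken from xnr_ptrs.h; as a simplifying assumption, the toy frees the block as soon as neither an owner nor an observer remains.

```cpp
#include <cassert>

static int blocks_alive = 0;       // instrumentation for the demo only

// Hypothetical toy reference counting object.
template <class T>
struct control_block {
    T*  object;
    int strong;                    // count of owners
    int weak;                      // count of observers
    explicit control_block(T* p) : object(p), strong(1), weak(0) { ++blocks_alive; }
    ~control_block() { --blocks_alive; }
};

template <class T>
void release_owner(control_block<T>* cb)
{
    assert(cb->strong > 0);
    if (--cb->strong == 0) {
        delete cb->object;         // the object dies with its last owner
        cb->object = nullptr;      // observers now read null
        if (cb->weak == 0) delete cb;
    }
}

template <class T>
void release_observer(control_block<T>* cb)
{
    assert(cb->weak > 0);
    if (--cb->weak == 0 && cb->strong == 0) delete cb;   // last interested party gone
}

bool counts_demo()
{
    const int before = blocks_alive;
    control_block<int>* cb = new control_block<int>(new int(5));
    ++cb->weak;                                  // an observer is taken
    release_owner(cb);                           // strong reaches 0: object deleted
    bool object_gone = (cb->object == nullptr);
    bool block_kept  = (blocks_alive == before + 1);  // block survives for the observer
    release_observer(cb);                        // weak reaches 0: block deleted
    return object_gone && block_kept && blocks_alive == before;
}
```

The point of the separation is visible in counts_demo(): the object is gone the moment its owner lets go, yet the observer can still safely ask the block and be told null.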

The pointer engines define the base class hierarchy

_has_ref_counts – base class for anything that is associated with a ref. counting object

data member: reference_controller* m_pRC; //pointer to ref. counting object

methods to create and operate on ref. counting object

also contains definition of ref. counting object and its methods

_ptr : protected _has_ref_counts – base class for all smart pointers

extra data member: T* m_pT; //pointer to object

provides a set of verb like primitive methods that the final smart pointer definitions can select from to use as appropriate.

_value_with_counts : protected _has_ref_counts – base class providing protected methods for the gives_ref_ptr public add on base class.

no extra data members

The default (safe) pointer engine is found in xnr_ptrs.h and the lean pointer engine in xnr_lean_ptr_engine.h

The remaining code in xnr_ptrs.h is independent of the pointer engine chosen:

First of all the carefully crafted constructor and assignment code patterns that are repeated throughout the smart pointer definitions, but with small variations, are encapsulated in parametrised macros. This allows much of the final smart pointer definitions to be reduced to a more readable list of allowed conversions and the verbs that execute when they occur.

Finally the smart pointers themselves are defined using the parametrised macros.

Most of the code does what you would logically expect but there are a few unusual turns:


Although the system is focused on single ownership, an integer strong count is used rather than a simple boolean to record whether the object exists. Nothing is lost by this, as a boolean in practice occupies as much space as an integer once the structure is padded for alignment, and the integer allows:

a lock by a ref_ptr<T, fast> to be made by turning the strong count negative

the owner_ptr<T, in_collection> to have an internal strong count that allows it to survive being copied within collections.


The in_collection modifier requires some explanation. Many collection classes make temporary internal copies of their elements, which invokes the = operator. This is prohibited by the normal owner_ptr<T> and will result in a compiler error. Single ownership pointers such as auto_ptr<T> that support the = operator and interpret it as destructive copy will compile, but with disastrous results: the first time an element is copied, ownership is transferred to the copy, and when the copy is destroyed the object is deleted. For this reason it is widely believed that only shared ownership pointers such as std::shared_ptr<T> can be held in such collections. The problem with this is that you are forced into a shared ownership model as soon as you use collections, and there is nothing logical about that.

The in_collection modifier does two things that modify an owner_ptr<T>:

The = operator is provided and interpreted as sharing ownership: it increments the strong count.

On destruction it does not directly destroy the object. Instead it decrements the strong count and only destroys the object when the count reaches zero.

This modification gives it the properties of shared ownership that allow it to survive in a collection but it retains deterministic destruction because zeroing or resetting any element will immediately set the strong count to zero (regardless of its previous value) and delete the object. All other references to the object are effectively zeroed.

There is one loophole with owner_ptr<T, in_collection>. The = operator is provided so that the collection can carry out its internal copies, but there is nothing to stop it from being used between two collections with unintended results, as in:

C++
Array1[i] = Array2[i];   //does not have intended effect   

This will result in the object being referenced by both arrays, but it will be destroyed immediately as soon as it is zeroed in either array, or once it has been removed from both of them. There may be cases where this is the desirable behaviour, but it is not what most people would expect. It would be better to prohibit anything that appears to make two owner pointers equal in visible code, but in this case it was not possible. This construct is prohibited by the compiler for named owner_ptr<T>s. In the case of owner_ptr<T, in_collection>, the programmer must take on the discipline of avoiding it.
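The in_collection behaviour and the loophole just described can be reproduced with a small model. This is a sketch under assumptions, not the xnr_ptrs.h implementation: CollectionOwner, Counts, Tracked and demo_in_collection are hypothetical names, and the separate refs field is simply one way to keep the counts object alive for handles that have been zeroed by a reset elsewhere.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for owner_ptr<T, in_collection>; not the xnr_ptrs.h code.
struct Counts {
    int strong = 1;  // copies that currently share ownership
    int refs = 1;    // handles that keep this counts object alive
};

template <class T>
class CollectionOwner {
public:
    CollectionOwner() = default;
    explicit CollectionOwner(T* p) : obj_(p), rc_(p ? new Counts : nullptr) {}
    CollectionOwner(const CollectionOwner& o) : obj_(o.obj_), rc_(o.rc_) { attach(); }
    CollectionOwner& operator=(const CollectionOwner& o) {
        if (this != &o) { detach(false); obj_ = o.obj_; rc_ = o.rc_; attach(); }
        return *this;
    }
    ~CollectionOwner() { detach(false); }   // soft: just one copy going away

    void reset() { detach(true); }          // hard: destroy the object now
    T* get() const { return (rc_ && rc_->strong > 0) ? obj_ : nullptr; }

private:
    void attach() {
        if (!rc_) return;
        ++rc_->refs;
        if (rc_->strong > 0) ++rc_->strong;  // share ownership while object lives
    }
    void detach(bool hard) {
        if (!rc_) return;
        if (rc_->strong > 0) {
            rc_->strong = hard ? 0 : rc_->strong - 1;
            if (rc_->strong == 0) delete obj_;
        }
        if (--rc_->refs == 0) delete rc_;
        obj_ = nullptr;
        rc_ = nullptr;
    }
    T* obj_ = nullptr;
    Counts* rc_ = nullptr;
};

struct Tracked {                             // counts live instances
    explicit Tracked(int& n) : alive(n) { ++alive; }
    ~Tracked() { --alive; }
    int& alive;
};

inline bool demo_in_collection() {
    int alive = 0;
    std::vector<CollectionOwner<Tracked>> array1(1), array2;
    array2.push_back(CollectionOwner<Tracked>(new Tracked(alive)));
    assert(alive == 1);               // survived the copy into the vector
    array1[0] = array2[0];            // the loophole: two arrays, one object
    array2[0].reset();                // zeroing either element deletes at once
    return alive == 0 && array1[0].get() == nullptr;
}
```

In demo_in_collection the element left in array1 reads as null after the reset of the element in array2, which is the "effectively zeroed" behaviour described above.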


The ref_ptr<T, fast> carries an extra data member, a pointer to a function that allows immediate execution of the chosen dereference mechanism without testing any conditions. This pointer to function is also used to flag the conditions under which the ref_ptr<T, fast> is operating. For this reason four methods are provided that the function pointer can point at even though three do exactly the same thing. This is simply to allow four states to be flagged even though only two mechanisms are required. The two mechanisms are:

If created valid and non-zero then dereference immediately

If created zero then throw an exception

No condition is evaluated on dereference; the function pointer is set on creation according to the locking mechanism.
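The function pointer technique can be shown in isolation. The following is a minimal sketch with only the two mechanisms, not the four flagged states of the real ref_ptr<T, fast>, and it ignores locking and reference counting entirely; FastRef, direct, fail, created_null and demo_fast_ref are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

template <class T>
class FastRef {
public:
    // The dereference mechanism is chosen once, at creation.
    explicit FastRef(T* p)
        : obj_(p), deref_(p ? &FastRef::direct : &FastRef::fail) {}

    // No condition is tested here: just an indirect call.
    T& operator*() const { return deref_(obj_); }

    // The function pointer doubles as a state flag.
    bool created_null() const { return deref_ == &FastRef::fail; }

private:
    static T& direct(T* p) { return *p; }
    static T& fail(T*) { throw std::logic_error("dereferenced a null reference"); }

    T* obj_;
    T& (*deref_)(T*);
};

inline bool demo_fast_ref() {
    int x = 7;
    FastRef<int> good(&x);
    FastRef<int> bad(nullptr);
    assert(!good.created_null() && bad.created_null());

    bool threw = false;
    try {
        (void)*bad;                      // goes straight to fail()
    } catch (const std::logic_error&) {
        threw = true;
    }
    return *good == 7 && threw;
}
```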


The following primitive methods of the _ptr class require explanation:

_reset() //decrements strong count and if zero, deletes object

_quiet_reset() //decrements strong count and never deletes object

_hard_reset() //sets strong count to zero and deletes object

_quiet_hard_reset() //sets strong count to zero and never deletes object

'quiet' means don't delete the object

'hard' means set strong count directly to zero
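The four primitives differ only along the 'quiet' and 'hard' axes, which a toy model makes easy to see. In this sketch the strong count is held directly in the class for brevity, whereas the article keeps it in the reference counting object; PtrCore, Probe and demo_resets are hypothetical names, and only the four method names come from the list above.

```cpp
#include <cassert>

template <class T>
class PtrCore {
public:
    PtrCore(T* p, int strong) : m_pT(p), m_strong(strong) {}

    void _reset()            { if (--m_strong == 0) destroy(); }  // decrement, delete at zero
    void _quiet_reset()      { --m_strong; }                      // decrement, never delete
    void _hard_reset()       { m_strong = 0; destroy(); }         // zero, delete
    void _quiet_hard_reset() { m_strong = 0; }                    // zero, never delete

    int strong() const { return m_strong; }

private:
    void destroy() { delete m_pT; m_pT = nullptr; }
    T* m_pT;
    int m_strong;
};

struct Probe {                        // counts how many Probes were destroyed
    explicit Probe(int& n) : deleted(n) {}
    ~Probe() { ++deleted; }
    int& deleted;
};

inline int demo_resets() {
    int deleted = 0;

    PtrCore<Probe> a(new Probe(deleted), 2);
    a._reset();                       // 2 -> 1: object kept
    assert(deleted == 0);
    a._reset();                       // 1 -> 0: object deleted
    assert(deleted == 1);

    PtrCore<Probe> b(new Probe(deleted), 3);
    b._hard_reset();                  // 3 -> 0 in one step: object deleted
    assert(deleted == 2);

    Probe* kept = new Probe(deleted);
    PtrCore<Probe> c(kept, 3);
    c._quiet_hard_reset();            // 3 -> 0: object deliberately not deleted
    assert(c.strong() == 0 && deleted == 2);
    delete kept;                      // the caller is responsible for it now

    return deleted;
}
```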


The desirability of using a reference counting object pool requires explanation.

You will be looking at this smart pointer system, and may have played with the std::shared_ptr/weak_ptr pair, because you want to systematically avoid the possibility of an invalid pointer being accessed. Ensuring that, whenever you delete an object, you also set every alias pointer to it to null is a laborious and uncertain task. It results in untidy code that is easily broken by the addition of any new alias.

You may just want to ensure that any omissions in your alias zeroing code don't cause problems, or you might want to take the logical step of not writing that code at all and instead having it all done automatically. Either way, you are anticipating that the explicit zeroing of aliases in your code may be less than complete.

Now let's look at what not explicitly zeroing aliases does:

Using unprotected raw pointers:

An invalid reference can be used, with undefined and potentially catastrophic results.

Using a garbage collector or simulating it with std::shared_ptr:

The object referenced will be retained in memory and the pointer will give access to valid data or valid code though that may not be the intention of the programmer.

Using shared_ptr<T> for owners and weak_ptr<T> for observers or the single ownership smart pointer system presented here:

A small reference counting object will be retained in memory to indicate to all aliases that the object no longer exists. It will remain until all aliases have been explicitly zeroed, tested and found null or have fallen out of scope.

To summarize: The effects of incomplete explicit zeroing of aliases are:

Raw pointers – unpredictable crashes

Garbage collector – large objects unintentionally retained in memory allowing unintended accesses.

Owner & observer smart pointers – small reference counting objects retained in memory, no unintended accesses possible.
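The second and third cases can be observed directly with the standard library. This sketch uses only std::shared_ptr and std::weak_ptr; Big and the two function names are hypothetical. The object is created with new rather than std::make_shared so that the control block is a separate small allocation, which matches the situation described above.

```cpp
#include <cassert>
#include <memory>

struct Big {
    explicit Big(bool& flag) : alive(flag) { alive = true; }
    ~Big() { alive = false; }
    bool& alive;               // reports whether this object still exists
    char payload[4096];        // stands in for a large object
};

// Garbage-collector style: the forgotten alias shares ownership.
inline bool shared_alias_retains_object() {
    bool alive = false;
    std::shared_ptr<Big> owner(new Big(alive));
    assert(alive);
    std::shared_ptr<Big> alias = owner;
    owner.reset();             // the "owner" lets go
    return alive;              // true: the alias keeps the whole object in memory
}

// Owner and observer: the forgotten alias only pins the control block.
inline bool weak_alias_sees_deletion() {
    bool alive = false;
    std::shared_ptr<Big> owner(new Big(alive));
    assert(alive);
    std::weak_ptr<Big> alias = owner;
    owner.reset();             // the object is destroyed here
    return !alive && alias.expired() && alias.lock() == nullptr;
}
```

In the first function the 4 KB object stays in memory for as long as the alias exists; in the second only the small control block does, and the alias can no longer reach the object.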

The reference counting objects are small, but they will sit there until all references to them disappear, just as larger objects do with a garbage collector. Even if you abandon explicit zeroing of aliases completely, resulting in a lot of persistent reference counting objects, the total amount of memory used is not a problem. The problem is that if the reference counting objects are each created on the heap with the 'new' operator, they will tend to get interspersed in memory among the objects that they are associated with. When those objects are deleted, the reference counting objects will remain, fragmenting the memory that has been released.

Now a great deal of shared_ptr code has been written without worrying about this problem. Memory is cheap and plentiful, but hey, a lot of C++ code is well structured to allow the return of clean blocks of memory, so let's not spoil it with our smart pointer system!

The reference counting object pool pre-allocates space for all reference counting objects in successive blocks with a mildly exponential growth rate. This ensures that they are held together in blocks of memory rather than being scattered everywhere, thus minimising fragmentation. Allocation from the pool is also faster than allocating each one from the heap. When the pool is activated, the reference counting objects have their 'new' and 'delete' operators overloaded so that they work directly with the pre-allocated pool rather than with the heap.
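A pool of this kind can be sketched with a free list and class-specific operator new and operator delete. This is an illustration under assumptions and win_refcount_pool.h may work quite differently: Pool, Counts, counts_pool and demo_pool are hypothetical names, and the initial block of 16 slots and the growth factor of 1.5 are arbitrary choices. The sketch is not thread safe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-size slot pool that grows in successively larger blocks.
template <class T>
class Pool {
    union Slot {
        Slot* next;                                   // link while the slot is free
        alignas(T) unsigned char bytes[sizeof(T)];    // storage while in use
    };

public:
    Pool() = default;
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;
    ~Pool() { for (Slot* b : blocks_) delete[] b; }

    void* allocate() {
        if (!free_) grow();
        Slot* s = free_;
        free_ = s->next;
        return s;
    }
    void deallocate(void* p) {
        Slot* s = static_cast<Slot*>(p);
        s->next = free_;
        free_ = s;
    }
    std::size_t block_count() const { return blocks_.size(); }

private:
    void grow() {
        Slot* block = new Slot[next_size_];
        blocks_.push_back(block);
        for (std::size_t i = 0; i < next_size_; ++i) {
            block[i].next = free_;
            free_ = &block[i];
        }
        next_size_ += next_size_ / 2;   // mildly exponential: 16, 24, 36, ...
    }
    std::vector<Slot*> blocks_;
    Slot* free_ = nullptr;
    std::size_t next_size_ = 16;
};

struct Counts {                          // the reference counting object
    int strong = 1;
    int weak = 1;
    static void* operator new(std::size_t);
    static void operator delete(void* p);
};

inline Pool<Counts>& counts_pool() { static Pool<Counts> pool; return pool; }
inline void* Counts::operator new(std::size_t) { return counts_pool().allocate(); }
inline void Counts::operator delete(void* p) { if (p) counts_pool().deallocate(p); }

inline bool demo_pool() {
    Counts* a = new Counts;              // slot taken from the pool, not the heap
    void* a_slot = a;
    delete a;                            // slot goes back on the free list
    Counts* c = new Counts;              // and is handed out again
    bool reused = (static_cast<void*>(c) == a_slot);

    std::vector<Counts*> many;
    for (int i = 0; i < 20; ++i) many.push_back(new Counts);
    bool grew = counts_pool().block_count() == 2;   // 16 slots, then 24 more
    for (Counts* p : many) delete p;
    delete c;
    return reused && grew;
}
```

Because every Counts lives inside one of the pool's blocks, deleting the objects they describe leaves no small allocations scattered through the released memory.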


Recommendation

This smart pointer system has been supporting some complex and growing code for some years now and has worked well. That does not mean that every possible corner case has been tested, but if you can find a fault, I'm sure we can either fix it or define it as a limitation of its scope.

Personally I would not like to have to work without it. Please try it out and build with it!

Motto

Functional and tidy overhead is good engineering. Bad engineering is allowing a mess to be made.

History

This work is derivative of and supersedes XONOR pointers: eXclusive Ownership & Non Owning Reference pointers, previously published on The Code Project by the same author.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Retired
Spain
Software Author with engineering, science and mathematical background.

Many years using C++ to develop responsive visualisations of fine grained dynamic information largely in the fields of public transport and supply logistics. Currently interested in what can be done to make the use of C++ cleaner, safer, and more comfortable.

