This document describes the code in the file 'main.cpp', which is provided as an example of an idea for automating memory management that can solve the problem of cyclic references.
Boost contains nice memory management classes: shared_ptr, weak_ptr, intrusive_ptr. Memory management using these classes is very nice, but reference counting has one flaw: it can not handle cyclic references.
In order to handle cyclic references with reference counting, the programmer is left with the task to break cycles using weak pointers. Such a task may not scale well in time and space: as a project becomes larger and more complicated, cycles are harder to spot, especially when cycles are introduced via subclasses.
The problem gets worse when a project evolves over a long period of time, with different programmers assigned to a project. During transition from one person to another, knowledge of cycles may be lost.
The code in the file 'main.cpp' implements a solution for automating memory management, borrowing some ideas from true garbage collectors. The solution can handle referential cycles and also delete objects in a good order (finalization order is a well known problem for garbage collectors) when the objects are no longer referenced anywhere from the program.
The presented solution is not a panacea that solves all the aspects of memory management. At best, it can be thought of as complementary to existing methods.
HOW GARBAGE COLLECTORS WORK
In true garbage collectors, the collector scans the object graph, starting from the root set, marking each visited object as reachable. Objects left untouched at the end of the collection are deemed unreachable and therefore collected. For example, if we have 3 root references R1, R2, and R3, and objects A, B and C:
R1 ---> A ---> B
R3 ---> C
If the reference R3 is removed, then the object graph becomes:
R1 ---> A ---> B
The collection algorithm execution goes like this:
1) from R1, A is reachable.
2) mark A as reachable.
3) scan A.
4) from A, B is reachable.
5) mark B as reachable.
6) scan B.
7) from R2, B is reachable.
8) B is already marked, do not mark it again.
9) A, B is reachable; C is not reachable.
10) collect C.
PROBLEMS IN C++ USING THE TRADITIONAL APPROACH
In order for this to work, one must know the root set. Unfortunately, in C++ the root set is not known (compiler assistance is required for this), so this algorithm can not be applied.
One solution would be to register root pointers to a ptr collection, either explicitly or implicitly in the constructor of a special pointer class. But that does not work with stl containers, because stl containers themselves must use the specified pointer classes. Unfortunately, this is not always true. For example:
|--> R1 ---> A ---> B
R3 ---> C
But when we have an STL container:
|--> R1 ---> A ---> B
| | |
|--> R2 --------|
|--> R3 ---> C
|--> R4 ----> STL CONTAINER
| ~~> D
| ~~> E
In the above diagram, the arrow '~~>' represents a non-garbage collected link (a shared_ptr perhaps). When the collection runs:
1) from the root ptr collection, the reference R4 is reached.
2) from R4, the STL CONTAINER is reached.
3) C and D are not reachable.
4) collect C and D.
So, C and D will be collected, even though there is an STL container that refers to these objects.
ANOTHER APPROACH THAT SOLVES THE PROBLEM
In the traditional approach:
An object is collectable when it is no longer accessible by any path that starts from the root set.
The above sentence can be reversed without altering the essence of the collection algorithm:
An object is collectable when it is no longer accessible by any path that ends to the root set.
Using this reversed approach, a set of smart ptr classes can be created which can be used to check if there is any path to a root ptr. The original example with the three references R1, R2, R3 and objects A, B and C becomes:
R1 <--- A <--- B
R3 <--- C
When a reference is removed, we can check if there is another path that leads to a root reference. If there is no root reference that points to the object, then the object can be safely deleted. For example, if the reference R3 is removed, our graph becomes:
R1 <--- A <--- B
And now the object C can be safely deleted.
DESIGN AND IMPLEMENTATION
In the file 'main.cpp', the following classes are defined:
1) class 'object', as a base class that contains a list of pointers that point to it. This is necessary so as that the object graph can be
traversed backwards towards the roots.
The object class contains a flag 'm_lock' which is used to prevent infinite loops during checking of an object for collectability.
2) class 'ptr_base', which is a generic ptr class that encapsulates the logic of acquiring and releasing an object:
a) The operation 'acquire' adds the pointer to the list of pointers of the pointed object.
b) the operation 'release' removes the pointer from the list of pointers of the previously pointed object. It then calls the object to dispose itself, if it is not collectable. This is where the graph is scanned backwards towards the roots.
The class 'ptr_base' is also an intrusive list node (using boost::intrusive) for performance reasons, since a pointer can only point to a single object at a time.
The class 'ptr_base' contains a pointer to an owner object, set during construction of a ptr instance. This pointer is used for knowing when a ptr is root: if a pointer is declared without an owner, then it is a root ptr.
3) a class 'ptr<T>', which is a simple derived class that adds type capabilities to class 'ptr_base'. The class has normal pointer semantics, as well as stl pointer semantics (functions 'get()' and 'reset()').
When an object is disposed, it is checked if it is collectable or not: the graph of objects is traversed backwards to the roots. If the object is collectable, it is deleted.
Deleting the objects causes the member pointers of an object to release their own objects recursively, in natural order (i.e. in c++ destruction order).
HOW DOES IT SOLVE THE PROBLEM OF CYCLES ?
If there is a cycle, then the dispose function makes sure that all the pointers pointing to the disposed object are released before the object is deleted, thus resolving the cycle.
HOW DOES IT WORK WITH STL CONTAINERS ?
STL containers contain root references, i.e. instances of ptr<T> that do not have an owner. Therefore, as long as an object is referenced from an STL container, it will not be collected, because it will be accessible from the root set of the STL container.
The STL container itself can be managed via shared ptrs. The example provided in 'main.cpp' contains an example with an std::list.
EAGER VS LAZY DISPOSAL
Trying to dispose an object each time a pointer changes value may be a very slow operation, if the object graph is too complex. A solution to this problem is to postpone the checking for a later date, i.e. to have lazy disposal.
The classes 'collector' and 'collector_object' work together to provide the functionality for lazy disposal.
The file 'main.cpp' contains two tests:
1) a test with trees (nodes pointing to parent, previous and next siblings, first and last children) with eager disposal.
2) the same test with lazy disposal.
In both cases, all objects are collected as required.
The test has been executed in Visual Studio 2005.