Click here to Skip to main content
Click here to Skip to main content

AGM::LibGC: a C++ garbage collection library

By , 30 Jul 2004
 

Introduction

LibGC is a very small library (around 500 lines of code) that gives garbage collection capabilities to C++. It is developed, tested, and used under MSVC++ 6.0.

The usual solution for memory management in the C++ environment (aside from using garbage collectors, like the Hans - Boehm one) is reference counting. But reference counting has some fundamental problems:

  1. cyclic references
  2. slow speed of execution
  3. difficulty in programming

The problem no. 1) is due to having a counter that counts how many other objects point to a certain object; the counter of the object must reach 0 in order for the object to be released. Unfortunately, when one object A points to another object B and that object B points back to A, these two objects will never be released since there is a cycle between them.

The problem no. 2) is due to the way reference counting works. Within every assignment, the managed pointer class has to do the following steps:

  1. lock resources (if the solution is multithreaded)
  2. compare current pointer with new one; if there is no change, do nothing
  3. store the current pointer value in a temporary variable
  4. copy the value of the parameter pointer to the pointer member
  5. increase the reference count of the new object
  6. decrease the reference count of the previous object
  7. delete the previous object, if the reference count reaches 0
  8. unlock resources (if the solution is multithreaded)

All the above operations take place in every assignment; they are very expensive and can make a program very slow, especially if the program has thousands of pointers (for example, on the stack).

The problem no. 3) is due to the way reference counting works. Special care should be taken in handcrafting destructors in order to avoid multiple deletions of the same memory block, a task that is quite complicated. Many programmers tend to make wrapper classes around shared objects in order to avoid this problem, but then they have to manage two classes with the same interface: the wrapper class and the internal class.

The solution offered by LibGC has none of these disadvantages. Objects that are referenced in cycles are collected and deleted normally, the library is generally quite fast in doing garbage collection, and there is no need for wrapper classes.

LibGC uses the conservative mark-and-sweep stop-the-world algorithm to collect the garbage data. The programmer writes C++ as usual, with pointers and such, but objects need not be deleted. Object deletions work as usual.

The license of LibGC is the LGPL.

Using LibGC is very simple; just program normally as you always did:

  • allocate objects with operator new.
  • allocate objects that need finalization with macro GC_NEW.
  • inherit your classes from class Object in order to be automatically finalized without the usage of GC_NEW.
  • if you need to delete objects, delete them with operator delete.
  • declare objects on the stack.
  • declare objects globally (in the program's data section).
  • manually do garbage collection by calling the function doGC().

Deletions work normally: you can delete an object anytime, and the memory occupied by the object is no longer garbage-collected.

When the application exits, all objects are finalized, and memory is freed.

The class Object is in the namespace agm::gc.

Example:

//use garbage collection
#include "gc.hpp"
using namespace agm::gc;

//include some STL class for demonstrating the global use of operator 'new'
#include <list>
using namespace std;

//a class that is not automatically finalized
class Foo {
public:
};

//a class that is automatically finalized
class Bar : public Object {
public:
};

//main
int main()
{
    //allocate a Foo object with macro GC_NEW 
    //because it is not automatically finalized
    Foo *obj1 = GC_NEW(Foo)();

    //allocate an STL list that is not automatically finalized;
    //nodes will be freed by the collector
    list<INT> *obj2 = new list<INT>;

    //allocate a Bar object with operator 'new' 
    //that is automatically finalized
    //since it inherits from Object
    Bar *obj3 = new Bar;
}

The library will collect garbage automatically as soon as more than GC_THRESHOLD bytes have not been collected.

  • Note 1: The library has been developed and tested under MSVC++ 6.0. It will probably run under other compilers, too. If you want to test it and tell me, feel free to do so: you will be added in the list of contributors.
  • Note 2: Exceptions must be enabled.
  • Note 3: The solution does not work for multithreaded applications (yet).

For more information, you may visit my little site here.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

About the Author

Achilleas Margaritis
Software Developer (Senior)
Greece Greece
Member
No Biography provided

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralRe: Garbagememberaxilmar3 Aug '04 - 6:21 
Nope. Real measurement. The bigger the project is, the more time we spent dealing with memory errors. With garbage collection, all this time is spent productively.
 
I think the next C++ standard must support GC (optionally, through an attribute).

GeneralRe: GarbagememberNathan Holt at CCEI3 Aug '04 - 6:53 
axilmar wrote:
Nope. Real measurement. The bigger the project is, the more time we spent dealing with memory errors. With garbage collection, all this time is spent productively.
 
That's interesting. I use smart pointers (usually not reference counted), and STL containers, so I hardly ever have to explicitely call delete. Do you normally use plain pointers when not using your GC to get this improvement?
 
Nathan Holt
GeneralRe: GarbagememberNemanja Trifunovic3 Aug '04 - 6:56 
Same here. Haven't had a memory problem for ages.
 


My programming blahblahblah blog. If you ever find anything useful here, please let me know to remove it.
GeneralRe: Garbagememberaxilmar3 Aug '04 - 23:52 
Our applications have various types of object models with parent-child relations(and many other types of cyclic relations, such as circular lists); smart pointers can't be used there.

GeneralRe: GarbagememberNathan Holt at CCEI4 Aug '04 - 5:47 
axilmar wrote:
Our applications have various types of object models with parent-child relations(and many other types of cyclic relations, such as circular lists); smart pointers can't be used there.
 
It's interesting that neither STL nor boost seem to have a circular list template. I've used std::list for that, with special functions to get next and previous elements in a circular manner, but it seems that a container could be designed to make this easier.
 
I notice you mentioned parent-child relationships. It seems to me that smart pointers and containers can easily handle such a relationship. If a parent object uses scoped pointers to point to its children, or keeps them in an STL container, the child objects will automatically be deleted when the parent is.
 
In general, I try to give child objects as little information as possible about their parents in order to make them more reuseable.
 
Nathan HOlt
GeneralRe: Garbagememberaxilmar4 Aug '04 - 6:44 
It's interesting that neither STL nor boost seem to have a circular list template
 
I've made one myself.
 
It seems to me that smart pointers and containers can easily handle such a relationship
 
Nope, they can't. Suppose that you have:
 
class Parent {
public:
    SmartPtr<Child> child;
};
 
class Child {
public:
    SmartPtr<Parent> parent;
};
 
int main()
{
    SmartPtr<Parent> p = new Parent;
    p->child = new Child;
}
 
In the above example, the parent and child allocated will never be freed (because the ref count of parent will never drop to 0).
 
In general, I try to give child objects as little information as possible about their parents in order to make them more reuseable.
 
In general, me too. But in practice, children need to address the parent's interfaces, so it is not so easy.
 
There is also the other problem of double deletions. For example:
 
class Foo {
public:
    SmartPtr<Foo> foo;
};
 
int main()
{
    Foo foo1;
    foo1.foo = &foo1;
}

 
In the above code, when the Foo.foo smart pointer will be deleted, it will delete the object it belongs!
 
When I talk about GC, I never throw out other memory management solutions. The advantage of C++ is that you can use whatever method it suits the problem to manage memory. It happens that GC is the easiest to use and most generic solution, though.

GeneralRe: GarbagememberNathan Holt at CCEI4 Aug '04 - 7:17 
axilmar wrote:
Nope, they can't. Suppose that you have:
class Parent {
public:
    SmartPtr child;
};
 
class Child {
public:
    SmartPtr parent;
};
 
int main()
{
    SmartPtr p = new Parent;
    p->child = new Child;
}

 
If the pointer to the parent is needed, why should it be a smart pointer? Also, unless the parent is sharing its child, its pointer would not be a reference counting pointer, but instead be a simpler scoped pointer like boost::scoped_ptr<Child>
 
In the rare cases in which a parent class does share its child classes with the possibility of outliving their parent, I have made them communicate with their parent with a signal/slot type system, which is designed to automatically disconnect when either object is destroyed.
 

axilmar wrote:
There is also the other problem of double deletions. For example:
 
class Foo {
public:
    SmartPtr foo;
};
 
int main(){
    Foo foo1;
    foo1.foo = &foo1;
}
 
In the above code, when the Foo.foo smart pointer will be deleted, it will delete the object it belongs!

 
I see what you meant by double deletions. I don't think I've ever created a reference counted object on the stack. For that matter, hardly any of my reference counted objects have contained reference counting pointers.
 
Nathan Holt
GeneralRe: Garbagememberaxilmar5 Aug '04 - 0:05 
If the pointer to the parent is needed, why should it be a smart pointer?
 
Because something else might point to parent with a shared pointer.
 
I have made them communicate with their parent with a signal/slot type system
 
Signals and slots is about callbacks. You don't have compile-time access to the interfaces of either the caller or the callee.
 
I don't think I've ever created a reference counted object on the stack. For that matter, hardly any of my reference counted objects have contained reference counting pointers.
 
It all depends on what you are doing. What I showed you was just an example; an inexperienced programmer might declare a reference-counted object on the stack. In one of the apps for our customers, we had a large object model with many inter-relationships between classes, and smart pointers created many problems.
 
GC is not a panacea, but it is a good solution that is generic enough to:
 
1) not demand specific constructs
2) allow for retrofitting of previously written programs
 
It is a fire-and-forget solution with specific advantages and a few sort disadvantages.

GeneralRe: GarbagememberNathan Holt at CCEI5 Aug '04 - 6:44 
axilmar wrote:
If the pointer to the parent is needed, why should it be a smart pointer?
 
Because something else might point to parent with a shared pointer.

 
I don't understand that. If the parent is tracked with shared pointers, it shouldn't make much difference, since when the last shared pointer is destroyed and the parent object is deleted, it should still delete the child object, along with the reference. If the child object is shared, the parent can still tell the child to remove its reference one way or another. There are a number of classes around to automate this.
 

axilmar wrote:
I have made them communicate with their parent with a signal/slot type system
 
Signals and slots is about callbacks. You don't have compile-time access to the interfaces of either the caller or the callee.

 
I am not sure what you mean by this. The system I rolled for myself was template based and strongly typed. I used it because it made it easy to design the child object without relying on details of the parent, thus making the child reuseable.
 

axilmar wrote:
It all depends on what you are doing. What I showed you was just an example; an inexperienced programmer might declare a reference-counted object on the stack. In one of the apps for our customers, we had a large object model with many inter-relationships between classes, and smart pointers created many problems.
 
That makes sense. I guess being the only programmer at my company saves me from that. Smile | :) Fortunately, I've been able to force the object models of my projects into a hierarchy that saves me from the worst of the inter-relationships. I don't think I've ever worked on a really large project.
 
axilmar wrote:
GC is not a panacea, but it is a good solution that is generic enough to:
 
1) not demand specific constructs
2) allow for retrofitting of previously written programs
 
It is a fire-and-forget solution with specific advantages and a few sort disadvantages.

 
I'm sure its useful, but I don't think its quite a fire and forget solution. Even when deleting objects in the order that they're allocated, I think its possible to end up with a destructor trying to delete an object that's already deleted. For instance, std::list::splice can move elements from one list to another.
 
I'll admit that some of my suspicions come from issues with .net languages, in which there are issues with making objects GC safe. In particular, finalizers have to assume that all the pointers are invalid, because the GC may have already deleted them.
 
Nathan Holt
GeneralRe: Garbagememberaxilmar5 Aug '04 - 10:04 
There are a number of classes around to automate this.
 
Sure there are, but in order to avoid the fuss, I prefer GC.
 
I am not sure what you mean by this.
 
The child needs to call methods of the parent.
 
I guess being the only programmer at my company saves me from that.
 
I understand your position. I was like that before our company employ metrics. Unfortunately, when your product measures 500,000 lines of code and above, it is very difficult to use effectively all the C++ tricks from all programmers.
 
Even when deleting objects in the order that they're allocated, I think its possible to end up with a destructor trying to delete an object that's already deleted. For instance, std::list::splice can move elements from one list to another.
 
It's easy to fix that: at first all objects are finalized, and then the remaining objects are destroyed.
 
I'll admit that some of my suspicions come from issues with .net languages
 
There is no problem with garbage collection: it works. It may be that my implementation may have a bug or too, but I will fix them. No piece of code is without bugs, at least at first stages of its life.
 
Thanks for the tips.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web04 | 2.6.130523.1 | Last Updated 31 Jul 2004
Article Copyright 2004 by Achilleas Margaritis
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid