|
|
1. cyclic references.
This is really a design problem. It has nothing to do with smart pointers or the reference counting technique. Multi-threaded code always requires some locking scheme to be implemented.
2. slow speed of execution.
I can't see how a garbage collection library can execute "faster". In fact, it wastes many more resources at runtime due to the threshold.
3. difficulty in programming.
Well, if a C++ system cannot determine the state of an object, that only makes it more difficult to program. Garbage collection sounds good, but it is not really necessary in most cases.
Lastly, how is your GC better than the .NET GC? Cross-platform maybe, but that's the power of C/C++, not really of a scheme that automatically deletes pointers.
Just some quick thoughts, good try though.
|
|
|
|
|
This is really a design problem.
No, it is not. Objects that point to each other either directly or indirectly will never be freed. Google for it, if you don't believe me.
I can't see how a garbage collection library can execute "faster".
It's simple: there are thousands of assignments taking place in an application every second, but far fewer allocations... especially in C++, where there are static objects.
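To make the assignment-versus-allocation argument concrete, here is a rough sketch (not the library's code) of an instrumented reference-counted pointer. The names `CountedPtr`, `Counted`, `make_counted` and the two counters are made up for illustration. It only counts how often the reference count is touched; it says nothing about what a collection pass costs by comparison.

```cpp
#include <cassert>

// Hypothetical counters, for illustration only.
static long refcount_updates = 0;   // every increment or decrement of a count
static long allocations = 0;        // every object created on the heap

struct Counted { int refs = 0; };

class CountedPtr {
public:
    explicit CountedPtr(Counted* p = nullptr) : p_(p) { acquire(); }
    CountedPtr(const CountedPtr& o) : p_(o.p_) { acquire(); }
    CountedPtr& operator=(const CountedPtr& o) {
        // Take the new reference first so self-assignment stays safe.
        if (o.p_) { ++o.p_->refs; ++refcount_updates; }
        release();
        p_ = o.p_;
        return *this;
    }
    ~CountedPtr() { release(); }
private:
    void acquire() { if (p_) { ++p_->refs; ++refcount_updates; } }
    void release() {
        if (p_) {
            ++refcount_updates;
            assert(p_->refs > 0);
            if (--p_->refs == 0) delete p_;
        }
    }
    Counted* p_;
};

CountedPtr make_counted() {
    ++allocations;
    return CountedPtr(new Counted);
}
```

Copying one handle 1000 times, e.g. `for (int i = 0; i < 1000; ++i) b = a;`, leaves `allocations` at 1 while `refcount_updates` grows by roughly two per assignment.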
Garbage collection sounds good, but it is not really necessary in most cases.
That's what I was saying before I found out my project times decreased 50% with garbage collection.
Lastly, how is your GC better than the .NET GC?
I never said that it does.
|
|
|
|
|
axilmar wrote:
I found out my project times decreased 50% with garbage collection.
I find this very surprising. Is this your subjective feeling, or did you actually measure project times?
In my experience, GC (and the choice of programming language in general) has very little to do with productivity, except maybe with very small projects. It is mostly a matter of good or bad project management whether a project finishes on time.
|
|
|
|
|
Nope. Real measurement. The bigger the project is, the more time we spent dealing with memory errors. With garbage collection, all this time is spent productively.
I think the next C++ standard must support GC (optionally, through an attribute).
|
|
|
|
|
axilmar wrote:
Nope. Real measurement. The bigger the project is, the more time we spent dealing with memory errors. With garbage collection, all this time is spent productively.
That's interesting. I use smart pointers (usually not reference counted) and STL containers, so I hardly ever have to explicitly call delete. Do you normally use plain pointers when not using your GC to get this improvement?
Nathan Holt
|
|
|
|
|
|
Our applications have various types of object models with parent-child relations (and many other types of cyclic relations, such as circular lists); smart pointers can't be used there.
|
|
|
|
|
axilmar wrote:
Our applications have various types of object models with parent-child relations (and many other types of cyclic relations, such as circular lists); smart pointers can't be used there.
It's interesting that neither the STL nor Boost seems to have a circular list template. I've used std::list for that, with special functions to get the next and previous elements in a circular manner, but it seems that a container could be designed to make this easier.
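The "special functions" approach could look something like the sketch below. The helper names `circular_next` and `circular_prev` are made up here; they are not part of the STL or Boost, and this is only one plausible way to write them.

```cpp
#include <cassert>
#include <list>

// Hypothetical helpers: step through a std::list as if its last
// element were linked back to its first.
template <typename T>
typename std::list<T>::iterator circular_next(std::list<T>& l,
                                              typename std::list<T>::iterator it)
{
    assert(!l.empty() && it != l.end());
    ++it;
    return it == l.end() ? l.begin() : it;   // wrap past the last element
}

template <typename T>
typename std::list<T>::iterator circular_prev(std::list<T>& l,
                                              typename std::list<T>::iterator it)
{
    assert(!l.empty() && it != l.end());
    if (it == l.begin())
        it = l.end();                        // wrap before the first element
    return --it;
}
```

Starting from `l.begin()` on a three-element list, three calls to `circular_next` come back to the first element, and `circular_prev` from there lands on the last one.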
I notice you mentioned parent-child relationships. It seems to me that smart pointers and containers can easily handle such a relationship. If a parent object uses scoped pointers to point to its children, or keeps them in an STL container, the child objects will automatically be deleted when the parent is.
In general, I try to give child objects as little information as possible about their parents in order to make them more reusable.
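The parent-owns-its-children arrangement described above can be sketched as follows. `std::unique_ptr` is used here in the role of a scoped pointer, and `Parent`, `Child`, `add_child` and the `alive` counter are illustrative names rather than anything from the library under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Child counts live instances so the cleanup can be observed.
struct Child {
    static int alive;
    Child()  { ++alive; }
    ~Child() { assert(alive > 0); --alive; }
    Child(const Child&) = delete;
    Child& operator=(const Child&) = delete;
};
int Child::alive = 0;

class Parent {
public:
    Child& add_child() {
        children_.push_back(std::unique_ptr<Child>(new Child));
        return *children_.back();
    }
    std::size_t child_count() const { return children_.size(); }
private:
    // The parent is the sole owner; no reference counting is involved.
    std::vector<std::unique_ptr<Child>> children_;
};
```

When a `Parent` goes out of scope, the vector destroys every `Child` it holds, so `Child::alive` returns to its previous value without any explicit delete.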
Nathan Holt
|
|
|
|
|
It's interesting that neither the STL nor Boost seems to have a circular list template
I've made one myself.
It seems to me that smart pointers and containers can easily handle such a relationship
Nope, they can't. Suppose that you have:
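class Child;   // forward declaration, so Parent can name Child
class Parent {
public:
    SmartPtr<Child> child;
};
class Child {
public:
    SmartPtr<Parent> parent;
};
int main()
{
    SmartPtr<Parent> p = new Parent;
    p->child = new Child;
    p->child->parent = p;   // back-pointer: this closes the cycle
}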
In the above example, the parent and child allocated will never be freed (because the ref count of parent will never drop to 0).
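The same effect can be reproduced with standard smart pointers. In the sketch below `std::shared_ptr` stands in for the `SmartPtr` class of the post (an assumption, since `SmartPtr` itself is not shown), and the second pair of classes shows the usual reference-counting workaround of making the back-pointer a `std::weak_ptr`. All class and function names are illustrative.

```cpp
#include <cassert>
#include <memory>

static int live_objects = 0;   // constructed minus destroyed

struct LeakyChild;
struct LeakyParent {
    std::shared_ptr<LeakyChild> child;
    LeakyParent()  { ++live_objects; }
    ~LeakyParent() { --live_objects; }
};
struct LeakyChild {
    std::shared_ptr<LeakyParent> parent;   // strong back-pointer: forms a cycle
    LeakyChild()  { ++live_objects; }
    ~LeakyChild() { --live_objects; }
};

struct Child;
struct Parent {
    std::shared_ptr<Child> child;
    Parent()  { ++live_objects; }
    ~Parent() { --live_objects; }
};
struct Child {
    std::weak_ptr<Parent> parent;          // weak back-pointer: no cycle
    Child()  { ++live_objects; }
    ~Child() { --live_objects; }
};

// Returns how many objects are still alive after the last handle is gone.
int survivors_with_strong_back_pointer() {
    const int before = live_objects;
    LeakyParent* raw = nullptr;
    {
        auto p = std::make_shared<LeakyParent>();
        p->child = std::make_shared<LeakyChild>();
        p->child->parent = p;        // parent and child now keep each other alive
        raw = p.get();
    }                                // p is gone, yet no destructor has run
    const int survivors = live_objects - before;
    std::shared_ptr<LeakyChild> c = raw->child;
    c->parent.reset();               // break the cycle by hand so the demo cleans up
    return survivors;
}

int survivors_with_weak_back_pointer() {
    const int before = live_objects;
    {
        auto p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;        // a weak_ptr does not add to the count
    }
    return live_objects - before;
}
```

The first function reports two survivors, the second none. The weak back-pointer only helps when the design can say which direction owns which, which is exactly the point under dispute in this thread.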
In general, I try to give child objects as little information as possible about their parents in order to make them more reusable.
In general, me too. But in practice, children need to address the parent's interfaces, so it is not so easy.
There is also the other problem of double deletions. For example:
class Foo {
public:
    SmartPtr<Foo> foo;
};
int main()
{
    Foo foo1;
    foo1.foo = &foo1;
}
In the above code, when the foo1.foo smart pointer is destroyed, it will delete the object it belongs to, even though that object lives on the stack!
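With standard smart pointers, one known way to point a reference-counted pointer at an object it must not delete is a deleter that does nothing. The sketch below uses `std::shared_ptr` in place of `SmartPtr` (an assumption), and `non_owning` is a made-up helper name. It avoids the bad delete but does nothing to stop the pointer from dangling once the object is gone.

```cpp
#include <cassert>
#include <memory>

struct Foo {
    std::shared_ptr<Foo> foo;
    static int destroyed;
    ~Foo() { ++destroyed; }
};
int Foo::destroyed = 0;

// Wraps an object the caller owns (for example one on the stack) in a
// shared_ptr whose deleter does nothing, so the count reaching zero
// never triggers a delete.
template <typename T>
std::shared_ptr<T> non_owning(T& object) {
    return std::shared_ptr<T>(&object, [](T*) {});
}
```

With `Foo foo1; foo1.foo = non_owning(foo1);` the destructor of `foo1` runs exactly once, at the end of its scope.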
When I talk about GC, I never throw out other memory management solutions. The advantage of C++ is that you can use whichever method suits the problem to manage memory. It happens that GC is the easiest to use and most generic solution, though.
|
|
|
|
|
axilmar wrote:
Nope, they can't. Suppose that you have:
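class Child;   // forward declaration, so Parent can name Child
class Parent {
public:
    SmartPtr<Child> child;
};
class Child {
public:
    SmartPtr<Parent> parent;
};
int main()
{
    SmartPtr<Parent> p = new Parent;
    p->child = new Child;
    p->child->parent = p;   // back-pointer: this closes the cycle
}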
If the pointer to the parent is needed, why should it be a smart pointer? Also, unless the parent is sharing its child, its pointer would not be a reference counting pointer, but a simpler scoped pointer like boost::scoped_ptr<Child>.
In the rare cases in which a parent class does share its child classes with the possibility of outliving their parent, I have made them communicate with their parent with a signal/slot type system, which is designed to automatically disconnect when either object is destroyed.
axilmar wrote:
There is also the other problem of double deletions. For example:
class Foo {
public:
    SmartPtr<Foo> foo;
};
int main()
{
    Foo foo1;
    foo1.foo = &foo1;
}
In the above code, when the foo1.foo smart pointer is destroyed, it will delete the object it belongs to, even though that object lives on the stack!
I see what you meant by double deletions. I don't think I've ever created a reference counted object on the stack. For that matter, hardly any of my reference counted objects have contained reference counting pointers.
Nathan Holt
|
|
|
|
|
If the pointer to the parent is needed, why should it be a smart pointer?
Because something else might point to parent with a shared pointer.
I have made them communicate with their parent with a signal/slot type system
Signals and slots is about callbacks. You don't have compile-time access to the interfaces of either the caller or the callee.
I don't think I've ever created a reference counted object on the stack. For that matter, hardly any of my reference counted objects have contained reference counting pointers.
It all depends on what you are doing. What I showed you was just an example; an inexperienced programmer might declare a reference-counted object on the stack. In one of the apps for our customers, we had a large object model with many inter-relationships between classes, and smart pointers created many problems.
GC is not a panacea, but it is a good solution that is generic enough to:
1) not demand specific constructs
2) allow for retrofitting of previously written programs
It is a fire-and-forget solution with specific advantages and a few disadvantages.
|
|
|
|
|
axilmar wrote:
If the pointer to the parent is needed, why should it be a smart pointer?
Because something else might point to parent with a shared pointer.
I don't understand that. If the parent is tracked with shared pointers, it shouldn't make much difference, since when the last shared pointer is destroyed and the parent object is deleted, it should still delete the child object, along with the reference. If the child object is shared, the parent can still tell the child to remove its reference one way or another. There are a number of classes around to automate this.
axilmar wrote:
I have made them communicate with their parent with a signal/slot type system
Signals and slots is about callbacks. You don't have compile-time access to the interfaces of either the caller or the callee.
I am not sure what you mean by this. The system I rolled for myself was template based and strongly typed. I used it because it made it easy to design the child object without relying on details of the parent, thus making the child reusable.
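A template-based, strongly typed signal that disconnects automatically might be sketched like this. It is not the system described in the post, which is not shown; `Signal`, `connect`, `emit`, `slot_count` and the `Parent`/`Child` members are hypothetical names. The sketch is single-threaded and assumes slots do not connect or disconnect while a signal is being emitted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

template <typename Arg>
class Signal {
public:
    using Slot = std::function<void(Arg)>;

    // The returned handle keeps the slot connected; destroying it disconnects.
    std::shared_ptr<Slot> connect(Slot s) {
        auto handle = std::make_shared<Slot>(std::move(s));
        slots_.push_back(handle);
        return handle;
    }

    void emit(Arg value) {
        for (auto it = slots_.begin(); it != slots_.end();) {
            if (auto slot = it->lock()) { (*slot)(value); ++it; }
            else it = slots_.erase(it);   // receiver is gone: drop the entry
        }
    }

    std::size_t slot_count() const { return slots_.size(); }

private:
    std::vector<std::weak_ptr<Slot>> slots_;
};

struct Child {
    Signal<int> value_changed;            // the child knows nothing about Parent
    void set_value(int v) { value_changed.emit(v); }
};

struct Parent {
    int last_seen = 0;
    std::shared_ptr<Signal<int>::Slot> connection;
    void watch(Child& c) {
        connection = c.value_changed.connect([this](int v) { last_seen = v; });
    }
};
```

If the `Parent` is destroyed first, its handle goes with it and the next `emit` silently drops the dead entry; if the `Child` is destroyed first, the parent is left holding a handle that nothing will ever call.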
axilmar wrote:
It all depends on what you are doing. What I showed you was just an example; an inexperienced programmer might declare a reference-counted object on the stack. In one of the apps for our customers, we had a large object model with many inter-relationships between classes, and smart pointers created many problems.
That makes sense. I guess being the only programmer at my company saves me from that. Fortunately, I've been able to force the object models of my projects into a hierarchy that saves me from the worst of the inter-relationships. I don't think I've ever worked on a really large project.
axilmar wrote:
GC is not a panacea, but it is a good solution that is generic enough to:
1) not demand specific constructs
2) allow for retrofitting of previously written programs
It is a fire-and-forget solution with specific advantages and a few disadvantages.
I'm sure it's useful, but I don't think it's quite a fire-and-forget solution. Even when deleting objects in the order that they're allocated, I think it's possible to end up with a destructor trying to delete an object that's already deleted. For instance, std::list::splice can move elements from one list to another.
I'll admit that some of my suspicions come from issues with .net languages, in which there are issues with making objects GC safe. In particular, finalizers have to assume that all the pointers are invalid, because the GC may have already deleted them.
Nathan Holt
|
|
|
|
|
There are a number of classes around to automate this.
Sure there are, but in order to avoid the fuss, I prefer GC.
I am not sure what you mean by this.
The child needs to call methods of the parent.
I guess being the only programmer at my company saves me from that.
I understand your position. I was like that before our company started using metrics. Unfortunately, when your product measures 500,000 lines of code and above, it is very difficult for all programmers to use all the C++ tricks effectively.
Even when deleting objects in the order that they're allocated, I think it's possible to end up with a destructor trying to delete an object that's already deleted. For instance, std::list::splice can move elements from one list to another.
It's easy to fix that: at first all objects are finalized, and then the remaining objects are destroyed.
I'll admit that some of my suspicions come from issues with .net languages
There is no problem with garbage collection: it works. It may be that my implementation has a bug or two, but I will fix them. No piece of code is without bugs, at least in the first stages of its life.
Thanks for the tips.
|
|
|
|
|
axilmar wrote:
The child needs to call methods of the parent.
You still haven't explained why the child would need to call methods of the parent directly, instead of through a signal/slot system.
axilmar wrote:
I understand your position. I was like that before our company started using metrics. Unfortunately, when your product measures 500,000 lines of code and above, it is very difficult for all programmers to use all the C++ tricks effectively.
That makes sense. Thanks for the insight.
axilmar wrote:
It's easy to fix that: at first all objects are finalized, and then the remaining objects are destroyed.
That should solve the problem of multiple memory deletions, but I think objects could still be finalized twice, in that a list element could be finalized, and then when the list is finalized, it finalizes all its elements as it deletes them.
axilmar wrote:
Thanks for the tips.
You're welcome.
Nathan Holt
|
|
|
|
|
You still haven't explained why the child would need to call methods of the parent directly, instead of through a signal/slot system.
I did not need the functionality of signals and slots, I needed to call a method of the parent. Signals and slots is a callback mechanism.
but I think objects could still be finalized twice
The new version 2.5 avoids extra finalizations and deletions. Objects are deleted in an independent step; memory will not be freed twice. Furthermore, if an object is already finalized, the collector does not finalize it a second time.
For example, in a linked list, a node may be finalized first (i.e. its destructor is called), but it will not be freed. Then the list deletes the node, which calls its destructor again (the destructor is called twice, but this cannot be avoided in any C++ GC because there is no control over destructor calls), but the node is still not freed. So we don't have the double deletion problem.
After all finalizations complete, then memory is freed.
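The finalize-first, free-later order can be illustrated with a toy heap. This is only a sketch of the idea, not the library's implementation: `ToyHeap`, `Block`, `finalize`, `collect_all` and `Tracked` are invented names, every object is treated as garbage at collection time, and the case of user code calling a destructor a second time is left out.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <vector>

// Bookkeeping record for one managed block.
struct Block {
    void* memory;
    void (*destroy)(void*);   // runs the object's destructor
    bool finalized;
};

class ToyHeap {
public:
    ToyHeap() = default;
    ToyHeap(const ToyHeap&) = delete;
    ToyHeap& operator=(const ToyHeap&) = delete;
    ~ToyHeap() { collect_all(); }

    template <typename T>
    T* create() {
        void* mem = std::malloc(sizeof(T));
        if (!mem) throw std::bad_alloc();
        T* obj = new (mem) T();
        blocks_.push_back(Block{mem, [](void* p) { static_cast<T*>(p)->~T(); }, false});
        return obj;
    }

    // Runs the destructor now, at most once, but keeps the memory.
    void finalize(void* obj) {
        for (Block& b : blocks_)
            if (b.memory == obj && !b.finalized) { b.finalized = true; b.destroy(b.memory); }
    }

    // Phase 1: finalize whatever is not finalized yet.
    // Phase 2: only then release the memory of every block.
    void collect_all() {
        for (Block& b : blocks_)
            if (!b.finalized) { b.finalized = true; b.destroy(b.memory); }
        for (Block& b : blocks_)
            std::free(b.memory);
        blocks_.clear();
    }

private:
    std::vector<Block> blocks_;
};

// Test object that counts destructor runs.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;
```

Finalizing one of three `Tracked` objects early and then calling `collect_all()` runs three destructors in total, and no block is released until every destructor has finished.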
By the way, do you know of any way that I can tell if a pointer is wild under Unix? i.e. if I used the pointer, it would cause a SIGSEGV or SIGBUS; but I want to know that before I use it. On Windows I use structured exception handling and the functions IsBadReadPtr and IsBadWritePtr, but I can't find anything similar for Unix. If you know of something, I could port the collector to Unix.
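One trick sometimes used on Unix-like systems is to let the kernel touch the address instead of the program: a system call that is handed a bad buffer fails with `EFAULT` rather than raising a signal. A minimal sketch, assuming POSIX `pipe` and `write`; the helper name `is_readable_byte` is made up, it tests a single byte for readability only, and the answer can go stale at once if another thread unmaps the page. Behaviour should be checked on each target platform.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Asks the kernel to read one byte from `p` by writing it into a pipe.
// A bad address makes write() fail (typically with EFAULT) instead of
// delivering SIGSEGV or SIGBUS to the process.
bool is_readable_byte(const void* p) {
    int fds[2];
    if (pipe(fds) != 0)
        return false;                 // cannot tell; treat as unreadable
    ssize_t n;
    do {
        n = write(fds[1], p, 1);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);
    close(fds[1]);
    return n == 1;
}
```

A call such as `is_readable_byte(&some_local)` should return true, while `is_readable_byte(nullptr)` should return false. Each check costs several system calls, so it is far too slow to run on every pointer a collector scans.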
|
|
|
|
|
axilmar wrote:
I did not need the functionality of signals and slots, I needed to call a method of the parent. Signals and slots is a callback mechanism.
I agree that some classes have so little potential for reuse that there is no point in a signal/slot method. Is that what you were trying to say there?
axilmar wrote:
For example, in a linked list, a node may be finalized first (i.e. its destructor is called), but it will not be freed. Then the list deletes the node, which calls its destructor again (the destructor is called twice, but this cannot be avoided in any C++ GC because there is no control over destructor calls), but the node is still not freed. So we don't have the double deletion problem.
Calling destructors multiple times is exactly what I was referring to. In many classes, this can be done safely, but there are reasonable situations in which this could be disastrous. If a destructor is virtual, for instance, the compiler changes the pointer to the virtual member table as the more derived parts of the object are destroyed, and I would consider it quite reasonable for it to set the pointer to NULL once an object is destroyed. Then, the second time the destructor was called, code would attempt to find the address of the virtual destructor by using a NULL pointer.
axilmar wrote:
By the way, do you know of any way that I can tell if a pointer is wild under Unix? i.e. if I used the pointer, it would cause a SIGSEGV or SIGBUS; but I want to know that before I use it. On Windows I use structured exception handling and the functions IsBadReadPtr and IsBadWritePtr, but I can't find anything similar for Unix. If you know of something, I could port the collector to Unix.
I'm afraid it's been years since I programmed in UNIX. I would not be surprised if the functions needed were different for each form of UNIX.
Nathan Holt
|
|
|
|
|
|
Because the object might need reference counting, due to being managed from another object outside the cycle.
|
|
|
|
|
There was a bug in finalization during cleanup. I fixed it. Version 2.1 is available from my site.
|
|
|
|
|
Nice work, but unfortunately it does not work properly if you try to compile it with an MFC app in debug mode, due to the overloaded new operator which MFC uses to check for memory leaks.
|
|
|
|
|
The only solution is to hack MFC so that it does not use its own 'new' operator in debug mode. From what I know, 'operator new' is always part of the global namespace.
|
|
|
|
|
It looks good, but I have a few questions. I noticed that it is very similar to the Hans J. Boehm implementation, without the constructs for C heap allocation (malloc / free) and the generic prompts for garbage collection.
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
What are the advantages of your implementation over this one?
I'm a bit confused. Is it meant to be a subset or simplified interface?
"Good strategies are long on detail and short on vision." - Lou Gerstner
|
|
|
|
|
More of the author's comments about this library can be found at...
http://www.allegro.cc/
There's an update to the library presented here. It looks like version 2 is more of a drop-in replacement for "New/Delete" and I like it better. It shows that the author is starting to understand the design he borrowed. Now if I can only convince him to give them credit...
As a side note: be careful with pointer members in static destructors of objects in the global space; the wrong implementation of Delete can be called!
|
|
|
|
|
More of the author's comments about this library can be found at...
Yep, in my free time I program freeware games!
The version presented here and the updated version in the Allegro site is the same.
It shows that the author is starting to understand
I understood the concept (for all types of collection: mark-and-sweep, mark-and-copy, mark-and-compact) long before I made any attempt to write one of my own. What I couldn't find out was how to find the top of the stack and the global data space. That's why the first version of the library used special pointer classes (as do some other GC libs on this site). Now that I have found out how to do so, I have removed the need for special pointer classes.
the design he borrowed
Now if I can only convince him to give them credit...
My code has nothing borrowed from the Hans Boehm collector. It is brand new code. I tried to study the HB GC code, but it is quite unreadable to me, full of macros etc. (I don't blame them; it may not be possible to be cross-platform in any other way for such low-level stuff). Therefore I don't have to give any credit to those guys, since I implemented algorithms that have been in the public domain for a long time now.
the wrong implementation of Delete can be called!
Could you please elaborate?
Furthermore, if anyone knows how to a) suspend and resume all threads b) find the stack of a thread, I can make the library multithreaded!
|
|
|
|
|