Superior container classes: Smaller, faster, flexible, convenient

valdok

4.83/5 (20 votes)

Aug 18, 2007

CPOL

27 min read

86949

776

Alternative container classes implementation, a different approach.

Preface

This article assumes the reader is generally familiar with the subject of container classes. Experience with some common container class libraries (such as STL and MFC) is desirable, though not mandatory.

Introduction

To manipulate program data, you usually need an appropriate data structure. This can be an array, single/double direction linked list, tree, hash table, and etc. Depending on the requested functionality, we can choose one of those. I've been using STL for a long time. It supports linked lists (queue, list, stack), auto-growing arrays (vector), binary search tree (map), and a sophisticated linked list of arrays (dequeue). What it lacks is a map based on a hash table, but in most of the cases, binary trees can be used instead.

Once I had to write a device driver. And if you know, using libraries in drivers is a real pain in the neck. In kernel mode, you have a lot of limitations related to memory allocation, exception handling, paged code execution, and etc. Plus, some libraries need some global init/uninit, and they think they're clever enough to do it on-demand, whereas in some circumstances, drivers are not allowed to call some functions (related to IRQL). So, using STL (which is pretty bombastic) is kinda problematic there. Not to mention such a monster as MFC.

So, I decided to write my own container classes to manipulate my data structures. And, I wanted to achieve the following:

Performance. Make them fast and memory compact.
Make them generic.
Small machine code, avoid repetitions.
Convenient usage.

First, I took the code from STL and MFC container classes, and made some modifications to make it appropriable for my case. But then, at some point, I realized that those classes could be greatly improved. I'm not talking about the algorithm behind the binary tree or the linked list, but about the idea of how to design and use those classes. By using another design, I've improved the performance greatly, made the code smaller, and even more generic. And, I like my container classes so much that I don't use neither STL nor MFC containers in any of my applications since then.

How template classes are usually designed

Let's take a look at the list container class of STL. This is how it is declared:

template <
   class Type, 
   class Allocator=allocator<type> 
>
class list
{
// ...

};

In order to instantiate a linked list, you must specify the type you want to store in it, and optionally the allocation mechanism.

Now, how does this list actually store data? As we all know, in order to store a data element in a linked list, you have to construct another element, called Node. This node must contain the element we want to store, and pointers to prev/next nodes. So, the list in fact does not store elements you want to have in it, it instead stores elements of its own type, which in turn have elements of our type as members. If you look at the code of the STL list class, you'll find the following declaration:

struct _Node {
    _Nodeptr _Next, _Prev;
    _Ty _Value;
    };
// ...

_Nodeptr _Head;
size_type _Size;

Surprisingly, there's no _Tail member. That's because in this implementation, _Head is not really a pointer to the first inserted node. Instead, _Head is a pointer to a fictive node which is allocated at construction time. Its _Next and _Prev members are in fact the head and the tail of the linked list, and its _Value member is ignored.

Next, let's talk about iterators. It can be said that an iterator is an identifier of an element in the container class. (MFC container classes use a term POSITION). That is, when you insert an element to a list, you get in return its iterator. By using this iterator, you can get the value of the element you've inserted into the list. You can also erase this element from the list using its iterator. In our case, we can say the _NodePtr is an iterator. In other words, if you have _NodePtr, you can get the value of the element, and you can erase or reposition it in the list. In conclusion, we can say the following:

Container classes usually declare their own type which wraps the requested type. For every element, they allocate their companion structure, and store the requested element in it.
Iterator is an identifier of the element in the container. It is usually a pointer to the element's companion structure.

Is this design effective?

As we know, when you insert an element, the container allocates its own structure which contains the member of your type, and assigns this member the element you provide. That is, it does not store your element, instead it copies its value into another element.

Sometimes, you need to store simple types in the container, such as int. But usually, you have to store more complex elements. For example, you may want to store strings, or some objects that include strings, buffers, and pointers to other objects. In such a case, assignment is not a trivial operation. If your object contains strings, dynamic buffers, and etc., it can't be "just copied". You have to provide an assignment operator for such objects.

The meaning of assignment is to bring the object to the state which is identical by some definition to the state of another object. This usually means duplicating strings, buffers, and etc. Unfortunately, this approach is widely used. I've seen dozens of code like this:

struct UserInfo {
    // ...

    PCSTR m_szName;

    // ...

    // To make it container-compliant

    UserInfo& operator = (const UserInfo& other)
    {
        free(m_szFirstName);
        m_szFirstName = strdup(other.m_szFirstName);
    }
};

In my opinion, this is ridiculous. Duplicate the object, including all its members (which may be complex objects in turn), just to insert it into the list!! It can be justified only if this is what you really need: preserve the state of your object in the linked list, and continue to modify your object. That is, what you need is really duplication of the object's state.

But sometimes, the situation is even more cruel. Suppose your object has a pointer/reference to another object. What should you do in the assignment then? Should you just assign the pointer, or duplicate the pointed object? And, if that object contains pointers to other objects too? Sometimes, the assignment is not only performance-killing, it's simply inappropriate. There's no logical context for the assignment.

A more correct way to handle complex objects is to make the container class contain pointers (or smart pointers) to objects instead of objects themselves. This way, you don't need to "invent" assignment for your objects. You can also insert the same object into different containers so that you can fetch it from one container and modify its members, then eventually take it from another container, and the changes are visible. Unfortunately, because this requires a bit more complex syntax to work with, many people neglect this rule.

Let's now look at the following example (taken from real life).

Suppose you have an object that exists (via pointers, not duplicated) in multiple containers. For example, you have a server. It contains a linked list of all clients connected to it. The list is sorted by the last activity time. (Clients that have sent something recently are at the head of the list.) To maintain the list in this order whenever we receive something from the client, we move it to the head of the list. Then, once in a while, we can walk through the list from tail to head and drop those clients that didn't send us anything for a long time, up until we encounter the first "valid" client.

Apart from this list, let's assume that every client has an ID, and we want a fast search by this ID. We need a sort of a map: either hash table or tree.

Suppose also that our server should immediately drop any client if it receives an I/O error for it. We have the following problematic scenarios:

Suppose we walk through our last activity list and drop clients. Dropping means that we must remove this client from all the containers and delete its structure itself. We have the list iterator of this client (since we walk through this list right now), hence we can remove it from the list immediately. But in order to remove it from the map, we have to find it in the map first. Well, this is not a big problem since a map provides fast search, but still it's a bit slower than it could be.
We want to send something to a client with a specific ID. We locate it through the map, and attempt to send to it. But then, eventually, we receive an I/O error. By our policy, we should drop it. There's no problem to remove it from the map; we've just found it and have its iterator, but there is a problem now to remove it from the list: we now have to walk through the entire list and search for this client.
At some point, we have a pointer to the client structure, and eventually, we need its ID. But, it is not a member of the client structure. It is a member of the map's companion structure (goes as search key there), which we don't have.

How can these disadvantages be solved?

Answer: Add to your structure iterators that identify it in all containers it belongs to.

In our case, we have to add two members to our client structure: list iterator and map iterator. At construction time, we insert our structure to both containers, receive in return iterators, and store those iterators as members. Then, whenever we have a pointer to this structure, we can quickly perform any list/map operation on it.

This situation is a bit weird however: you have an object that has an iterator as a member. The iterator itself is a kind of a pointer to another structure that has the pointer to our object. So, we have a cross-linked data structure.

Redesign 1

When we have two (or more) cross-linked structures, we can merge them into one struct, can't we? Suppose we have a client structure ClientInfo. Let's merge some members into it:

struct ClientInfo {
    // some common members.
    // ...
    // Activity linked list manipulation:

    ClientInfo* m_pActivityNext;
    ClientInfo* m_pActivityPrev;

    // ID lookup manipulation, binary tree engine:

    ULONG m_LookupID;           // serves as key
    ClientInfo* m_pLookupTop;   // Tree parent
    ClientInfo* m_pLookupLeft;  // Tree left child
    ClientInfo* m_pLookupRight; // Tree right child
    int m_nLookupBalance; // Needed to auto-ballance the binary tree

    // (AVL method).

};
struct ServerGlobalState {
    // ...
    // This is how our "containers" now look:
    // Activity "list"

    ClientInfo* m_pActivityFirst;
    ClientInfo* m_pActivityLast;

    // Lookup "tree"
    ClientInfo* m_pLookupRoot;

};

We got rid of the list node and the tree structures. Instead, we've put their members into our struct. What exactly did we achieve by this? Actually, several things, but the most important point is that we don't need iterators any more. The object is an iterator to itself for all the data structures it belongs to. Once you've found an object in one data structure, you've found it in all the data structures. Let's compare this approach to the standard containers:

Pros:

Better memory utilization. We allocate a single structure for everything. Since memory for multiple objects with unpredictable lifetimes is usually allocated on the heap, every such allocation has significant memory overhead.
Better performance. Heap allocation is a time-consuming operation; plus, in multithreaded applications, heap access requires synchronization.
The object is an iterator. Once you have a pointer to an object, you can manipulate it in all the data structures it belongs to.

Cons:

Not generic. What we've made can hardly be called a general-purpose container class. To arrange another structure in some container class, we'll have to add all those members to it, and write all the functions for it, which is not what we want. Well, handling linked list operations is relatively simple, but not writing tree management for every class.

Redesign 2

struct LinkList {

    // node declaration

    struct Node {
        Node* m_pNext;
        Node* m_pPrev;
    };

    // c'tor

    LinkList()
        :m_pHead(NULL)
        ,m_pTail(NULL)
    {
    }

    // operations

    void InsertHead(Node&);
    void InsertTail(Node&);
    void InsertAfter(Node& node, Node* pAfter);
    // ... other variants of insertion

    // ...

    void Remove(Node&);

    Node* GetHead() { return m_pHead; }
    Node* GetTail() { return m_pHead; }

private:
    Node* m_pHead;
    Node* m_pTail;
};
template <class KEY, class ARG_KEY>
struct AvlTree {

    // node declaration

    struct Node {
        KEY m_Key;
        Node* m_pT;
        Node* m_pL;
        Node* m_pR;
        int m_nBallance;
    };

    // c'tor

    AvlTree ()
        :m_pRoot(NULL)
    {
    }
    
    // operations

    void Insert(Node&);
    void Remove(Node&);

    Node* Find(ARG_KEY);

private:
    Node* m_pRoot;

};
typedef AvlTree<ULONG,ULONG> AvlTreeULong;

struct ClientInfo
    :public LinkList::Node
    ,public AvlTreeULong::Node
{
    // some common members.

    // ...

};

In this approach, we now have dedicated container classes. We've managed to move the container functionality out of our data structure. Instead, our data structure gets all the needed members by inheriting the appropriate container's Node. Our container classes manage some objects, without actually knowing what those objects are! That is, we can declare another structure that inherits LinkList::Node, and use exactly the same LinkList class to handle those objects too. Moreover, we can mix objects of different kinds into one linked list if we want.

Note that STL and MFC container classes require you to provide the type you're going to store as a template argument. And, if you instantiate two linked lists of different data structures, you'll get (behind the scenes) two different container classes with different machine code. In our approach, however, we have a single machine code that works for all the types.

Our AvlTree class does have template arguments, but it's only related to the lookup key (obviously, for different key types, tree code should be different). But, for different data structures that use the same key type, we instantiate the same AvlTree class. In opposite, the MFC's hash table takes KEY, ARG_KEY, VALUE, and ARG_VALUE parameters, the STL map class takes Key and Type.

Another great advantage of our container classes, in my opinion, is that they don't allocate/free memory. No memory management at all! They don't create/destroy structures, they just manage their connections to each other.

Unfortunately, we have a little drawback. Our container classes manage objects that inherit their Node, without knowing what those objects are. As a result, when you call some function that is supposed to return you a contained object (such as a Tree's Find() function), you get in return a pointer to the Node, not to the actual object type. Then you should cast the returned pointer to the actual type. That is:

AvlTreeULong map;
LinkList lst;
// ...
    
ClientInfo* pClient;
pClient = map.Find(117); // Error.

// Find() returns you a AvlTreeULong::Node*, not a ClientInfo*
pClient = (ClientInfo*) map.Find(117); // Correct
pClient = lst.GetHead(); // Error.

// lst.GetHead() returns you a LinkList::Node*, not a ClientInfo*
pClient = (ClientInfo*) lst.GetHead(); // Correct

ClientInfo* pActivityNext;
pActivityNext = pClient->m_pNext; // Error.

// m_pNext points of LinkList::Node, not to a ClientInfo
pActivityNext = (ClientInfo*) pClient->m_pNext; // Correct

Another relative disadvantage of our container classes is also related to the fact that they don't know what they actually manage. In STL and MFC container classes, we can leave some objects, and in the destructor, they are cleaned up. Our containers lack this functionality, because in order to destroy an object, we have to know what it is. We must clean our containers manually. Conclusion:

Pros:

All we've mentioned earlier.
Absolutely generic.
No memory management. Also means no out-of-memory situation handling in-class. This is really a gift for driver writers.
Same code for different data types. Smaller code.

Cons:

More ugly syntax to work. Requires type castings.
Container class can't auto create/delete objects. Means no automatic cleanup is possible.

Redesign 3

template <class CastNode>
struct LinkListEx
    :public LinkList
{
    // Insertion/Remove don't require extra casting.

    // But retreive methods do.

    CastNode* GetHead() { return (CastNode*) LinkList::GetHead(); }
    CastNode* GetTail() { return (CastNode*) LinkList::GetTail(); }

    static CastNode* GetNext(CastNode& node)
        { return (CastNode*) node.m_pNext; }
    static CastNode* GetPrev(CastNode& node)
        { return (CastNode*) node.m_pPrev; }
};

template <class CastNode>
struct LinkListDyn
    :public LinkListEx<CastNode>
{
    // c'tor

    ~LinkListEx()
    {
        Clear(); // Auto-cleanup

    }

    // We can now add some functions that allocate new elements.

    CastNode* CreateHead()
    {
        CastNode* pVal = new CastNode;
        InsertHead(*pVal);
        return pVal;
    }
    // ...


    // Functions that destroy elemetns.

    void Delete(CastNode* pNode)
    {
        ASSERT(pNode); // NULL is illegal for this function.

        Remove(*pNode);
        delete pNode;
    }

    void Clear()
    {
        for (CastNode* pNode; pNode = GetHead(); )
            Delete(pNode);
    }

};

We've described two extensions for our LinkList class. The first extension just provides a cast to the actual type. This allows to use the normal convenient syntax (without those infinite castings). Although this is a template class that may be instantiated for different data types, it does not produce extra code. It just automatically does what otherwise you'd have to do. The engine is still the LinkList class, which is single.

The second extension goes farther. It now handles objects creation/destruction. Hence, we can now have auto-cleanup, integrated CreateAndInsert and RemoveAndDelete functions, and etc.

Worth to add that this extension is sometimes convenient and handy, but not always appropriate. For example, in our case, where ClientInfo belongs to both LinkList and AvlTreeULong, we can't just let our container classes automatically destroy our objects. First of all, the same object belongs to them both, and that means that only one of them should delete it. Also, in our case, when we delete an object, it must be removed from both containers. Hence, our destruction of objects is non-trivial. Let's rather clean them manually.

Of course, similar extensions can be made for other container types too. Conclusion:

Pros:

All we've mentioned earlier.
Good syntax due to the actual type template argument, still small code.
Can easily be extended to Create/Delete/Cleanup. Still flexible: you can remove an object from the container without destroying it.

No cons?

Let's list all our advantages once more:

Object is iterator. Once you have an object, you can manipulate it in all the data structures it belongs to.
Better memory utilization, less overhead/fragmentation.
Better performance (less heap operations).
Smaller code.
No internal memory management. Means - more generic, can be used in special conditions (such as drivers).
Can be extended to Create/Delete/Cleanup, yet flexible.
Good syntax.

In real life, there're always cons. Not always significant, but should be mentioned.

In our approach, we integrate the container Node's members into our structure. This means that, in fact, we have direct access to them. Hence, they can be corrupted by improper use. Well, this can be partially fixed. All you have to do is declare those members private in the container's Node, and optionally provide read-only public functions to them. But anyway, improper memory access can corrupt them. More likely than in the STL/MFC container.

More complex situations: suppose you've inserted an element to the LinkList. What if at some point you "forget" that this element is already in the list, and insert it again to the same list? Or delete it without removing it from the LinkList? The list will become corrupted most likely (skipped elements, closed circles, broken pointers).

Our approach puts more responsibility on the programmer. You have to know exactly where your objects belong to. By the way, the same applies to standard container classes too (and to programming in general, and not just here), you must know what you're doing. But if you mistakes, the consequences in our case may be more destructive.

Anyway, I personally prefer to use those container classes. I've been using them for a long time, and up until now no regrets.

Complex scenarios

Suppose you have a structure that you want to put in two containers. As we've seen with our server example, your structure should inherit both containers' nodes. But, what if you want to put it in two containers of the same type? For instance, our server could also maintain its client in one more linked list in which they're sorted by another criteria, say, by the last time the server sent something to them. So, once in a while, we can walk through this list (from tail to head) and send some keep-alive message to those clients. You can not write something like this:

// Error. You can't inherit the same class twice.

struct ClientInfo
    :public LinkList::Node    // sorted by client activity

    ,public LinkList::Node    // sorted by server activity

{
};

I've found two ways to solve this situation:

struct ClientLstActivityClient
    :public LinkList::Node {
};

struct ClientLstActivityServer
    :public LinkList::Node {
};

struct ClientInfo
    :ClientLstActivityClient
    ,ClientLstActivityServer
{
};

This way, we can convince the compiler to inherit the same struct twice. But now, you can't do this:

LinkListEx<ClientInfo> lstActivityClient;

// Error. Cast from LinkList::Node to ClientInfo is ambiguous.

ClientInfo* pVal = lstActivityClient.GetHead();

// Instead you'll have to write:

LinkListEx<ClientLstActivityClient> lstActivityClient;
ClientInfo* pVal = (ClientInfo*) lstActivityClient.GetHead(); // Correct

Again, you'll have to mess with castings. Fortunately, there's another method:

// This macro is handy when you have an object of some class
// as a member of another class. Knowing a pointer to the inner object
// and its offset within the outer one we can calculate the pointer to the outer.

#define GET_OUTER_OBJ(outer_class, this_var) \
    outer_class& GetOuter() { return * (outer_class*) \
        (((PBYTE) this) - (PBYTE) (&((outer_class*) NULL)->this_var)); } \
    outer_class* GetOuterSafe() { return this ? &GetOuter() : NULL; }

struct ClientInfo
{
    struct ActivityClient :public LinkList::Node {
        GET_OUTER_OBJ(ClientInfo, m_ActClient)
    } m_ActClient;

    struct ActivityServer :public LinkList::Node {
        GET_OUTER_OBJ(ClientInfo, m_ActServer)
    } m_ActServer;

    struct LookupID :public AvlTreeULong::Node {
        GET_OUTER_OBJ(ClientInfo, m_LookupID)
    } m_LookupID;

    // more members

    // ...

};

LinkListEx<ClientInfo::ActivityClient> lstActivityClient;
LinkListEx<ClientInfo::ActivityServer> lstActivityServer;
AvlTreeULong mapByID;

ClientInfo* pVal = lstActivityClient.GetHead()->GetOuterSafe(); // Correct

if (pVal)
{
    ClientInfo* pNextByServerActivity =
        lstActivityServer.GetNext(pVal->m_ActServer)->GetOuterSafe();
    // ...


    // At any time we see the client ID:

    ULONG nClientID = pVal->m_LookupID.m_Key;
}

// ...


pVal = mapByID.Find(445)->GetOuterSafe();
if (pval)
{
    // move it to the head in the client activity list

    lstActivityClient.Remove(pval->m_ActClient);
    lstActivityClient.InsertHead(pval->m_ActClient);
}

// ...

// Drop the client

mapByID.Remove(pVal->m_LookupID);
lstActivityClient.Remove(pVal->m_ActClient);
lstActivityServer.Remove(pVal->m_ActServer);
delete pVal;

// ...

// Create another client, insert it to all containers.

pVal = new ClientInfo;
lstActivityClient.InsertHead(pVal->m_ActClient);
lstActivityServer.InsertHead(pVal->m_ActServer);

pVal->m_LookpuID.m_Key = 535;
mapByID.Insert(pVal->m_LookupID);

Not so simple syntax. But our functionality is pretty sophisticated. I want to see you implement this with better syntax. After you get used to it, you just enjoy. Everything's transparent, you see exactly what you do.

Implementation details

If you are still reading this article, it means you've got interested about this approach, and maybe even thinking about using it. So, this section is for you.

I've implemented the following container classes:

Single/double direction linked list, with either or both ends.
Hash table map.
AVL Binary tree.

Every container class can optionally be equipped with a counter, which always equals to the count of objects it contains.

In order to implement this without duplicating the container code for a minor thing (like the counter), we have to follow this pattern:

First write the base container engine. Take only mandatory template arguments that define the basic functionality. For trees and hash tables, such a parameter is the lookup key type, and for the linked list, such parameters are single/double direction-ness, which ends to have.
Optionally inherit the base engine and add a counter.
Inherit and add type casts. This layer doesn't produce extra code.
Optionally inherit and add Create/Delete/Cleanup.

Linked lists

In order to use a linked list, you have to decide which variant of it you need. Once you've decided, you need to instantiate its engine. This is how it is declared:

template <bool Prev, bool Next, bool Head, bool Tail, class T_COUNT>
class ListEng;

Prev - specifies if you want a backward link direction.
Next - specifies if you want a forward link direction.
Head - specifies if you want a head pointer.
Tail - specifies if you want a tail pointer.
T_COUNT - the type you want the counter to be; usually, size_t or void (for no counter).

Theoretically, we have 2^5 variants, which is 32. But not all of them are valid. For example, there's no use for a list with neither Prev nor Next. Same for neither Head nor Tail. Also, you can't have a forward-only list without the head pointer. So, generally speaking, we have the following variants:

Links	Ends	Synonym	Container size	With counter	Node size
Forward	Head	Stack	4	8	4
Forward	Both	Queue	8	12	4
Both	Head	List/Stack	4	8	8
Both	Both	List	8	12	8

* All sizes are given for 32bit architecture.

Example: Let's say we want a single-direction list with one end, something like a stack, without the counter. Then, let's instantiate it:

typedef ListEng<false, true, true, false, void> MyStack;

Now, declare the structure you want to be contained in such a stack:

struct MyObj :public NodeEx<MyStack, MyObj> {
    // structure members.

};

This declaration is a bit different from what we saw earlier. Instead of inheriting MyStack::Node, we now inherit NodeEx<MyStack, MyObj>. Actually, it has the same effect, it just makes the usage a bit more convenient. It prevents direct access to the m_pNext/m_pPrev members. Instead, it gives you read/only GetNext()/GetPrev(). It also provides you with the cast.

Now, let's instantiate the final container for our structure. Let's say we want manual create/destroy.

typedef ListEx<MyStack, MyObj> MyList;

Now, use it. Note that our list has limited functionality, because it is single-direction and has only one ending.

// ...

MyList lst;

MyObj* pObj = new MyObj;
lst.InsertTail(*pObj);    // Error. We have no tail pointer.

lst.InsertHead(*pObj);    // Correct

pObj = GetTail();    // Error. We have no tail pointer.

pObj = GetHead();    // Correct

pObj = pObj->_Next;    // Correct

pObj = pObj->_Prev;    // Error. We have no backward link.

size_t nCount = lst._Count;    // Error. We have no counter.

bool bEmpty = lst._Empty;    // Correct. Can be used even without counter.

lst.Remove(*pObj);    // Error. This function requires bi-directional link.

lst.RemoveHead();    // Correct

lst.RemoveNext(*pObj);    // Correct.

// Even though the list is single-directional - you can remove any
// element if you have its prev.

Hash tables

Hash table is a data structure that allows fast lookup using large memory. At construction time, you need to specify the count of hash table entries. The size of the container structure is the number of hash table entries multiplied by the size of a pointer (4 bytes for 32bit architecture). The average lookup time is proportional to N (number of elements so far) divided by the count of hash table entries. Hence, by using large memory, we can reduce the lookup time to any level.

The lookup key doesn't have to be of ordinal type. It can't be any type, you just need to specify an effective hash function for it and the comparison operator. Say, if your key is of type MyLookupType, and ARG_KEY is const MyLookupType&, you must provide the following function:

template <>
size_t MapInlHashKey(const MyLookupType& key);

In fact, hash functions on real-life input never produce equally-distributed values, usually there's some pattern. This, in turn, affects the effectiveness of the search. Because of this, it is good to pick the size of the hash table to be a prime number. It will diffuse the coverage of the hash table.

The size of the hash table node is 4 bytes plus the size of the lookup key.

Note: One major difference of this hash table compared to MFC's is that it permits duplicated keys. You can insert several elements with the same key. Find() will return you the first of them. If your intent is to avoid duplication, before every insert, you should attempt to find an element with the same key. If it already exists, either don't insert the new element, or drop the old, depending on the logic of your program.

Similarly to lists, you have to instantiate the hash table engine first.

template<class KEY, class ARG_KEY, class T_COUNT>
class HashEng;

Let's say our key is an int, and we want the counter. Then we have:

typedef HashEng<int, int size_t> MyHashEng;

Now, declare our data structure:

struct MyObj :public MyHashEng::Node {
    // structure members.

};

Declare the final container: a hash table with 337 entries.

typedef HashEx<MyHashEng, 337, MyObj> MyHash;

Use it.

// ...

MyHash hash;

MyObj* pObj = new MyObj;
pObj->m_Key = 5499; // You must assign the key prior to insert.

hash.Inser(*pObj);

bool bEmpty = hash._Empty; // Error.

// This is not implemented for hash tables. Youse _Count instead.
size_t nCount = hash._Count;    // Correct

pObj = hash.Find(5499);

hash.Remove(*pObj);

delete pObj;

If you want to iterate through all the elements, it is very ineffective for hash tables. The larger the size of the hash table, the more ineffective the walkthrough is. That is, you shoot your leg.

That's also the reason there's no variant of the hash table with auto-cleanup at destructor; for cleanup, you must find all your members.

If you need such a functionality, you better arrange your elements in a linked list as well.

AVL trees

AVL tree is an auto-balanced binary search tree. Similar to the hash table, it can be used for fast search by some key, but it works totally differently. There're many differences between their properties. Let's list the main differences:

Trees don't pre-allocate extra memory. The size of an empty tree object is only 4 bytes (without counter).
A tree node, however, is significantly larger than a hash table node. Its size is 16 bytes (4 ordinal variables) plus the size of the key, whereas the hash table node size is only 4 bytes plus the size of the key. So we have a 12 bytes difference.
The search time in a tree grows logarithmically with N.
Trees offer effective walk through of all the elements.
Tree elements can be effectively iterated in the sorted order. You can quickly find the minimum/maximum; for every element, you can effectively get its nearest (in terms of key sort) neighbor.

So, trees have many advantages compared to hash tables. In most cases, it's better to use trees. Hash tables are only appropriate in the following scenarios:

You can predict approximately the count of elements it'll contain.
You can afford pre-allocating large memory to speed-up the search.
The search speed is so much critical that you can't tolerate Log(N).
Although you can afford pre-allocating extra memory for the hash table, you can't tolerate the extra 12 bytes per element the tree demands (unlikely).

Otherwise, it's better to use a tree.

One of the greatest advantages of trees is that you don't need to predict the count of elements for the tree, it'll always work in logarithmic time. Which is fast. For a thousand elements, you'll have to perform about 10 iterations, for million - 20, and for billion - 30.

One of the concerns about a hash table is that prediction of the elements count is very important. If you overestimate it, you'll grab bombastic memory without too much return. If you underestimate (by many times), the search may become very slow, much slower than what a tree offers.

It is possible to modify the hash table and make it adaptive. That is, whenever you detect that the size of the hash table is not adequate for the current elements count, silently rebuild it using another hash table size. This way, you'll get good overall performance without the need to predict the count of elements. The only drawback here is that upon silent rebuild, you'll get a performance spike.

Similar to hash tables, our trees permit duplicated keys (unlike STL trees). You can insert several elements with the same key. Find() will return you any of them.

First, you instantiate the tree engine, which is declared this way:

template <class KEY, class ARG_KEY, class T_COUNT>
class TreeEng;

Say, your key is an int (we'll discuss complex key types later), and you want the counter:

typedef TreeEng<int, int size_t> MyTreeEng;

Now, declare your structure:

struct MyObj :public MyTreeEng::Node {
    // structure members.

};

The final container. Let's say we want the Create/Delete/Auto-Cleanup feature.

typedef TreeDyn<MyTreeEng, MyObj> MyTree;

Use it now.

// ...

MyTree tree;

MyObj* pObj = new MyObj;
pObj->m_Key = 5499; // Set the key prior to insert

tree.Insert(*pObj);

// Since our tree can create elements - we can also write the following:
pObj = tree.Create(5423);

bool bEmpty = tree._Empty;

size_t nCount = hash._Count;

pObj = tree.Find(5499);
hash.Delete(pObj); // equivalent to Remove-And-Destroy.

// Let's demonstrate the extra tree functionality that hash tables don't have:
pObj = tree.FindMin(); // Returns the pObj whose m_Key is minimal.

pObj = tree.FindMax(); // Returns the pObj whose m_Key is maximal.

pObj = tree.FindBigger(7634); // smallest, still bigger than 7634

pObj = tree.FindExactSmaller(12); // biggest, still not exceeding bigger 12

pObj = tree.FindNext(*pObj); // closest, greater or equal to your key.

Let's discuss complex keys.

For any two keys, the tree must know if they're equal, and if they not - which is greater. For ordinal types, such a comparison exists. For complex types, however, you must provide it. You can do it in several ways:

// Define comparison operators:

bool operator == (const MyKey& k1, const MyKey& k2);
bool operator < (const MyKey& k1, const MyKey& k2);
bool operator > (const MyKey& k1, const MyKey& k2);

This method however is ineffective: Sometimes the tree will ask if two keys are equal. If not, it will ask if the first is greater than the second. This way, it'll need to compare those keys twice. For trivial comparisons, there's no problem, but if the comparison is complex and time-consuming, there should be a better way. Eventually, I redesigned the tree to call the following function:

int Compare(const MyKey& k1, const MyKey& k2;
// Return 0 if they're equal, -1 if k1 < k2, 1 if k1 > k2. (Like strcmp).

This way, you need to call this function only once for two keys in order to decide what to do with them. But eventually, I realized there exists an even better way.

Imagine your key is a string. You have a lot of elements, and all the strings are very long. Suppose you're requested now to find an element. Then, you start from the root element, compare its key with the given one; depending on the comparison result, you move to either the left or right child, and so on.

The catch here is that as you approach your element, the keys you compare become more and more closer to your key. After a couple of iterations, the first letter of the keys we pass will stop changing and stay equal to your key's. Eventually, the second letter will, and so on. As you advance the search, you get some prefix that will remain equal to your key's prefix. And, this means that in order to compare two keys, you only need to compare their suffixes. Not the whole strings.

That's why I've redesigned the key comparison so that instead of comparing keys via a function, we'll declare the comparator object. This object will be instantiated each time we begin ordered keys comparison. And this object will be able to use the comparison history and optimize future comparisons.

For simple types, we don't use the history.

template <class ARG_KEY>
struct TreeInlComparator {
    ARG_KEY m_Key;
    TreeInlComparator(ARG_KEY key)
        :m_Key(key)
    {
    }

    int Compare(ARG_KEY key)
    {
        return (key < m_Key) ? (-1) : (key > m_Key) ? 1 : 0;
    }
};

For strings, however, we make a more sophisticated comparator:

template <>
struct TreeInlComparator<const WString&> {

    const WCHAR* m_szTxt;
    size_t m_nLMatch;
    size_t m_nRMatch;

    TreeInlComparator_String(const WString& str)
        :m_szTxt(str)
        ,m_nLMatch(0)
        ,m_nRMatch(0)
    {
    }
    int Compare(const WString& str)
    {
        const WCHAR* szVal = str;
        for (size_t nPos = min(m_nLMatch, m_nRMatch); ; nPos++)
        {
            T chThis = m_szTxt[nPos];
            T chArg = szVal[nPos];
            if (chArg < chThis)
            {
                m_nLMatch = nPos;
                return -1;
            }
            if (chArg > chThis)
            {
                m_nRMatch = nPos;
                return 1;
            }
            if (!chThis)
                return 0;
        }
    }
};

Conclusion

Well, conclusion... I really like those container classes, I can hardly imagine programming with "standard" ones.

Some believe that the performance advantages are minor and can't be seen in real life. I deeply disagree with that. There is a huge difference between allocating one struct with all the needed members and allocating several small structs. For many applications, heap allocations in some critical code blocks are the performance bottleneck.

Consider even the example with our server: move the client in the activity list to the head. This is achieved by removing it from the list and re-inserting at the head. Our implementation does this without touching the heap at all. And, STL/MFC would delete their internal node at Remove, and allocate it at Insert.

I remember once I needed a map with some key, but without value. I mean, I needed only to know if that key existed, without storing any value for it. MFC can't offer you that. You just put some fictive type as VALUE and ARG_VALUE. STL does have such an option (set class), at the expense of the extra code generated.

And, our implementation easily handles this: you just override MyHashEng::Node, without adding any new members to it. You can even not override MyHashEng::Node at all, use it as your contained type.

What I like about linked lists is that they include only what you want them to have. Nothing unnecessary. For instance, inside the hash table implementations, I treat the hash table itself not as an array of pointers, but as an array of single-direction linked lists with one ending. It has the size of only 4 bytes, just like a regular pointer (because it just contains a single pointer).

One more advantage is that when you allocate a single data struct instead of several ones, it's easier to handle out-of-memory conditions. The structure is either allocated or not. And if it is, it can be inserted into all the containers, hence your data structures are always consistent. If you work with STL/MFC containers, it is possible that you allocate your structure, add it to one of the containers, but then eventually you'll get out-of-memory on an attempt to insert it into the second container. If you use a sort of a uniform error handling, you have a problem: at which stage did you get the error condition? Did you change your data structures? Are they consistent now? You have to invent some dirty method to know what you already did to your containers.

And, of course, we got rid of the iterators. No more iterators. The object is an iterator; once you have it, you can do whatever you want to it.

Syntax. Well, not quite simple, I admit. But do you think STL syntax is simpler? I doubt so. Our syntax becomes complicated when we do complicated things. For simple cases, and even not-so-simple ones, we have good syntax, not worse than what other containers have.

As I've already said, our containers put a bit more responsibility on us. You must know precisely what you're doing. Is this a disadvantage? Some believe so... Well, if you don't want the "responsibility", write programs in higher-level languages.

I'll appreciate comments. Both positive and negative. If you have some ideas to share, you're welcome.

Updates

Since this article was posted, I've updated the following:

Made the article fit the normal browser width (thanks to FatGeek).
Changed the code of the TreeEng class. Most of the code there is about tree balancing (so-called rotations) upon its change, and this part doesn't depend on the KEY template argument. Hence, I've extracted it into another non-template class. This way, we shrink the size of the code.
Made the implementation compile at warning level 4. Some warnings however are still there :)
PhoneBook demo application. Stores phonebook entries by some name, every entry has several numbers. You can search by both the entry name and the number. Also maintains the MRU entries/numbers.
A test for the tree. Its implementation may look complicated; however, it works correctly.