C++ Memory Pool

DanDanger2000

3.55/5 (34 votes)

Sep 8, 2006

CPOL

8 min read

260127

11309

A (simple) C++ memory pool.

Console output (ScreenShot)

Introduction
How does it work?
Example
Using the code
Points of interest
Speed measurement
About the code...
ToDo
History

Introduction

Memory allocation in C/C++ (via malloc or new) can take a lot of time.

To make things worse, the memory will fragment over time, so an application's performance will decrease when it is running over a long period of time and/or performing a lot of memory (de-)allocations. Especially if you are allocating very small amounts of memory very often, the heap becomes fragmented.

The solution: your own Memory Pool

A (possible) solution can be a Memory Pool.

A "Memory Pool" allocates a big amount of memory on startup, and will separate this block into smaller chunks. Every time you request memory from the pool, it is taken from the previously allocated chunks, and not from the OS. The biggest advantages are:

Very little (to none) heap-fragmentation
Faster than the "normal" memory (de-)allocation (e.g., via malloc, new, etc.)

Additionally, you get these benefits:

Check if an arbitrary pointer is inside the Memory Pool
Write a "Heap-Dump" to your HDD (great for post-mortem debugging, etc.)
Some kind of "memory-leak detection": the Memory Pool will throw an assertion when you have not freed all previously allocated memory

How does it work?

Let's have a look at the Memory Pool UML schema:

MemoryPool UML schema

Memory Pool UML schema

This schema shows only a small section of the CMemoryPool class, have a look at the Doxygen-generated documentation for a detailed class-description.

So, how does it actually work?

A word about the MemoryChunks

As you can see from the UML schema, the Memory Pool administrates a pointer to SMemoryChunk structs (m_ptrFirstChunk, m_ptrLastChunk, and m_ptrCursorChunk). These chunks establish a linked list of memory chunks. Each points to the next chunk in the list. When a memory block gets allocated from the OS, it will be entirely managed by SMemoryChunks. Let's have a closer look at such a chunk:

typedef struct SMemoryChunk
{
  TByte *Data ;             // The actual Data
  std::size_t DataSize ;    // Size of the "Data"-Block
  std::size_t UsedSize ;    // actual used Size
  bool IsAllocationChunk ;  // true, when this MemoryChunks
                            // Points to a "Data"-Block
                            // which can be deallocated via "free()"
  SMemoryChunk *Next ;      // Pointer to the Next MemoryChunk
                            // in the List (may be NULL)

} SMemoryChunk ;

Each chunks holds a pointer to:

a small piece of memory (Data),
the total size of memory available starting from that chunk (DataSize),
the actual used size (UsedSize),
and a pointer to the next chunk in the List (Next).

The first step: pre-allocating the memory

When you call the CMemoryPool constructor, the Memory Pool will allocate its first (big) memory-chunk from the OS.

/******************
Constructor
******************/
CMemoryPool::CMemoryPool(const std::size_t &sInitialMemoryPoolSize,
                         const std::size_t &sMemoryChunkSize,
                         const std::size_t &sMinimalMemorySizeToAllocate,
                         bool bSetMemoryData)
{
  m_ptrFirstChunk  = NULL ;
  m_ptrLastChunk   = NULL ;
  m_ptrCursorChunk = NULL ;

  m_sTotalMemoryPoolSize = 0 ;
  m_sUsedMemoryPoolSize  = 0 ;
  m_sFreeMemoryPoolSize  = 0 ;

  m_sMemoryChunkSize   = sMemoryChunkSize ;
  m_uiMemoryChunkCount = 0 ;
  m_uiObjectCount      = 0 ;

  m_bSetMemoryData               = bSetMemoryData ;
  m_sMinimalMemorySizeToAllocate = sMinimalMemorySizeToAllocate ;

  // Allocate the Initial amount of Memory from the Operating-System...
  AllocateMemory(sInitialMemoryPoolSize) ;
}

The usual initialization of all class members is done here, and AllocateMemory will finally request the memory from the OS.

/******************
AllocateMemory
******************/
<CODE>bool CMemoryPool::AllocateMemory(const std::size_t &sMemorySize)
{
  std::size_t sBestMemBlockSize = CalculateBestMemoryBlockSize(sMemorySize) ;
  // allocate from Operating System
  TByte *ptrNewMemBlock = (TByte *) malloc(sBestMemBlockSize) ;
  ...

So, how does the Memory Pool manage this data?

The second step: segmentation of allocated memory

As mentioned earlier, the Memory Pool manages all data in SMemoryChunks. After requesting memory from the OS, there is no connection between our chunks and the actual memory block yet:

MemoryPool after inital allocation

Memory Pool after initial allocation

We need to allocate an array of SMemoryChunk structs to manage that memory block:

//(AllocateMemory() continued) : 
  ...
  unsigned int uiNeededChunks = CalculateNeededChunks(sMemorySize) ;
  // allocate Chunk-Array to Manage the Memory
  SMemoryChunk *ptrNewChunks = 
    (SMemoryChunk *) malloc((uiNeededChunks * sizeof(SMemoryChunk))) ;
  assert(((ptrNewMemBlock) && (ptrNewChunks)) 
                           && "Error : System ran out of Memory") ;
  ...

CalculateNeededChunks() will calculate the number of chunks needed to manage the given amount of memory. After allocating the chunks (via malloc), ptrNewChunks points to an array of SMemoryChunks. Note, that the chunks in the array are holding only junk-data, since we have not assigned the chunk-members to any meaningful data. The Memory Pool-"Heap" will look like this:

After SMemoryChunk allocation

Memory Pool after SMemoryChunk allocation

Still, there is no connection between the data block and the chunks. But AllocateMemory() will take care of it now. LinkChunksToData() will finally link the memory block and the chunks together. This will also assign valid data to each chunk-member...

//(AllocateMemory() continued) : 
  ...
  // Associate the allocated Memory-Block with the Linked-List of MemoryChunks
  return LinkChunksToData(ptrNewChunks, uiNeededChunks, ptrNewMemBlock) ;

Let's have a closer look at LinkChunksToData():

/******************
LinkChunksToData
******************/
bool CMemoryPool::LinkChunksToData(SMemoryChunk *ptrNewChunks, 
     unsigned int uiChunkCount, TByte *ptrNewMemBlock)
{
  SMemoryChunk *ptrNewChunk = NULL ;
  unsigned int uiMemOffSet = 0 ;
  bool bAllocationChunkAssigned = false ;
  for(unsigned int i = 0; i < uiChunkCount; i++)
  {
    if(!m_ptrFirstChunk)
    {
      m_ptrFirstChunk = SetChunkDefaults(&(ptrNewChunks[0])) ;
      m_ptrLastChunk = m_ptrFirstChunk ;
      m_ptrCursorChunk = m_ptrFirstChunk ;
    }
    else
    {
      ptrNewChunk = SetChunkDefaults(&(ptrNewChunks[i])) ;
      m_ptrLastChunk->Next = ptrNewChunk ;
      m_ptrLastChunk = ptrNewChunk ;
    }
    
    uiMemOffSet = (i * ((unsigned int) m_sMemoryChunkSize)) ;
    m_ptrLastChunk->Data = &(ptrNewMemBlock[uiMemOffSet]) ;

    // The first Chunk assigned to the new Memory-Block will be 
    // a "AllocationChunk". This means, this Chunks stores the
    // "original" Pointer to the MemBlock and is responsible for
    // "free()"ing the Memory later....
    if(!bAllocationChunkAssigned)
    {
      m_ptrLastChunk->IsAllocationChunk = true ;
      bAllocationChunkAssigned = true ;
    }
  }
  return RecalcChunkMemorySize(m_ptrFirstChunk, m_uiMemoryChunkCount) ;
}

Let's have a step-by-step look at this important method: the first lines check if there are already chunks available in the list:

  ...
  if(!m_ptrFirstChunk)
  ...

In the beginning, this is not the case. So we assign our internal class-members for the first time:

  ...
  m_ptrFirstChunk = SetChunkDefaults(&(ptrNewChunks[0])) ;
  m_ptrLastChunk = m_ptrFirstChunk ;
  m_ptrCursorChunk = m_ptrFirstChunk ;
  ...

m_ptrFirstChunk points now to the first chunk in the chunks-array. Every chunk manages exactly m_sMemoryChunkSize bytes from the memory block. An "offset"-value will be calculated so that each chunk will point to a specific portion of the memory block.

  uiMemOffSet = (i * ((unsigned int) m_sMemoryChunkSize)) ;
  m_ptrLastChunk->Data = &(ptrNewMemBlock[uiMemOffSet]) ;

Additionally, every new SMemoryChunk from the array will be added to the last element of the linked list (and will finally become the last element itself):

  ...
  m_ptrLastChunk->Next = ptrNewChunk ;
  m_ptrLastChunk = ptrNewChunk ;
  ...

In the subsequent "for loop" passes, the memory pool will successively assign valid data to all chunks in the array.

Memory and chunks linked togehter

Memory and chunks linked together, pointing to valid data

Finally, we have to recalculate the total amount of memory each chunk can manage. This is quite time-consuming, and must be done every time new memory from the OS is added to the memory pool. The total amount of memory for each chunk will be assigned to the DataSize member of the chunk.

/******************
RecalcChunkMemorySize
******************/
bool CMemoryPool::RecalcChunkMemorySize(SMemoryChunk *ptrChunk, 
                  unsigned int uiChunkCount)
{
  unsigned int uiMemOffSet = 0 ;
  for(unsigned int i = 0; i < uiChunkCount; i++)
  {
    if(ptrChunk)
    {
      uiMemOffSet = (i * ((unsigned int) m_sMemoryChunkSize)) ;
      ptrChunk->DataSize = 
        (((unsigned int) m_sTotalMemoryPoolSize) - uiMemOffSet) ;
      ptrChunk = ptrChunk->Next ;
    }
    else
    {
     assert(false && "Error : ptrChunk == NULL") ;
     return false ;
    }
  }
  return true ;
}

After RecalcChunkMemorySize, every chunk knows the amount of free memory it is pointing at. So, it becomes very easy to determine if a chunk is able to hold a specific amount of memory: when the DataSize member is greater (or equal) than the requested amount of memory and the UsedSize member is 0, then the chunk is capable of holding the memory. Finally, the memory segmentation is finished. To make things less abstract, let's assume that the memory pool contains 600 bytes and every chunk is holding 100 bytes.

Memory-Segmentation finished

Memory segmentation finished. Each chunk manages exactly 100 bytes

The third step: requesting memory from the memory pool

So, what happens if the user requests memory from the pool? Initially, all data in the memory pool is freely available:

All Memory available

All memory blocks are available

Let's have a look at GetMemory:

/******************
GetMemory
******************/
void *CMemoryPool::GetMemory(const std::size_t &sMemorySize)
{
  std::size_t sBestMemBlockSize = 
    CalculateBestMemoryBlockSize(sMemorySize) ;  
  SMemoryChunk *ptrChunk = NULL ;
  while(!ptrChunk)
  {
    // Is a Chunks available to hold the requested amount of Memory ?
    ptrChunk = FindChunkSuitableToHoldMemory(sBestMemBlockSize) ;
    <CODE>if(!ptrChunk)
    {
      // No chunk can be found
      // => Memory-Pool is to small. We have to request 
      //    more Memory from the Operating-System....
      sBestMemBlockSize = MaxValue(sBestMemBlockSize, 
        CalculateBestMemoryBlockSize(m_sMinimalMemorySizeToAllocate)) ;
      AllocateMemory(sBestMemBlockSize) ;
    }
  }

  // Finally, a suitable Chunk was found.
  // Adjust the Values of the internal "TotalSize"/"UsedSize" Members and 
  // the Values of the MemoryChunk itself.
  m_sUsedMemoryPoolSize += sBestMemBlockSize ;
  m_sFreeMemoryPoolSize -= sBestMemBlockSize ;
  m_uiObjectCount++ ;
  SetMemoryChunkValues(ptrChunk, sBestMemBlockSize) ;

  // eventually, return the Pointer to the User
  return ((void *) ptrChunk->Data) ;
}

When the user requests memory from the memory pool, it will search the list for a chunk which is capable of holding the requested amount. That means:

the DataSize of that chunk must be greater or equal to the requested amount of memory
the UsedSize of the chunk must be 0

This is done by the FindChunkSuitableToHoldMemory method. If it returns NULL, then no memory is available in the memory pool. This will lead to a AllocateMemory call (discussed above), which will request more memory from the OS. If the return value is not NULL, a valid chunk was found. SetMemoryChunkValues will adjust the internal values of the chunk, and finally the Data pointer is returned to the user...

/******************
SetMemoryChunkValues
******************/
void CMemoryPool::SetMemoryChunkValues(SMemoryChunk *ptrChunk, 
     const std::size_t &sMemBlockSize)
{
  if(ptrChunk) 
  {
    ptrChunk->UsedSize = sMemBlockSize ;
  }
  ...
}

Example

Let's assume, that the user requested 250 bytes from our memory pool:

Memory in use

Memory in use

As you can see, each memory chunk manages 100 bytes, so 250 bytes just don’t fit in there. What will happen? Well, GetMemory will return the Data pointer from the first chunk and sets its UsedSize member to 300 bytes, since 300 bytes is the smallest amount of memory which can be managed by the chunks and is >= 250 bytes. The (300 - 250 = 50) bytes left are the so called "memory overhead" of the memory pool (in addition to the the memory needed by the chunks themselves). This is not as bad as it seems, because the memory can be still used (it is still in the memory pool).

When FindChunkSuitableToHoldMemory searches for valid chunks, it will jump only from an "empty-chunk" to another "empty-chunk". That means, if someone requests another memory-chunk, the fourth chunk in the example (the one holding 300 bytes) will be the next "valid" chunk.

Jump to next valid chunk

Jump to next valid chunk

Using the code

Using the code is simple and straightforward: just include "CMemoryPool.h" in your application, and add all related files to your IDE/Makefile:

CMemoryPool.h
CMemoryPool.cpp
IMemoryBlock.h
SMemoryChunk.h

You just have to create an instance of the CMemoryPool class, and you can start allocating memory from it. All memory pool configuration is done in the CMemoryPool constructor (by optional parameters). Have a look at the header file ("CMemoryPool.h") or the Doxygen-doku. All files are heavily (Doxygen-)documented.

Usage example

MemPool::CMemoryPool *g_ptrMemPool = new MemPool::CMemoryPool() ;

char *ptrCharArray = (char *) g_ptrMemPool->GetMemory(100) ;
...
g_ptrMemPool->FreeMemory(ptrCharArray, 100) ;

delete g_ptrMemPool ;

Points of interest

Memory dump

You can write a "memory dump" to your HDD at any time via WriteMemoryDumpToFile(strFileName). Let's look at a the constructor of a simple test-class (with overloaded new and delete operators using the memory pool):

/******************
Constructor
******************/
MyTestClass::MyTestClass()
{
   m_cMyArray[0] = 'H' ;
   m_cMyArray[1] = 'e' ;
   m_cMyArray[2] = 'l' ;
   m_cMyArray[3] = 'l' ;
   m_cMyArray[4] = 'o' ;
   m_cMyArray[5] = NULL ;
   m_strMyString = "This is a small Test-String" ;
   m_iMyInt = 12345 ;

   m_fFloatValue = 23456.7890f ;
   m_fDoubleValue = 6789.012345 ;

   Next = this ;
}

MyTestClass *ptrTestClass = new MyTestClass ; 
g_ptrMemPool->WriteMemoryDumpToFile("MemoryDump.bin") ;

Have a look at the memory dump-file ("MemoryDump.bin"):

As you can see, these are all values from the MyTestClass class-members in the memory dump. Obviously, the "Hello"-string (m_cMyArray) is there, so also the integer value m_iMyInt (3930 0000 = 0x3039 = 12345 decimal) etc.. This is great for post-mortem debugging...

Speed measurement

I have done some very simple speed-testing on Windows (via timeGetTime()), but the results should show that the memory pool can greatly speed up the application. All tests have been made with Microsoft Visual Studio .NET 2003 in debug build (test-machine: Intel Pentium IV Processor (32 bit), 1GB RAM, MS Windows XP Professional).

//Array-test (Memory Pool): 
for(unsigned int j = 0; j < TestCount; j++)
{
    // ArraySize = 1000
    char *ptrArray = (char *) g_ptrMemPool->GetMemory(ArraySize)  ;
    g_ptrMemPool->FreeMemory(ptrArray, ArraySize) ;
}
  
//Array-test (Heap): 
for(unsigned int j = 0; j < TestCount; j++)
{
    // ArraySize = 1000
    char *ptrArray = (char *) malloc(ArraySize)  ;
    free(ptrArray) ;
}

Speed test Results for the array-test

Results for the "array-test"

//Class-Test for MemoryPool and Heap (overloaded new/delete)
for(unsigned int j = 0; j < TestCount; j++)
{
    MyTestClass *ptrTestClass = new MyTestClass ;
    delete ptrTestClass ;
}

Speed test Results for the classes-test

Results for the "classes-test" (overloaded new/delete operators)

About the code...

The code has been tested under MS Windows and Linux with the following C++ compiler(s):

Microsoft Visual C++ 6.0
Microsoft Visual C++ .NET 2003
MinGW (GCC) 3.4.4 (Windows)
GCC 4.0.X (Debian GNU Linux)

Project files for Microsoft Visual C++ 6.0 (*.dsw, *.dsp) and Microsoft Visual C++ .NET 2003 (*.sln, *.vcproj) are included in the download. This memory pool uses only ANSI/ISO C++, so it should compile on any OS with a decent C++ compiler. There should be no problem using it on a 64-bit processor.

Note: The memory pool is not thread-safe!

ToDo

This memory pool is far from being perfect ;-) The ToDo-list includes:

For huge amounts of memory, the memory-"overhead" can be quite large.
Some CalculateNeededChunks calls can be stripped off by redesigning some methods => Even more speed up. ;-)
More stability tests (especially for very long-running applications).
Make it thread-safe.

History

05.09.2006: Initial release.

EoF