Contents
Memory allocation in C/C++ (via malloc
or new
) can take a lot of time.
To make things worse, the memory will fragment over time, so an application's performance will decrease when it is running over a long period of time and/or performing a lot of memory (de-)allocations. Especially if you are allocating very small amounts of memory very often, the heap becomes fragmented.
The solution: your own Memory Pool
A (possible) solution can be a Memory Pool.
A "Memory Pool" allocates a big amount of memory on startup, and will separate this block into smaller chunks. Every time you request memory from the pool, it is taken from the previously allocated chunks, and not from the OS. The biggest advantages are:
- Very little (to none) heap-fragmentation
- Faster than the "normal" memory (de-)allocation (e.g., via
malloc
, new
, etc.)
Additionally, you get these benefits:
- Check if an arbitrary pointer is inside the Memory Pool
- Write a "Heap-Dump" to your HDD (great for post-mortem debugging, etc.)
- Some kind of "memory-leak detection": the Memory Pool will throw an assertion when you have not freed all previously allocated memory
Let's have a look at the Memory Pool UML schema:
Memory Pool UML schema
This schema shows only a small section of the CMemoryPool
class, have a look at the Doxygen-generated documentation for a detailed class-description.
So, how does it actually work?
A word about the MemoryChunks
As you can see from the UML schema, the Memory Pool administrates a pointer to SMemoryChunk
structs (m_ptrFirstChunk
, m_ptrLastChunk
, and m_ptrCursorChunk
). These chunks establish a linked list of memory chunks. Each points to the next chunk in the list. When a memory block gets allocated from the OS, it will be entirely managed by SMemoryChunk
s. Let's have a closer look at such a chunk:
typedef struct SMemoryChunk
{
TByte *Data ;
std::size_t DataSize ;
std::size_t UsedSize ;
bool IsAllocationChunk ;
SMemoryChunk *Next ;
} SMemoryChunk ;
Each chunks holds a pointer to:
- a small piece of memory (
Data
),
- the total size of memory available starting from that chunk (
DataSize
),
- the actual used size (
UsedSize
),
- and a pointer to the next chunk in the List (
Next
).
The first step: pre-allocating the memory
When you call the CMemoryPool
constructor, the Memory Pool will allocate its first (big) memory-chunk from the OS.
CMemoryPool::CMemoryPool(const std::size_t &sInitialMemoryPoolSize,
const std::size_t &sMemoryChunkSize,
const std::size_t &sMinimalMemorySizeToAllocate,
bool bSetMemoryData)
{
m_ptrFirstChunk = NULL ;
m_ptrLastChunk = NULL ;
m_ptrCursorChunk = NULL ;
m_sTotalMemoryPoolSize = 0 ;
m_sUsedMemoryPoolSize = 0 ;
m_sFreeMemoryPoolSize = 0 ;
m_sMemoryChunkSize = sMemoryChunkSize ;
m_uiMemoryChunkCount = 0 ;
m_uiObjectCount = 0 ;
m_bSetMemoryData = bSetMemoryData ;
m_sMinimalMemorySizeToAllocate = sMinimalMemorySizeToAllocate ;
AllocateMemory(sInitialMemoryPoolSize) ;
}
The usual initialization of all class members is done here, and AllocateMemory
will finally request the memory from the OS.
<CODE>bool</CODE> CMemoryPool::AllocateMemory(const std::size_t &sMemorySize)
{
std::size_t sBestMemBlockSize = CalculateBestMemoryBlockSize(sMemorySize) ;
TByte *ptrNewMemBlock = (TByte *) malloc(sBestMemBlockSize) ;
...
So, how does the Memory Pool manage this data?
The second step: segmentation of allocated memory
As mentioned earlier, the Memory Pool manages all data in SMemoryChunk
s. After requesting memory from the OS, there is no connection between our chunks and the actual memory block yet:
Memory Pool after initial allocation
We need to allocate an array of SMemoryChunk
structs to manage that memory block:
...
unsigned int uiNeededChunks = CalculateNeededChunks(sMemorySize) ;
SMemoryChunk *ptrNewChunks =
(SMemoryChunk *) malloc((uiNeededChunks * sizeof(SMemoryChunk))) ;
assert(((ptrNewMemBlock) && (ptrNewChunks))
&& "Error : System ran out of Memory") ;
...
CalculateNeededChunks()
will calculate the number of chunks needed to manage the given amount of memory. After allocating the chunks (via malloc
), ptrNewChunks
points to an array of SMemoryChunk
s. Note, that the chunks in the array are holding only junk-data, since we have not assigned the chunk-members to any meaningful data. The Memory Pool-"Heap" will look like this:
Memory Pool after SMemoryChunk
allocation
Still, there is no connection between the data block and the chunks. But AllocateMemory()
will take care of it now. LinkChunksToData()
will finally link the memory block and the chunks together. This will also assign valid data to each chunk-member...
...
return LinkChunksToData(ptrNewChunks, uiNeededChunks, ptrNewMemBlock) ;
Let's have a closer look at LinkChunksToData()
:
bool CMemoryPool::LinkChunksToData(SMemoryChunk *ptrNewChunks,
unsigned int uiChunkCount, TByte *ptrNewMemBlock)
{
SMemoryChunk *ptrNewChunk = NULL ;
unsigned int uiMemOffSet = 0 ;
bool bAllocationChunkAssigned = false ;
for(unsigned int i = 0; i < uiChunkCount; i++)
{
if(!m_ptrFirstChunk)
{
m_ptrFirstChunk = SetChunkDefaults(&(ptrNewChunks[0])) ;
m_ptrLastChunk = m_ptrFirstChunk ;
m_ptrCursorChunk = m_ptrFirstChunk ;
}
else
{
ptrNewChunk = SetChunkDefaults(&(ptrNewChunks[i])) ;
m_ptrLastChunk->Next = ptrNewChunk ;
m_ptrLastChunk = ptrNewChunk ;
}
uiMemOffSet = (i * ((unsigned int) m_sMemoryChunkSize)) ;
m_ptrLastChunk->Data = &(ptrNewMemBlock[uiMemOffSet]) ;
if(!bAllocationChunkAssigned)
{
m_ptrLastChunk->IsAllocationChunk = true ;
bAllocationChunkAssigned = true ;
}
}
return RecalcChunkMemorySize(m_ptrFirstChunk, m_uiMemoryChunkCount) ;
}
Let's have a step-by-step look at this important method: the first lines check if there are already chunks available in the list:
...
if(!m_ptrFirstChunk)
...
In the beginning, this is not the case. So we assign our internal class-members for the first time:
...
m_ptrFirstChunk = SetChunkDefaults(&(ptrNewChunks[0])) ;
m_ptrLastChunk = m_ptrFirstChunk ;
m_ptrCursorChunk = m_ptrFirstChunk ;
...
m_ptrFirstChunk
points now to the first chunk in the chunks-array. Every chunk manages exactly m_sMemoryChunkSize
bytes from the memory block. An "offset"-value will be calculated so that each chunk will point to a specific portion of the memory block.
uiMemOffSet = (i * ((unsigned int) m_sMemoryChunkSize)) ;
m_ptrLastChunk->Data = &(ptrNewMemBlock[uiMemOffSet]) ;
Additionally, every new SMemoryChunk
from the array will be added to the last element of the linked list (and will finally become the last element itself):
...
m_ptrLastChunk->Next = ptrNewChunk ;
m_ptrLastChunk = ptrNewChunk ;
...
In the subsequent "for
loop" passes, the memory pool will successively assign valid data to all chunks in the array.
Memory and chunks linked together, pointing to valid data
Finally, we have to recalculate the total amount of memory each chunk can manage. This is quite time-consuming, and must be done every time new memory from the OS is added to the memory pool. The total amount of memory for each chunk will be assigned to the DataSize
member of the chunk.
bool CMemoryPool::RecalcChunkMemorySize(SMemoryChunk *ptrChunk,
unsigned int uiChunkCount)
{
unsigned int uiMemOffSet = 0 ;
for(unsigned int i = 0; i < uiChunkCount; i++)
{
if(ptrChunk)
{
uiMemOffSet = (i * ((unsigned int) m_sMemoryChunkSize)) ;
ptrChunk->DataSize =
(((unsigned int) m_sTotalMemoryPoolSize) - uiMemOffSet) ;
ptrChunk = ptrChunk->Next ;
}
else
{
assert(false && "Error : ptrChunk == NULL") ;
return false ;
}
}
return true ;
}
After RecalcChunkMemorySize
, every chunk knows the amount of free memory it is pointing at. So, it becomes very easy to determine if a chunk is able to hold a specific amount of memory: when the DataSize
member is greater (or equal) than the requested amount of memory and the UsedSize
member is 0, then the chunk is capable of holding the memory. Finally, the memory segmentation is finished. To make things less abstract, let's assume that the memory pool contains 600 bytes and every chunk is holding 100 bytes.
Memory segmentation finished. Each chunk manages exactly 100 bytes
The third step: requesting memory from the memory pool
So, what happens if the user requests memory from the pool? Initially, all data in the memory pool is freely available:
All memory blocks are available
Let's have a look at GetMemory
:
void *CMemoryPool::GetMemory(const std::size_t &sMemorySize)
{
std::size_t sBestMemBlockSize =
CalculateBestMemoryBlockSize(sMemorySize) ;
SMemoryChunk *ptrChunk = NULL ;
while(!ptrChunk)
{
ptrChunk = FindChunkSuitableToHoldMemory(sBestMemBlockSize) ;
<CODE>if</CODE>(!ptrChunk)
{
sBestMemBlockSize = MaxValue(sBestMemBlockSize,
CalculateBestMemoryBlockSize(m_sMinimalMemorySizeToAllocate)) ;
AllocateMemory(sBestMemBlockSize) ;
}
}
m_sUsedMemoryPoolSize += sBestMemBlockSize ;
m_sFreeMemoryPoolSize -= sBestMemBlockSize ;
m_uiObjectCount++ ;
SetMemoryChunkValues(ptrChunk, sBestMemBlockSize) ;
return ((void *) ptrChunk->Data) ;
}
When the user requests memory from the memory pool, it will search the list for a chunk which is capable of holding the requested amount. That means:
- the
DataSize
of that chunk must be greater or equal to the requested amount of memory
- the
UsedSize
of the chunk must be 0
This is done by the FindChunkSuitableToHoldMemory
method. If it returns NULL
, then no memory is available in the memory pool. This will lead to a AllocateMemory
call (discussed above), which will request more memory from the OS. If the return value is not NULL
, a valid chunk was found. SetMemoryChunkValues
will adjust the internal values of the chunk, and finally the Data
pointer is returned to the user...
void CMemoryPool::SetMemoryChunkValues(SMemoryChunk *ptrChunk,
const std::size_t &sMemBlockSize)
{
if(ptrChunk)
{
ptrChunk->UsedSize = sMemBlockSize ;
}
...
}
Let's assume, that the user requested 250 bytes from our memory pool:
Memory in use
As you can see, each memory chunk manages 100 bytes, so 250 bytes just don’t fit in there. What will happen? Well, GetMemory
will return the Data
pointer from the first chunk and sets its UsedSize
member to 300 bytes, since 300 bytes is the smallest amount of memory which can be managed by the chunks and is >= 250 bytes. The (300 - 250 = 50) bytes left are the so called "memory overhead" of the memory pool (in addition to the the memory needed by the chunks themselves). This is not as bad as it seems, because the memory can be still used (it is still in the memory pool).
When FindChunkSuitableToHoldMemory
searches for valid chunks, it will jump only from an "empty-chunk" to another "empty-chunk". That means, if someone requests another memory-chunk, the fourth chunk in the example (the one holding 300 bytes) will be the next "valid" chunk.
Jump to next valid chunk
Using the code is simple and straightforward: just include "CMemoryPool.h" in your application, and add all related files to your IDE/Makefile:
- CMemoryPool.h
- CMemoryPool.cpp
- IMemoryBlock.h
- SMemoryChunk.h
You just have to create an instance of the CMemoryPool
class, and you can start allocating memory from it. All memory pool configuration is done in the CMemoryPool
constructor (by optional parameters). Have a look at the header file ("CMemoryPool.h") or the Doxygen-doku. All files are heavily (Doxygen-)documented.
Usage example
MemPool::CMemoryPool *g_ptrMemPool = new MemPool::CMemoryPool() ;
char *ptrCharArray = (char *) g_ptrMemPool->GetMemory(100) ;
...
g_ptrMemPool->FreeMemory(ptrCharArray, 100) ;
delete g_ptrMemPool ;
Memory dump
You can write a "memory dump" to your HDD at any time via WriteMemoryDumpToFile(strFileName)
. Let's look at a the constructor of a simple test-class (with overloaded new
and delete
operators using the memory pool):
MyTestClass::MyTestClass()
{
m_cMyArray[0] = 'H' ;
m_cMyArray[1] = 'e' ;
m_cMyArray[2] = 'l' ;
m_cMyArray[3] = 'l' ;
m_cMyArray[4] = 'o' ;
m_cMyArray[5] = NULL ;
m_strMyString = "This is a small Test-String" ;
m_iMyInt = 12345 ;
m_fFloatValue = 23456.7890f ;
m_fDoubleValue = 6789.012345 ;
Next = this ;
}
MyTestClass *ptrTestClass = new MyTestClass ;
g_ptrMemPool->WriteMemoryDumpToFile("MemoryDump.bin") ;
Have a look at the memory dump-file ("MemoryDump.bin"):
As you can see, these are all values from the MyTestClass
class-members in the memory dump. Obviously, the "Hello"-string (m_cMyArray
) is there, so also the integer value m_iMyInt
(3930 0000 = 0x3039 = 12345 decimal) etc.. This is great for post-mortem debugging...
I have done some very simple speed-testing on Windows (via timeGetTime()
), but the results should show that the memory pool can greatly speed up the application. All tests have been made with Microsoft Visual Studio .NET 2003 in debug build (test-machine: Intel Pentium IV Processor (32 bit), 1GB RAM, MS Windows XP Professional).
for(unsigned int j = 0; j < TestCount; j++)
{
char *ptrArray = (char *) g_ptrMemPool->GetMemory(ArraySize) ;
g_ptrMemPool->FreeMemory(ptrArray, ArraySize) ;
}
for(unsigned int j = 0; j < TestCount; j++)
{
char *ptrArray = (char *) malloc(ArraySize) ;
free(ptrArray) ;
}
Results for the "array-test"
for(unsigned int j = 0; j < TestCount; j++)
{
MyTestClass *ptrTestClass = new MyTestClass ;
delete ptrTestClass ;
}
Results for the "classes-test" (overloaded new
/delete
operators)
The code has been tested under MS Windows and Linux with the following C++ compiler(s):
- Microsoft Visual C++ 6.0
- Microsoft Visual C++ .NET 2003
- MinGW (GCC) 3.4.4 (Windows)
- GCC 4.0.X (Debian GNU Linux)
Project files for Microsoft Visual C++ 6.0 (*.dsw, *.dsp) and Microsoft Visual C++ .NET 2003 (*.sln, *.vcproj) are included in the download. This memory pool uses only ANSI/ISO C++, so it should compile on any OS with a decent C++ compiler. There should be no problem using it on a 64-bit processor.
Note: The memory pool is not thread-safe!
This memory pool is far from being perfect ;-) The ToDo-list includes:
- For huge amounts of memory, the memory-"overhead" can be quite large.
- Some
CalculateNeededChunks
calls can be stripped off by redesigning some methods => Even more speed up. ;-)
- More stability tests (especially for very long-running applications).
- Make it thread-safe.
- 05.09.2006: Initial release.
EoF