One way of avoiding memory fragmentation: using templates and std::vector to implement a memory pool.
Working on a server application that is expected to run 24/7/365 processing "requests" from different clients (with different needs),
we ran into a problem: the application was getting slower and slower over time. Asking users to restart the server from time to time was out of the question.
After days of profiling and debugging and (yes) praying for some sign of what was wrong, I narrowed it down to 'new' and 'delete': fragmentation.
We deal with a non-constant number of requests that differ in size and have a life cycle of 'receive (create) - process (kill)'.
The problem was the request data size. Let me elaborate.
As the requests differ in size, small memory allocations are interspersed with medium and large ones.
The server is multi-threaded, so processing those requests takes different amounts of time and results in blocks being freed in a random order.
There is no way we can ensure that memory is freed in the optimal order (the optimal sequence being the exact reverse of the allocation order).
The end result is a large number of small blocks that, when a medium or large allocation request comes in, have to be skipped or merged,
and that in turn results in increasing delays whenever a new allocation is requested.
I was faced with two possible approaches: fixing it globally, or on a per-request (per-class) basis.
Globally would mean overloading the global operator new and delete, writing a completely new memory manager that all classes and functions would use,
and making sure that my memory manager is initialized first (before the C++ run-time system makes any memory requests, but
after it creates the application heap)... too much sensitive work, prone to errors and mistakes.
(No one can ever say I was ignorant of the project-wide implications...)
So the decision was to sort it out on a per-request basis (big surprise there!)
I decided to preallocate memory for the maximum number of requests that can come in at any given time (in my case, no more than 2000 per second).
As I wanted to write one class to handle all cases while still working on a per-request basis, templates were as good a choice as any. The class needs to hand out memory when
needed and reuse that same memory when it is no longer needed, using a vector to simulate a stack.
The first version looked like this:
template< class T >
class _MemoryPool
{
public:
    _MemoryPool( int nSize ) : m_Items( new T[ nSize ] )
    {
        m_FreeItems.reserve( nSize );
        for( int i = 0; i < nSize; ++i )
            m_FreeItems.push_back( &m_Items[ i ] );
    }
    ~_MemoryPool() { delete [] m_Items; }   // array delete, not plain 'delete m_Items'
    T * Allocate()
    {
        T * pItem = m_FreeItems.back();
        m_FreeItems.pop_back();             // remove it from the free list
        return pItem;
    }
    void Free( T * pItem ) { m_FreeItems.push_back( pItem ); }
private:
    std::vector< T * > m_FreeItems;
    T * m_Items;
};
In the constructor of our memory pool class, the requested number of objects is created and their addresses are loaded into a std::vector.
A new allocation request hands out the top object from our vector (removing it from the vector in the process). When an object is no longer needed, it is
put back on top of the vector. Thus, the vector here simulates a stack with push-pop functionality.
Requests are data structures, something like this:

struct RequestData1
{
    unsigned long nActionBits;
    // ... the rest of the request data
};
Now, we create the memory pool with the required number of objects:
_MemoryPool< RequestData1 > g_poolRequest1(2000);
So far so good; now all I need to do is write:

RequestData1 * req1 = g_poolRequest1.Allocate();
new ( req1 ) RequestData1( /* call constructor with some parameters, if needed */ );
// ... process the request ...
req1->~RequestData1();   // run the destructor explicitly; Free() will not do it
g_poolRequest1.Free( req1 );
Well, I did not want to go to every piece of code where I create requests and change it to this construct (and to the release side as well).
The solution: adding operator new and operator delete to each request class. Simple.
Now the request class looks like this:
struct RequestData1
{
    unsigned long nActionBits;
    void * operator new( size_t )
        { return g_poolRequest1.Allocate(); }
    void operator delete( void * p )
        { g_poolRequest1.Free( reinterpret_cast< RequestData1 * >( p ) ); }
};
And all that's left to do now is recompile... (and pray that request number 2001 never comes...)
That is all for now; the test application is not included. Just as a side note, the application even gained speed using this approach, but that was not the end goal of this exercise.
More fragmentation problems were still present (point-finger-at-strings), but the solution for those belongs in some future article, perhaps.