As documented elsewhere, .NET memory management consists of two heaps: the Small Object Heap (SOH) for most situations, and the Large Object Heap (LOH) for objects of 85,000 bytes (roughly 83 KB) or larger.
The SOH is garbage collected and compacted in clever ways I won't go into here.
The LOH, on the other hand, is garbage collected, but not compacted by default. If you're working with large objects, then listen up! Since the LOH is not compacted, you'll find that either:
- you're running an x86 program and you get OutOfMemory exceptions, or
- you're running x64, and memory usage becomes unacceptably high.
Welcome to the world of a fragmented heap, just like the good old C++ days.
So where does this leave us? Let's code our way out of this jam!
If you work with streams much, you know that MemoryStream uses an internal byte array and wraps it in Stream clothing. Memory shenanigans made easy! However, if a MemoryStream's buffer gets big, that buffer ends up on the LOH, and you're in trouble. So let's improve on things.
Our first attempt was to create a Stream-like class that used a MemoryStream up to 64 KB, then switched to a FileStream with a temp file after that. Sadly, disk I/O killed the throughput of our application. And just letting MemoryStreams grow and grow caused unacceptably high memory usage.
So let's make a new Stream-derived class, and instead of one internal byte array, let's go with a list of byte arrays, none large enough to end up on the LOH. Simple enough. And let's keep a global ConcurrentQueue of these little byte arrays for our own buffer recycling scheme.
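Here's a minimal sketch of that buffer recycling idea. It is not the attached library's code; the class name SmallBufferPool, the method names Rent and Return, and the 8 KB / 1,000 settings are all assumptions made up for illustration.

```csharp
using System.Collections.Concurrent;

// Hypothetical illustration of small-buffer recycling; not the article's BcMemoryMgr.
public static class SmallBufferPool
{
    public const int BufferSize = 8 * 1024; // assumed size, far below the LOH threshold
    public const int MaxQueued = 1000;      // assumed cap on how many buffers we hold on to

    private static readonly ConcurrentQueue<byte[]> s_buffers = new ConcurrentQueue<byte[]>();

    // Number of buffers currently waiting to be reused.
    public static int QueuedCount => s_buffers.Count;

    // Hand out a recycled buffer if one is queued, otherwise allocate a new one.
    // Recycled buffers are not cleared, so they may contain old data.
    public static byte[] Rent()
    {
        return s_buffers.TryDequeue(out byte[] buffer) ? buffer : new byte[BufferSize];
    }

    // Queue a buffer for reuse. Buffers of the wrong size, or buffers arriving when
    // the queue is full, are simply dropped and left to normal garbage collection.
    public static void Return(byte[] buffer)
    {
        if (buffer == null || buffer.Length != BufferSize)
            return;

        // Count is only approximate under contention, so the cap is a soft limit.
        if (s_buffers.Count < MaxQueued)
            s_buffers.Enqueue(buffer);
    }
}
```

Nothing is pre-allocated here: the queue only ever holds buffers that callers have handed back.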
MemoryStreams are really useful for their dual role of stream and buffer so you can do things like...
string str = Encoding.UTF8.GetString(memStream.GetBuffer(), 0, (int)memStream.Length);
So let's also work with MemoryStreams, and let's keep another ConcurrentQueue of these streams for our own recycling scheme. When a stream to recycle is too big, let's chop it down before enqueuing it. As long as a stream stays under 8X the little buffer size, we just let it ride, and the folks requesting streams get something with a Capacity between the little buffer size and 8X the little buffer size. If a stream ends up on the LOH, we pull it back to the SOH when it gets recycled.
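The chop-down rule can be sketched like this. Again, this is an illustration under assumptions, not the library's implementation: StreamRecycler, Rent, Return, and the constants are hypothetical names and values.

```csharp
using System.Collections.Concurrent;
using System.IO;

// Hypothetical illustration of MemoryStream recycling with a capacity cap.
public static class StreamRecycler
{
    public const int BufferSize = 8 * 1024;               // assumed little buffer size
    public const int MaxStreamCapacity = 8 * BufferSize;  // 64 KB, still on the SOH
    public const int MaxQueued = 1000;                    // assumed queue cap

    private static readonly ConcurrentQueue<MemoryStream> s_streams = new ConcurrentQueue<MemoryStream>();

    // Reuse a queued stream if there is one, otherwise create a small expandable one.
    public static MemoryStream Rent()
    {
        return s_streams.TryDequeue(out MemoryStream stream) ? stream : new MemoryStream(BufferSize);
    }

    // Empty the stream, shrink it if it grew past the cap, and queue it for reuse.
    // Assumes the stream came from Rent (expandable) and has not been disposed.
    public static void Return(MemoryStream stream)
    {
        if (stream == null)
            return;

        stream.Position = 0;
        stream.SetLength(0);

        // Setting Capacity allocates a fresh, small internal array, so a buffer
        // that had landed on the LOH is released back to the garbage collector.
        if (stream.Capacity > MaxStreamCapacity)
            stream.Capacity = BufferSize;

        if (s_streams.Count < MaxQueued)
            s_streams.Enqueue(stream);
    }
}
```

Callers must not Dispose a stream they intend to return, since a disposed MemoryStream cannot be reused.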
Finally, for x86 apps, you should have a timer run garbage collection with LOH defragmentation like so:
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
We run this once a minute for one of our x86 applications. It only takes a few milliseconds to run, and, in conjunction with buffer and stream recycling, memory usage stays manageable.
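A minimal sketch of the timer part, assuming a System.Threading.Timer is acceptable for your application. The class name LohCompactor and its methods are hypothetical; only GCSettings, GCLargeObjectHeapCompactionMode, and GC.Collect are real framework APIs.

```csharp
using System;
using System.Runtime;
using System.Threading;

// Hypothetical helper that periodically requests a one-time LOH compaction.
public static class LohCompactor
{
    // Keep a reference so the timer itself is not garbage collected.
    private static Timer s_timer;

    // Ask for LOH compaction on the next blocking gen 2 collection, then force one.
    public static void CompactNow()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }

    // Run CompactNow on a thread pool thread every 'interval'.
    public static void Start(TimeSpan interval)
    {
        Stop();
        s_timer = new Timer(_ => CompactNow(), null, interval, interval);
    }

    public static void Stop()
    {
        s_timer?.Dispose();
        s_timer = null;
    }
}
```

You would call something like LohCompactor.Start(TimeSpan.FromMinutes(1)) once at startup. How long each pass takes depends on your heap, so measure it in your own application.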
Check out the attached class library for BcMemoryStream and the overall memory manager, BcMemoryMgr. There's a third class, an IDisposable class called BcMemoryStreamUse, which manages recycling of MemoryStreams.
You just tell BcMemoryMgr how big you want the little buffers to be and how many buffers and MemoryStreams to recycle. A buffer size of 8 KB is good because 8 X 8 KB -> 64 KB max size for MemoryStreams, which is under the 85,000-byte LOH threshold. You have to make peace with the max number of objects to keep in the recycling queues. For an x86 program with 8 KB buffers and heavy MemoryStream use, you might not want to allow a worst case of 8 X 8 KB X 10,000 -> 640 MB to get locked up in this system. With a maximum queue length of 1,000, you're only committing to a max of 64 MB, which seems a low price to pay for buffer and stream recycling. To be clear, the memory is not pre-allocated; the max count is just how large the recycling ConcurrentQueues can get before buffers and streams are let go for normal garbage collection.
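If you want to play with the numbers, the worst-case arithmetic above is just a multiplication. The helper below is a hypothetical illustration (RecyclingBudget and WorstCaseBytes are made-up names) that reproduces the stream-queue figures.

```csharp
// Hypothetical helper for the worst-case arithmetic; not part of the class library.
public static class RecyclingBudget
{
    // Worst case for the stream queue: every queued stream is at its maximum
    // capacity of (bufferSize * maxStreamMultiple) bytes.
    public static long WorstCaseBytes(int bufferSize, int maxStreamMultiple, int maxQueueLength)
    {
        return (long)bufferSize * maxStreamMultiple * maxQueueLength;
    }
}
```

WorstCaseBytes(8 * 1024, 8, 10000) comes to 655,360,000 bytes, which is the 640 MB above if you count 1,000 KB per MB. With a queue length of 1,000, it comes to 65,536,000 bytes, the 64 MB figure.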
Let's look at each class in detail.
Let's start with BcMemoryMgr. It's a small static class. It has ConcurrentQueues for recycling buffers and streams, the buffer size, and the max queue length.
Looking at the member functions, you call Init with the buffer size and max queue length, and it calls a SelfTest routine that tests the class library. If the tests don't pass, the code doesn't run...poor man's unit testing. Note that you can specify a zero max queue length, in which case you'll get no recycling, just normal garbage collection. There are buffer functions, including FreeBuffer ... anybody remember malloc and free? There are stream functions, too: FreeStream chops the stream down if it's too big before enqueuing it for reuse.
BcMemoryStream is Stream-derived and implements pretty much the same interface as MemoryStream. One notable exception is that you cannot set the Capacity property. Instead, there is a Reset function you can call to free all buffers in the class, returning the Capacity to zero. The fun code is in Read and Write; Buffer.BlockCopy came in handy. There are extension routines for working with
strings. These were handy when writing
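To give a feel for what Read and Write have to do when the data lives in a list of small arrays, here's a stripped-down sketch. It is not BcMemoryStream: ChunkedBuffer is a hypothetical class, it does not derive from Stream, it allocates its chunks directly instead of recycling them, and it skips argument validation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of reading and writing across a list of small arrays.
public sealed class ChunkedBuffer
{
    private readonly int _chunkSize;
    private readonly List<byte[]> _chunks = new List<byte[]>();

    public ChunkedBuffer(int chunkSize = 8 * 1024)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        _chunkSize = chunkSize;
    }

    public long Length { get; private set; }

    // Not validated in this sketch; callers should keep it non-negative.
    public long Position { get; set; }

    public void Write(byte[] source, int offset, int count)
    {
        while (count > 0)
        {
            int chunkIndex = (int)(Position / _chunkSize);
            int chunkOffset = (int)(Position % _chunkSize);

            // Grow the list one small chunk at a time; no single large array is ever needed.
            while (_chunks.Count <= chunkIndex)
                _chunks.Add(new byte[_chunkSize]);

            // Copy as much as fits in the current chunk, then move on to the next one.
            int n = Math.Min(count, _chunkSize - chunkOffset);
            Buffer.BlockCopy(source, offset, _chunks[chunkIndex], chunkOffset, n);
            offset += n;
            count -= n;
            Position += n;
        }
        if (Position > Length) Length = Position;
    }

    // Returns the number of bytes actually read, which is 0 at the end of the data.
    public int Read(byte[] destination, int offset, int count)
    {
        int total = 0;
        while (count > 0 && Position < Length)
        {
            int chunkIndex = (int)(Position / _chunkSize);
            int chunkOffset = (int)(Position % _chunkSize);

            // Limited by the request, the end of this chunk, and the end of the data.
            int n = (int)Math.Min(Math.Min(count, _chunkSize - chunkOffset), Length - Position);
            Buffer.BlockCopy(_chunks[chunkIndex], chunkOffset, destination, offset, n);
            offset += n;
            count -= n;
            Position += n;
            total += n;
        }
        return total;
    }
}
```

The index math (Position / chunk size for the chunk, Position % chunk size for the offset within it) is the whole trick; everything else is bookkeeping.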
BcMemoryStreamUse is a small IDisposable class that uses BcMemoryMgr to allocate a MemoryStream in its constructor and free it in its Dispose method. Stream recycling is fun and easy!
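The pattern looks roughly like this. This is a sketch under assumptions, not the library's class: PooledStreamUse and its Value property are hypothetical, and the pool here is a bare, uncapped ConcurrentQueue to keep the example self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Hypothetical illustration of an IDisposable wrapper that recycles a MemoryStream.
public sealed class PooledStreamUse : IDisposable
{
    // Stand-in for the memory manager's stream queue (no size cap in this sketch).
    private static readonly ConcurrentQueue<MemoryStream> s_pool = new ConcurrentQueue<MemoryStream>();

    // The stream to work with; null once this wrapper has been disposed.
    public MemoryStream Value { get; private set; }

    // Take a stream from the pool if one is available, otherwise create one.
    public PooledStreamUse()
    {
        Value = s_pool.TryDequeue(out MemoryStream recycled) ? recycled : new MemoryStream();
    }

    // Empty the stream and put it back in the pool. Safe to call more than once.
    // The wrapped MemoryStream itself must not be disposed by the caller.
    public void Dispose()
    {
        MemoryStream stream = Value;
        if (stream == null)
            return;

        Value = null;
        stream.SetLength(0);
        s_pool.Enqueue(stream);
    }
}
```

Usage is the familiar using block: using (var use = new PooledStreamUse()) { use.Value.Write(...); } and the stream goes back to the pool when the block exits.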
BcMemoryBufferUser is similar to BcMemoryStreamUse, just for byte arrays. Buffer recycling is fun and easy!
BcMemoryTest puts BcMemoryStream and MemoryStream up to a test, hashing all files in a directory and its subdirectories. For each file, it CopyTo's the file's contents into either a BcMemoryStream or a MemoryStream, then does the hashing. In our tests, we used a leafy and varied test directory with lots of built binaries. The total run time with BcMemoryStream was about 25% faster than with MemoryStream. So not only did we solve the LOH problem, we got a better end result. Hooray!
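For the curious, a test along these lines can be sketched as follows. This is not the attached test program: DirectoryHasher is a hypothetical class, opening each file with File.OpenRead and hashing with SHA-256 are assumptions, and the createBuffer delegate is just a way to swap in whichever Stream type you want to compare.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical illustration of a copy-then-hash test over a directory tree.
public static class DirectoryHasher
{
    // Copy one file into a buffer stream, then hash the buffered bytes.
    public static string HashFile(string path, Func<Stream> createBuffer)
    {
        using (FileStream file = File.OpenRead(path))
        using (Stream buffer = createBuffer())
        using (SHA256 sha = SHA256.Create()) // assumed algorithm
        {
            file.CopyTo(buffer);
            buffer.Position = 0; // rewind before hashing
            return BitConverter.ToString(sha.ComputeHash(buffer));
        }
    }

    // Hash every file under 'root', including subdirectories; returns the file count.
    public static int HashDirectory(string root, Func<Stream> createBuffer)
    {
        int count = 0;
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            HashFile(path, createBuffer);
            count++;
        }
        return count;
    }
}
```

To compare stream types, wrap HashDirectory(root, () => new MemoryStream()) in a Stopwatch, then run it again with a factory for the other stream class.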
Finally, there's a little config file addition that is absolutely necessary for server applications:
This gives you background garbage collection, a heap per processor, etc. A real lifesaver.
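The setting itself isn't shown here. A likely candidate, assuming this refers to the standard server garbage collection switch in a .NET Framework app.config or web.config, is the gcServer runtime element:

```xml
<configuration>
  <runtime>
    <!-- Assumed setting: server GC, which uses a heap per logical processor. -->
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```

Background (concurrent) collection is on by default, so gcServer is usually the only line you need to add. Treat this as a sketch and check it against your runtime version.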
Hope this helps!
Michael Balloni is a manager of software development at a cybersecurity software and services provider.
Check out https://www.michaelballoni.com for all the programming fun he's done over the years.
He has been developing software since 1994, back when Mosaic was the web browser of choice. IE 4.0 changed the world, and Michael rode that wave for five years at a .com that was a cloud storage system before the term "cloud" meant anything. He moved on to a medical imaging gig for seven years, working up and down the architecture of a million-line C++ system.
Michael has been at his current cybersecurity gig since then, making his way into management. He still loves to code, so he sneaks in as much as he can at work and at home.