Using memory mapped files to conserve physical memory for large arrays

12 Nov 2009 | LGPL3 | 4 min read
The article shows how to implement a value type array as a memory mapped file to conserve physical memory.

Introduction

I work with large data structures that are kept in memory for speed, and over the years I have run into out-of-memory exceptions numerous times, especially on 32-bit systems. With 64-bit operating systems rapidly becoming the standard, we can now get a fast disk-backed version of these structures by using memory mapped files for storage.

By large, I mean structures of 800 MB or more on 32-bit systems and, on 64-bit systems, structures approaching the amount of physical memory available. When you get close to those limits, .NET is bound to throw an out-of-memory exception, which in most cases will break your application at an unpredictable place.

Background

I have long thought about creating a disk-based version of an array to store my data, but this would require a lot of caching logic to make it fast enough compared to physical memory.

A couple of years ago, I stumbled across memory mapped files, which have long existed in operating systems and are built on the same virtual memory paging mechanism that Windows uses for its swap file. With even more data in my current project and running on a 64-bit platform, the time seemed right to wrap this into a small library.

The last time I used a library from MetalWrench, but this time around, I got hold of Winterdom's much nicer implementation of the Win32 API. I've included the patch from Steve Simpson, but removed the dynamic paging since it slows things down and it's not necessary on 64 bit systems. (If you want to use arrays which hold over 2GB of data on 32 bit systems, I recommend reverting to Steve's original version and setting a view size of 200-500MB.)

The beauty of 64-bit is that you have virtually unlimited address space, so each thread can get its own view of the mapped file without exhausting it. A 32-bit Windows process can address at most 4 GB, and by default only 2 GB of that is available to user-mode code.

As for performance, my theory is that Microsoft has implemented a fairly good caching algorithm for its swap file, so it should prove good enough for me. A few tests show much better disk I/O with the memory mapped API than with .NET's file I/O library. I haven't tested performance with the SEC_LARGE_PAGES flag, but it might help some.

Requirements

The project is made in Visual Studio 2008 compiled against .NET 3.5, but you should be able to take the files and create projects in VS2005 as well. It should mostly be .NET 2.0 compatible. If I get enough requests, I'll create a VS2005 version.

The class accepts only value types. The reason is that the data is stored as bytes, and to keep track of the offset, TValue needs to be serialized to a defined size.
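Because every element occupies the same number of bytes, finding element i in the file is a single multiplication. The following is a minimal sketch of that arithmetic; `FixedSizeLayout` and `OffsetOf` are hypothetical names used for illustration and are not part of the library.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the library: shows why a fixed element
// size makes the byte offset of any index trivial to compute.
public static class FixedSizeLayout
{
    // Byte offset of element 'index' when TValue instances are stored back to back.
    public static long OffsetOf<TValue>(long index) where TValue : struct
    {
        int dataSize = Marshal.SizeOf(typeof(TValue)); // marshaled size in bytes
        return index * dataSize;
    }
}
```

A reference type has no such fixed marshaled size, which is why a separate table of offsets would be needed to support it.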

Implementation

The initial signature of my class looked like this:

C#
public class GenericMemoryMappedArray<TValue> : IDisposable, IEnumerable<TValue>
{
}

The problem I then encountered was that a user could create an array of any class, including ones that don't serialize to a fixed size. Supporting that would require a key file that stores the offset of every value in the value file. That will be left for my next project: implementing a memory mapped Dictionary.

So I ended up with:

C#
public class GenericMemoryMappedArray<TValue> : IDisposable, IEnumerable<TValue>
where TValue : struct
{
}

which restricts usage to value types (structs, including the built-in primitives). IDisposable is implemented in order to free the MMF, delete the backing file, and release the unmanaged memory allocated for the working buffers. The buffers are the size of TValue, which is calculated in the constructor. The constructor takes the length of the array and the directory in which to store the MMF as parameters.

C#
/// <summary>
/// Create a new memory mapped array on disk
/// </summary>
/// <param name="size">The length of the array to allocate</param>
/// <param name="path">The directory where the memory mapped file
///          is to be stored</param>
public GenericMemoryMappedArray(long size, string path)
{
    ...
    // Get the size of TValue
    _dataSize = Marshal.SizeOf(typeof(TValue));

    // Allocate a global buffer for this instance
    _buffer = new byte[_dataSize];

    // Allocate a global unmanaged buffer for this instance
    _memPtr = Marshal.AllocHGlobal(_dataSize);
    ...
}
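The constructor excerpt above allocates one managed and one unmanaged buffer of `_dataSize` bytes. One plausible way such a pair of buffers can turn a struct into bytes and back is sketched below using the standard `Marshal` calls. The class `ValueSerializer` and its method names are assumptions made for illustration; the elided parts of the real constructor may work differently.

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch only: one plausible use of a managed/unmanaged buffer pair.
// The class and method names are assumptions, not the library's API.
public sealed class ValueSerializer<TValue> : IDisposable where TValue : struct
{
    private readonly int _dataSize = Marshal.SizeOf(typeof(TValue));
    private readonly byte[] _buffer;
    private IntPtr _memPtr;

    public ValueSerializer()
    {
        _buffer = new byte[_dataSize];
        _memPtr = Marshal.AllocHGlobal(_dataSize);
    }

    // Copy the struct into unmanaged memory, then into the managed byte buffer.
    public byte[] ToBytes(TValue value)
    {
        Marshal.StructureToPtr(value, _memPtr, false);
        Marshal.Copy(_memPtr, _buffer, 0, _dataSize);
        return _buffer; // shared buffer: overwritten by the next call
    }

    // Reverse direction: bytes into unmanaged memory, then back into a struct.
    public TValue FromBytes(byte[] bytes)
    {
        Marshal.Copy(bytes, 0, _memPtr, _dataSize);
        return (TValue)Marshal.PtrToStructure(_memPtr, typeof(TValue));
    }

    // Release the unmanaged buffer exactly once.
    public void Dispose()
    {
        if (_memPtr != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memPtr);
            _memPtr = IntPtr.Zero;
        }
    }
}
```

Reusing the two buffers avoids an allocation per element, but it also means an instance of this sketch must not be shared between threads without locking.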

The generic class is also thread safe, so more than one thread can access the array at the same time. Since each thread has to set the position first and then read or write, each thread gets its own view of the file, and I keep a pool of these views keyed by thread ID. .NET reuses its internal threads, so the pool will not grow very large. Even if it did, it's not a problem on 64-bit due to the large address space available. A timer runs every hour to clean up the views of threads that are no longer in use.

Here's an example of the Write method which gets the current thread ID, then gets a view of the MMF for that thread, and finally writes the data to the MMF.

C#
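public void Write(byte[] buffer)
{
    // Identify the calling thread and record when it last used the array
    int threadId = Thread.CurrentThread.ManagedThreadId;
    _lastUsedThread[threadId] = DateTime.UtcNow;

    // Each thread writes through its own view of the MMF
    Stream s = GetView(threadId);
    s.Write(buffer, 0, buffer.Length);
}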
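The per-thread pool and its periodic cleanup can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `ThreadViewPool`, `GetView`, and `RemoveIdle` are hypothetical names, and the `createView` delegate stands in for whatever call actually maps a new view of the file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Sketch of a per-thread view pool. Names are hypothetical; the factory
// delegate stands in for the call that maps a new view of the file.
public sealed class ThreadViewPool
{
    private readonly object _sync = new object();
    private readonly Dictionary<int, Stream> _views = new Dictionary<int, Stream>();
    private readonly Dictionary<int, DateTime> _lastUsed = new Dictionary<int, DateTime>();
    private readonly Func<Stream> _createView;

    public ThreadViewPool(Func<Stream> createView)
    {
        _createView = createView;
    }

    public int Count
    {
        get { lock (_sync) { return _views.Count; } }
    }

    // Return the calling thread's view, creating it on first use.
    public Stream GetView()
    {
        int threadId = Thread.CurrentThread.ManagedThreadId;
        lock (_sync)
        {
            Stream view;
            if (!_views.TryGetValue(threadId, out view))
            {
                view = _createView();
                _views[threadId] = view;
            }
            _lastUsed[threadId] = DateTime.UtcNow;
            return view;
        }
    }

    // Dispose views that have not been used within maxIdle and return how
    // many were removed. A timer would call this periodically.
    public int RemoveIdle(TimeSpan maxIdle)
    {
        DateTime cutoff = DateTime.UtcNow - maxIdle;
        lock (_sync)
        {
            var stale = new List<int>();
            foreach (KeyValuePair<int, DateTime> entry in _lastUsed)
            {
                if (entry.Value <= cutoff) stale.Add(entry.Key);
            }
            foreach (int id in stale)
            {
                _views[id].Dispose();
                _views.Remove(id);
                _lastUsed.Remove(id);
            }
            return stale.Count;
        }
    }
}
```

Because every thread has its own stream, setting the position and then reading or writing cannot be interleaved with another thread's position change; the lock only protects the two dictionaries.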

The class supports automatically growing the array through the AutoGrow property, which defaults to true. This is useful when you want to add more data as you go along.
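The rule behind such a property can be illustrated with a small in-memory stand-in. This is a hypothetical sketch: `GrowableArray` is not the library's class, an ordinary `int[]` replaces the mapped file, and growing to exactly `index + 1` elements is an assumption about the growth step.

```csharp
using System;

// Hypothetical illustration of the AutoGrow rule. An in-memory int array
// stands in for the mapped file; growing to exactly index + 1 is an assumption.
public sealed class GrowableArray
{
    private int[] _items;

    public GrowableArray(int size)
    {
        _items = new int[size];
        AutoGrow = true; // growing is on by default
    }

    public bool AutoGrow { get; set; }

    public int Length
    {
        get { return _items.Length; }
    }

    public int this[int index]
    {
        get { return _items[index]; }
        set
        {
            if (index >= _items.Length)
            {
                if (!AutoGrow)
                {
                    throw new IndexOutOfRangeException("Index is outside the array.");
                }
                // Grow just enough to make the index valid.
                Array.Resize(ref _items, index + 1);
            }
            _items[index] = value;
        }
    }
}
```

With `AutoGrow` set to false, a write past the end throws; with it set to true, the same write enlarges the storage first.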

Using the Code

C#
string path = AppDomain.CurrentDomain.BaseDirectory;
var myList = new GenericMemoryMappedArray<int>(1024*1024, path);
using (myList) // automatically dispose the mmf when done
{
    myList.AutoGrow = false;
    try
    {
        myList[1024 * 1024] = 1; // index is past the end, so this throws
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
    myList.AutoGrow = true;
    myList[0] = 1;
    myList[1024 * 1024] = 1; // will now increase the file
}

Conclusion

For my needs, memory mapped files have proven to be a good trade-off between speed and use of physical memory. Of course, physical memory is still used for caching behind the scenes, but having the OS work its magic is much better than getting out-of-memory exceptions from .NET. I've tried both sequential and random reads and writes, and it works pretty well. Be sure not to resize the array too often, as unmapping will flush the underlying pages. This will hurt performance if the structure is in constant use.
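When the final size is not known up front, one common way to keep the number of resizes low is to grow geometrically, so that n appended elements cause only about log2(n) remaps. The helper below is a sketch of that policy; `GrowthPolicy.NextCapacity` is a hypothetical name and this is not how the library itself chooses its new size.

```csharp
using System;

// Hypothetical growth policy, not taken from the library: double the
// capacity until it covers the required length.
public static class GrowthPolicy
{
    // Smallest capacity >= required that is reachable by doubling 'current'.
    public static long NextCapacity(long current, long required)
    {
        long capacity = current < 1 ? 1 : current;
        while (capacity < required)
        {
            capacity = checked(capacity * 2); // throw rather than overflow silently
        }
        return capacity;
    }
}
```

A caller could use the result to allocate the array at a larger size than immediately needed, trading some disk space for fewer unmap and remap cycles.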

The code can also be modified to keep the temp files for permanent storage rather than deleting them when the class is disposed.

License

This article, along with any associated source code and files, is licensed under The GNU Lesser General Public License (LGPLv3).


Written By
Other Comperio
Norway
I like to work with diverse technologies but spend most of my time doing .NET in various settings.

I code for fun!

Comments and Discussions

 
Can't work on Windows 2003 server
chinkuanyeh, 21-Jan-10 14:55
Hi Mikael,

MemoryMappedFile doesn't work on Windows 2003 Server due to a security problem. Do you know how to solve it? Thank you.

Matthew