
Using memory mapped files to conserve physical memory for large arrays

Mikael Svenson, 12 Nov 2009
The article shows how to implement a value type array as a memory mapped file to conserve physical memory.

Introduction

I work with large data structures that are kept in memory for speed, and over the years I have run into out of memory exceptions numerous times, especially on 32 bit systems. With 64 bit operating systems rapidly becoming the standard, we can now get a fast disk backed version of these structures by using Memory Mapped Files for storage.

By large structures I mean 800MB or more on 32 bit and, on 64 bit, as much physical memory as you have available. When you get close to those limits, .NET is bound to throw an out of memory exception, which in most cases will break your application in unpredictable places.

Background

I have long thought about creating a disk based version of an array to store my data, but this would require a lot of caching logic to make it perform fast enough compared to physical memory.

A couple of years ago, I stumbled across Memory Mapped Files, which have long existed in operating systems and are served by the same virtual memory manager that Windows uses for its swap space (the page file). With even more data in my current project and running on a 64 bit platform, the time seemed right to wrap this into a small library.

Last time, I used a library from MetalWrench, but this time around I got hold of Winterdom's much nicer wrapper around the Win32 API. I've included the patch from Steve Simpson, but removed the dynamic paging since it slows things down and is not necessary on 64 bit systems. (If you want to use arrays that hold over 2GB of data on 32 bit systems, I recommend reverting to Steve's original version and setting a view size of 200-500MB.)

The beauty of 64 bit is that you have virtually unlimited address space, so each thread can get its own view of the mapped file without running out of address space. A 32 bit Windows process can address at most 4GB, and by default only 2GB of that is available to user code.

As for performance, my theory is that Microsoft has implemented a fairly good caching algorithm for its swap file, so it should prove good enough for me. A few tests showed much better disk IO performance with the Memory Mapped API than with .NET's file IO library. I haven't tested the performance with the SEC_LARGE_PAGES flag, but it might help some.

Requirements

The project is made in Visual Studio 2008 compiled against .NET 3.5, but you should be able to take the files and create projects in VS2005 as well. It should mostly be .NET 2.0 compatible. If I get enough requests, I'll create a VS2005 version.

The class accepts only value types. The reason is that the data is stored as bytes, and to calculate the offset of each element, every TValue needs to serialize to the same fixed size.
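To illustrate why a fixed size matters, here is a minimal sketch of the offset arithmetic. It is not taken from the library: Point2 and OffsetSketch are hypothetical names, and it assumes elements are laid out back to back from the start of the file.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical element type, used only for this illustration.
[StructLayout(LayoutKind.Sequential)]
public struct Point2
{
    public int X;
    public int Y;
}

public static class OffsetSketch
{
    // Byte offset of element 'index', assuming elements are stored
    // contiguously and each occupies Marshal.SizeOf(TValue) bytes.
    public static long OffsetOf<TValue>(long index) where TValue : struct
    {
        return index * Marshal.SizeOf(typeof(TValue));
    }
}
```

With a reference type there is no such constant size, so the position of element N could not be computed without a separate index of offsets.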

Implementation

The initial signature of my class looked like this:

public class GenericMemoryMappedArray<TValue> : IDisposable, IEnumerable<TValue>
{
}

The problem I then encountered was that a user could create an array of any class, including ones that won't serialize to a fixed size. To allow this, you would need a key file to store the offsets of all the entries in the value file as well. That will be left for my next project, implementing a memory mapped Dictionary.

So I ended up with:

public class GenericMemoryMappedArray<TValue> : IDisposable, IEnumerable<TValue>
where TValue : struct
{
}

which restricts the usage to structs and other value types. IDisposable is implemented in order to free the MMF and delete the file used, and also to release the unmanaged memory allocated for the working buffers. The buffers are the size of one TValue, which is calculated in the constructor. The constructor takes the size of the array and the directory where the MMF is stored as parameters.

/// <summary>
/// Create a new memory mapped array on disk
/// </summary>
/// <param name="size">The length of the array to allocate</param>
/// <param name="path">The directory where the memory mapped file
///          is to be stored</param>
public GenericMemoryMappedArray(long size, string path)
{
    ...
    // Get the size of TValue
    _dataSize = Marshal.SizeOf(typeof(TValue));

    // Allocate a global buffer for this instance
    _buffer = new byte[_dataSize];

    // Allocate a global unmanaged buffer for this instance
    _memPtr = Marshal.AllocHGlobal(_dataSize);
    ...
}
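To show what the two buffers are for, here is a minimal sketch of converting a struct to bytes and back through an unmanaged buffer. StructBytes and Pair are hypothetical names, and unlike the class above, which allocates its buffers once per instance, this sketch allocates per call to stay self-contained.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical element type, used only for this illustration.
[StructLayout(LayoutKind.Sequential)]
public struct Pair
{
    public int A;
    public int B;
}

public static class StructBytes
{
    // Copy a struct into unmanaged memory, then into a managed byte array.
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] buffer = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(value, ptr, false);
            Marshal.Copy(ptr, buffer, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return buffer;
    }

    // Reverse direction: bytes into unmanaged memory, then back to a struct.
    public static T FromBytes<T>(byte[] buffer) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(buffer, 0, ptr, size);
            return (T)Marshal.PtrToStructure(ptr, typeof(T));
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```

Allocating the buffers once in the constructor, as the class does, avoids paying for AllocHGlobal on every read and write.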

The generic class is also thread safe, so more than one thread can access the array at the same time. Since each thread has to set the position first and then read or write, each thread gets its own view of the file, and I keep a pool of these views keyed by thread ID. .NET reuses its internal threads, so the pool will not grow very large. Even if it did, it's not a problem on 64 bit due to the large address space available. A timer runs every hour to clean up the entries of threads that are no longer in use.

Here's an example of the Write method which gets the current thread ID, then gets a view of the MMF for that thread, and finally writes the data to the MMF.

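public void Write(byte[] buffer)
{
   // Identify the calling thread and record when it last used the array
   int threadId = Thread.CurrentThread.ManagedThreadId;
   _lastUsedThread[threadId] = DateTime.UtcNow;
   // Each thread writes through its own view of the MMF
   Stream s = GetView(threadId);
   s.Write(buffer, 0, buffer.Length);
}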
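The pooling idea can be sketched roughly as follows. This is not the library's code: PerThreadViews, the openView factory and RemoveIdle are hypothetical names, and a plain lock is assumed for synchronization.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Hypothetical pool handing each thread its own Stream so that
// seek-then-read/write sequences from different threads cannot interleave.
public sealed class PerThreadViews : IDisposable
{
    private readonly object _sync = new object();
    private readonly Dictionary<int, Stream> _views = new Dictionary<int, Stream>();
    private readonly Dictionary<int, DateTime> _lastUsed = new Dictionary<int, DateTime>();
    private readonly Func<Stream> _openView; // assumed factory that opens a new view

    public PerThreadViews(Func<Stream> openView)
    {
        _openView = openView;
    }

    public int Count
    {
        get { lock (_sync) { return _views.Count; } }
    }

    // Return the calling thread's view, creating it on first use.
    public Stream GetView()
    {
        int threadId = Thread.CurrentThread.ManagedThreadId;
        lock (_sync)
        {
            Stream view;
            if (!_views.TryGetValue(threadId, out view))
            {
                view = _openView();
                _views[threadId] = view;
            }
            _lastUsed[threadId] = DateTime.UtcNow;
            return view;
        }
    }

    // Dispose views not used within maxIdle; intended to be called from a timer.
    public int RemoveIdle(TimeSpan maxIdle)
    {
        lock (_sync)
        {
            DateTime cutoff = DateTime.UtcNow - maxIdle;
            List<int> stale = new List<int>();
            foreach (KeyValuePair<int, DateTime> pair in _lastUsed)
            {
                if (pair.Value < cutoff) stale.Add(pair.Key);
            }
            foreach (int id in stale)
            {
                _views[id].Dispose();
                _views.Remove(id);
                _lastUsed.Remove(id);
            }
            return stale.Count;
        }
    }

    public void Dispose()
    {
        lock (_sync)
        {
            foreach (Stream view in _views.Values) view.Dispose();
            _views.Clear();
            _lastUsed.Clear();
        }
    }
}
```

In the real class the factory would map a new view of the MMF; for experimenting, something like `new PerThreadViews(() => new MemoryStream())` is enough to see the behaviour.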

The class supports automatically growing the array, controlled by the AutoGrow property, which defaults to true. This is useful for adding more data as you go along.
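The behaviour can be pictured with a small in-memory stand-in. AutoGrowSketch is a hypothetical class, not the library's implementation, and growing to exactly index + 1 is an assumption made for the sketch; the real class resizes the backing file instead.

```csharp
using System;

// Hypothetical in-memory stand-in that mimics the AutoGrow semantics only.
public sealed class AutoGrowSketch
{
    private int[] _items;

    public AutoGrowSketch(int size)
    {
        _items = new int[size];
        AutoGrow = true; // same default as described above
    }

    public bool AutoGrow { get; set; }

    public int Length
    {
        get { return _items.Length; }
    }

    public int this[int index]
    {
        get
        {
            // Reads never grow the array.
            if (index < 0 || index >= _items.Length)
                throw new IndexOutOfRangeException("Tried to access item outside the array");
            return _items[index];
        }
        set
        {
            if (index < 0)
                throw new IndexOutOfRangeException("Tried to access item outside the array");
            if (index >= _items.Length)
            {
                if (!AutoGrow)
                    throw new IndexOutOfRangeException("Tried to access item outside the array");
                // Assumed growth policy: just large enough to hold the index.
                Array.Resize(ref _items, index + 1);
            }
            _items[index] = value;
        }
    }
}
```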

Using the Code

string path = AppDomain.CurrentDomain.BaseDirectory;
var myList = new GenericMemoryMappedArray<int>(1024 * 1024, path);
using (myList) // automatically dispose the MMF when done
{
    myList.AutoGrow = false;
    try
    {
        // Index 1024 * 1024 is one past the end, so this throws
        myList[1024 * 1024] = 1;
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
    myList.AutoGrow = true;
    myList[0] = 1;
    myList[1024 * 1024] = 1; // will now increase the file
}
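For readers on .NET 4.0 or later, the framework ships its own wrapper in System.IO.MemoryMappedFiles. The sketch below uses that built-in API rather than the library from this article (which targets .NET 3.5); BuiltInMmfSketch and WriteThenRead are hypothetical names, and the file is deleted afterwards to mirror the dispose behaviour described above.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

public static class BuiltInMmfSketch
{
    // Map a file large enough for 'length' ints, write one value, read it back.
    public static int WriteThenRead(string filePath, long length, long index, int value)
    {
        long capacity = length * sizeof(int);
        try
        {
            using (MemoryMappedFile mmf =
                MemoryMappedFile.CreateFromFile(filePath, FileMode.Create, null, capacity))
            using (MemoryMappedViewAccessor view = mmf.CreateViewAccessor())
            {
                long offset = index * sizeof(int); // fixed-size elements
                view.Write(offset, value);
                return view.ReadInt32(offset);
            }
        }
        finally
        {
            // Remove the backing file once the mapping has been disposed.
            File.Delete(filePath);
        }
    }
}
```

A call such as `BuiltInMmfSketch.WriteThenRead(Path.GetTempFileName(), 1024, 10, 42)` maps a 4KB file and round-trips a single int through it.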

Conclusion

For my needs, memory mapped files have proven to be a good trade-off between speed and usage of physical memory. Of course, physical memory is used for caching in the back end, but having the OS work its magic is much better than getting out of memory exceptions from .NET. I've tried both sequential and random reads/writes, and it works pretty well. Be sure not to resize the array too often, as unmapping will flush the underlying pages. This will have an impact on performance if the structure is constantly in use.

The code can also be modified to keep the temp files it uses as permanent storage, rather than deleting them when the class is disposed.

License

This article, along with any associated source code and files, is licensed under The GNU Lesser General Public License (LGPLv3)

About the Author

Mikael Svenson
Comperio
Norway
I like to work with diverse technologies but spend most of my time doing .Net in various settings.
 
I code for fun!

Comments and Discussions

 
Question: How read value from my GenericMemoryMappedArray ? (Mehdi Bugnard, 25-Apr-13 4:54)

Hello. To start thank you for your fabulous project! I just do not understand how to read the values stored in my array..

How is it possible to read the value stored in my table [GenericMemoryMappedArray] ?

var myList = new GenericMemoryMappedArray<int>(2048L*1024L*2048L, path);

using (myList)
{
    myList.AutoGrow = false;

    myList[1939848234] = 1;

    if (myList[2] != null)
    {
        //does not work ?? ..
        Console.WriteLine(myList[2]);
    }
    if (myList[1939848234] != null)
    {
        //does not work ?? ..
        Console.WriteLine(myList[1939848234]);
    }
}

Thanks a lot ^^

Last Updated 12 Nov 2009
Article Copyright 2008 by Mikael Svenson