Using memory mapped files to conserve physical memory for large arrays

By Mikael Svenson

The article shows how to implement a value type array as a memory mapped file to conserve physical memory.
C# (C# 2.0, C# 3.0), .NET (.NET 2.0, .NET 3.5), Architect, Dev
Posted: 22 May 2008

Introduction

I work with large data structures that are kept in memory for speed, and over the years I have run into out-of-memory exceptions numerous times, especially on 32-bit systems. With 64-bit operating systems rapidly becoming the standard, we can now get a fast disk-backed version of these structures by using memory mapped files for storage.

Large, in this context, means 800 MB or more on 32-bit systems and, on 64-bit systems, as much physical memory as you have available. As you approach those limits, .NET is bound to throw an out-of-memory exception, which in most cases breaks your application at an unpredictable place.

Background

I have long thought about creating a disk based version of an array to store my data, but this would require a lot of caching logic to make it perform fast enough compared to physical memory.

A couple of years ago I stumbled across memory mapped files, which have long existed in operating systems and rely on the same paging mechanism that Windows uses for its swap space. With even more data in my current project, and running on a 64-bit platform, the time seemed right to wrap this into a small library.

Last time I used a library from MetalWrench, but this time around I got hold of Winterdom's much nicer wrapper around the Win32 API. I've included the patch from Steve Simpson, but removed the dynamic paging since it slows things down and is not necessary on 64-bit systems. (If you want to use arrays that hold more than 2 GB of data on 32-bit systems, I recommend reverting to Steve's original version and setting a view size of 200-500 MB.)

The beauty of 64-bit is that you have virtually unlimited address space, so each thread can get its own view of the mapped file without running out of address space. A 32-bit Windows process can address at most 4 GB.

As for performance, my theory is that Microsoft has implemented a fairly good caching algorithm for its swap file, so it should prove good enough for me. A few tests showed much better disk I/O with the memory mapped API than with .NET's file I/O library. I haven't tested the performance with the SEC_LARGE_PAGES flag. It might help some, although as far as I know that flag applies only to mappings backed by the system paging file, not to mappings backed by a data file.
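For readers on .NET 4.0 or later, the framework ships its own wrapper in System.IO.MemoryMappedFiles, which did not exist when this library was written. The following is only a minimal sketch of the mapped file and view idea discussed above, using that newer API rather than the Winterdom wrapper from this article. The class and method names (MappedFileSketch, WriteThenSum) are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles; // available from .NET 4.0 onwards, not in .NET 3.5

public static class MappedFileSketch
{
    // Writes the values 0..count-1 through one view of a mapped file,
    // then reads them back through a second, independent view and sums them.
    public static long WriteThenSum(string filePath, int count)
    {
        long capacity = (long)count * sizeof(int);

        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(
                   filePath, FileMode.Create, null, capacity))
        {
            using (MemoryMappedViewAccessor writer = mmf.CreateViewAccessor())
            {
                for (int i = 0; i < count; i++)
                {
                    // The byte offset of element i is i * sizeof(int)
                    writer.Write((long)i * sizeof(int), i);
                }
            }

            long sum = 0;
            using (MemoryMappedViewAccessor reader = mmf.CreateViewAccessor())
            {
                for (int i = 0; i < count; i++)
                {
                    sum += reader.ReadInt32((long)i * sizeof(int));
                }
            }
            return sum;
        }
    }

    public static void Main()
    {
        string filePath = Path.Combine(
            Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".mmf");
        try
        {
            Console.WriteLine(WriteThenSum(filePath, 1000));
        }
        finally
        {
            File.Delete(filePath); // the mapping is already disposed here
        }
    }
}
```

Each call to CreateViewAccessor produces a separate view onto the same file, which is the same idea as giving each thread its own view.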

Requirements

The project was made in Visual Studio 2008 and compiled against .NET 3.5, but you should be able to take the files and create projects in VS2005 as well. It should be mostly .NET 2.0 compatible. If I get enough requests I'll create a VS2005 version.

The class accepts only value types. The data is stored as raw bytes, and to compute the offset of each element, TValue must serialize to a fixed, known size.
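To make the fixed-size requirement concrete, here is a small sketch of how an element's byte offset follows from its marshaled size. The Sample struct and the OffsetSketch helper are hypothetical and not part of the library. Note that a struct can still contain reference-type fields such as string, so a value type on its own does not guarantee a meaningful fixed-size byte image.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sample type. Sequential layout with 1-byte packing
// gives a predictable size without padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Sample
{
    public int Id;       // 4 bytes
    public double Value; // 8 bytes
}

public static class OffsetSketch
{
    // Size in bytes of one element, as the marshaler sees it
    public static int SizeOf<T>() where T : struct
    {
        return Marshal.SizeOf(typeof(T));
    }

    // Byte position of element 'index' in a file of tightly packed elements
    public static long ByteOffset<T>(long index) where T : struct
    {
        return index * SizeOf<T>();
    }

    public static void Main()
    {
        Console.WriteLine(SizeOf<int>());           // 4
        Console.WriteLine(SizeOf<Sample>());        // 12
        Console.WriteLine(ByteOffset<Sample>(10));  // 120
    }
}
```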

Implementation

The initial signature of my class looked like this:

public class GenericMemoryMappedArray<TValue> : IDisposable, IEnumerable<TValue>
{
}

The problem I then encountered was that a user could create an array of any class, and an arbitrary class will not serialize to a fixed size. Supporting that would require a key file that stores the offset of every value in the value file. That is left for my next project: a memory mapped Dictionary.

So I ended up with:

public class GenericMemoryMappedArray<TValue> : IDisposable, IEnumerable<TValue>
where TValue : struct
{
}

which restricts usage to structs and other value types. IDisposable is implemented in order to release the memory mapped file (mmf), delete the backing file, and free the unmanaged memory allocated for working buffers. The buffers are the size of one TValue, which is calculated in the constructor. The constructor takes the size of the array and the directory in which to store the mmf as parameters.

/// <summary>
/// Create a new memory mapped array on disk
/// </summary>
/// <param name="size">The length of the array to allocate</param>
/// <param name="path">The directory where the memory mapped file is to be stored</param>
public GenericMemoryMappedArray(long size, string path)
{
    ...
    // Get the size of TValue
    _dataSize = Marshal.SizeOf(typeof(TValue));

    // Allocate a global buffer for this instance
    _buffer = new byte[_dataSize];

    // Allocate a global unmanaged buffer for this instance
    _memPtr = Marshal.AllocHGlobal(_dataSize);
    ...
}
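The two buffers allocated in the constructor suggest a conversion between TValue and bytes via unmanaged memory. The following is a sketch of one way such a round trip can be done with the Marshal class. It is not taken from the library source, the names StructBytes, ToBytes, FromBytes and Point are invented, and unlike the constructor above it allocates the unmanaged block on every call instead of reusing one.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical element type used only for this illustration
public struct Point
{
    public int X;
    public int Y;
}

public static class StructBytes
{
    // Copies a value type into a byte array via an unmanaged scratch block
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] buffer = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(value, ptr, false);
            Marshal.Copy(ptr, buffer, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return buffer;
    }

    // Rebuilds a value type from the bytes produced by ToBytes
    public static T FromBytes<T>(byte[] buffer) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(buffer, 0, ptr, size);
            return (T)Marshal.PtrToStructure(ptr, typeof(T));
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }

    public static void Main()
    {
        Point original = new Point();
        original.X = 3;
        original.Y = -7;

        byte[] bytes = ToBytes(original);
        Point copy = FromBytes<Point>(bytes);
        Console.WriteLine(bytes.Length + " bytes, X=" + copy.X + ", Y=" + copy.Y);
    }
}
```

Reusing a single buffer and a single unmanaged block per instance, as the constructor does, avoids the allocation cost on every read and write.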

The generic class is also thread safe, so more than one thread can access the array at the same time. Since each thread has to set the position first and then read or write, threads sharing a view would overwrite each other's position, so I keep a pool of views, one per thread. .NET reuses its internal threads, so the pool will not grow very large. Even if it did, that is not a problem on 64-bit systems because of the large address space available. A timer runs every hour to clean up the views of threads that are no longer in use.

Here's an example of the Write method which gets the current thread id, then gets a view of the mmf for that thread, and finally writes the data to the mmf.

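public void Write(byte[] buffer)
{
    // Identify the calling thread so it gets its own view
    int threadId = Thread.CurrentThread.ManagedThreadId;
    // Record the access time; the hourly cleanup uses it to find unused views
    _lastUsedThread[threadId] = DateTime.UtcNow;
    // Fetch this thread's view of the mmf
    Stream s = GetView(threadId);
    // Write the whole buffer at the view's current position
    s.Write(buffer, 0, buffer.Length);
}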
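The pooling and cleanup described above could look roughly like the sketch below. This is an assumption about the general shape, not the library's actual GetView implementation: the ViewPool class, the factory delegate and RemoveIdle are invented names, and a plain lock is used for synchronization.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public class ViewPool : IDisposable
{
    private readonly object _sync = new object();
    private readonly Dictionary<int, Stream> _views = new Dictionary<int, Stream>();
    private readonly Dictionary<int, DateTime> _lastUsed = new Dictionary<int, DateTime>();
    private readonly Func<Stream> _createView; // hypothetical factory that maps a new view

    public ViewPool(Func<Stream> createView)
    {
        _createView = createView;
    }

    public int Count
    {
        get { lock (_sync) { return _views.Count; } }
    }

    // Returns the calling thread's view, creating it on first use
    public Stream GetView()
    {
        int threadId = Thread.CurrentThread.ManagedThreadId;
        lock (_sync)
        {
            Stream view;
            if (!_views.TryGetValue(threadId, out view))
            {
                view = _createView();
                _views[threadId] = view;
            }
            _lastUsed[threadId] = DateTime.UtcNow;
            return view;
        }
    }

    // Disposes views not used within maxIdle and returns how many were removed.
    // A timer could call this periodically.
    public int RemoveIdle(TimeSpan maxIdle)
    {
        DateTime cutoff = DateTime.UtcNow - maxIdle;
        lock (_sync)
        {
            List<int> stale = new List<int>();
            foreach (KeyValuePair<int, DateTime> entry in _lastUsed)
            {
                if (entry.Value < cutoff) stale.Add(entry.Key);
            }
            foreach (int id in stale)
            {
                _views[id].Dispose();
                _views.Remove(id);
                _lastUsed.Remove(id);
            }
            return stale.Count;
        }
    }

    public void Dispose()
    {
        lock (_sync)
        {
            foreach (Stream view in _views.Values) view.Dispose();
            _views.Clear();
            _lastUsed.Clear();
        }
    }

    public static void Main()
    {
        // MemoryStream stands in for a real mapped view in this demo
        using (ViewPool pool = new ViewPool(delegate { return new MemoryStream(); }))
        {
            Stream first = pool.GetView();
            Stream second = pool.GetView();
            Console.WriteLine(ReferenceEquals(first, second)); // same thread, same view

            Thread worker = new Thread(delegate() { pool.GetView(); });
            worker.Start();
            worker.Join();
            Console.WriteLine(pool.Count); // one view per thread that asked
        }
    }
}
```

Because each thread only ever touches its own stream, setting the position and then reading or writing needs no lock. Only the pool bookkeeping is synchronized.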

The class supports automatically growing the array, controlled by the AutoGrow property, which defaults to true. This is useful when you want to add more data as you go along.
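The AutoGrow behaviour can be illustrated with a much simpler stand-in. In the sketch below a plain Stream replaces the memory mapped file, and the GrowableIntStore class and the choice of IndexOutOfRangeException are assumptions made for illustration only. The real class has to unmap, extend the file and remap when it grows.

```csharp
using System;
using System.IO;

public class GrowableIntStore
{
    private const int ElementSize = sizeof(int);
    private readonly Stream _stream; // stands in for the mapped file
    private long _length;            // number of elements, not bytes

    public GrowableIntStore(Stream stream, long length)
    {
        _stream = stream;
        _length = length;
        _stream.SetLength(length * ElementSize);
        AutoGrow = true; // same default as described in the article
    }

    public bool AutoGrow { get; set; }

    public long Length
    {
        get { return _length; }
    }

    public int this[long index]
    {
        get
        {
            if (index < 0 || index >= _length)
                throw new IndexOutOfRangeException();

            byte[] buffer = new byte[ElementSize];
            _stream.Position = index * ElementSize;
            int read = 0;
            while (read < ElementSize)
            {
                int n = _stream.Read(buffer, read, ElementSize - read);
                if (n == 0) throw new EndOfStreamException();
                read += n;
            }
            return BitConverter.ToInt32(buffer, 0);
        }
        set
        {
            if (index < 0)
                throw new IndexOutOfRangeException();

            if (index >= _length)
            {
                if (!AutoGrow)
                    throw new IndexOutOfRangeException("Index is past the end and AutoGrow is off.");

                // Grow just enough to make 'index' the last valid element
                _length = index + 1;
                _stream.SetLength(_length * ElementSize);
            }
            _stream.Position = index * ElementSize;
            _stream.Write(BitConverter.GetBytes(value), 0, ElementSize);
        }
    }

    public static void Main()
    {
        GrowableIntStore store = new GrowableIntStore(new MemoryStream(), 4);
        store[4] = 7; // one past the end, grows to 5 elements
        Console.WriteLine(store.Length + " elements, last = " + store[4]);
    }
}
```

Growing to exactly index + 1 keeps the sketch short. Since each resize of a real mapping is expensive, growing in larger steps is likely the better choice in practice.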

Using the code

string path = AppDomain.CurrentDomain.BaseDirectory;
var myList = new GenericMemoryMappedArray<int>(1024 * 1024, path);
using (myList) // automatically dispose the mmf when done
{
    myList.AutoGrow = false;
    try
    {
        // Index 1024 * 1024 is one past the end, so this throws while AutoGrow is off
        myList[1024 * 1024] = 1;
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
    myList.AutoGrow = true;
    myList[0] = 1;
    myList[1024 * 1024] = 1; // will now increase the file
}

Conclusion

For my needs, memory mapped files have proven to be a good trade-off between speed and use of physical memory. Of course physical memory is still used for caching behind the scenes, but having the OS work its magic is much better than getting out-of-memory exceptions from .NET. I've tried both sequential and random reads and writes, and it works pretty well in my opinion. Be sure not to resize the array too often, as unmapping will flush the underlying pages. This will hurt performance if the structure is in constant use.

The code can also be modified to keep the temp files it uses as permanent storage, rather than deleting them when the class is disposed.

License

This article, along with any associated source code and files, is licensed under the GNU Lesser General Public License.

About the Author

Mikael Svenson


I have worked in the IT industry since 1996, when I left my computer science studies at the University of Oslo to pursue a dream of making computer games at FunCom. I also have my own consulting company, where I do small projects from time to time.

Before joining Bouvet I was the CTO and Chief Architect at IntelliSearch working with enterprise search solutions, and have also done a lot of work with web crawling since 2000.

I like to work with diverse technologies and languages, and I often fall back on Perl and scripting on Unix for solving trivial tasks. There's nothing like a good shell!
Occupation: Other
Company: Bouvet
Location: Norway

Last Updated: 22 May 2008
Copyright 2008 by Mikael Svenson