Unsafe programming in C#

Kumar Gaurav Khanna

May 30, 2002

Discusses the concept of using pointers in C#

Introduction

There's always one particular topic that catches the fancy of most C/C++ programmers, and is considered quite complicated and difficult to understand: pointers!

Yet whenever C# is discussed, most people I have come across are of the opinion (and a pretty strong one, if I may add) that C# has no concept of pointers, that it has done away with them entirely. In fact, unsafe code is the part of C# that is all about programming with pointers. Contrary to what the name suggests, there is nothing inherently unsafe about programming with pointers.

It is called unsafe because, unlike conventional .NET development, it relies on assumptions made by the programmer that the runtime cannot verify. In this article, I shall start off by differentiating two frequently confused terms, unsafe code and unmanaged code. This will be followed by a discussion of how to write unsafe code, that is, how to use pointers in C#.

Unsafe or unmanaged? That is the question

Managed code is code that executes under the supervision of the CLR. The CLR is responsible for various housekeeping tasks, such as:

managing memory for the objects
performing type verification
doing garbage collection

to name just a few. The programmer is isolated from how these tasks are carried out and does not get to manipulate memory directly, because that is done by the CLR.

On the other hand, unmanaged code is code that executes outside the context of the CLR. The best examples are traditional Win32 DLLs such as kernel32.dll and user32.dll, and the COM components installed on our system. How they allocate memory, how they release it, and how type verification (if any) takes place are all tasks that they handle on their own. A typical C++ program that allocates memory to a character pointer is another example of unmanaged code, because you as the programmer are responsible for:

calling the memory allocation function
making sure that the casting is done right
making sure that the memory is released when the work is done

In managed code, all of this housekeeping is done by the CLR, as explained above, relieving the programmer of the burden.
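To make the three responsibilities concrete, here is a minimal sketch that contrasts manual and managed allocation. Note the assumptions: this is still C# running under the CLR, and it only borrows unmanaged memory through the `System.Runtime.InteropServices.Marshal` class to mimic what a C++ program would do. The class and method names (`MemoryDemo`, `RoundTripUnmanaged`, `RoundTripManaged`) are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

class MemoryDemo
{
    // Stores a value in unmanaged memory, reads it back, and frees the memory.
    public static int RoundTripUnmanaged(int value)
    {
        // 1. We call the allocation function ourselves.
        IntPtr block = Marshal.AllocHGlobal(sizeof(int));
        try
        {
            // 2. We must know that this block holds an int; nothing checks it for us.
            Marshal.WriteInt32(block, value);
            return Marshal.ReadInt32(block);
        }
        finally
        {
            // 3. We release the memory ourselves; the garbage collector will not.
            Marshal.FreeHGlobal(block);
        }
    }

    // The managed equivalent: the CLR allocates, type-checks, and later collects the array.
    public static int RoundTripManaged(int value)
    {
        int[] block = new int[1];
        block[0] = value;
        return block[0];
    }

    static void Main()
    {
        Console.WriteLine(RoundTripUnmanaged(42));
        Console.WriteLine(RoundTripManaged(42));
    }
}
```

Both calls print 42. The difference lies in who does the bookkeeping: in the first method, forgetting the `finally` block would leak memory, while in the second there is nothing to forget.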

Unsafe code is a kind of cross between managed and unmanaged code

It executes under the supervision of the CLR, just like the managed code, but lets you address the memory directly, through the use of pointers, as is done in unmanaged code. Thus, you get the best of both worlds. You might be writing a .NET application that uses the functionality in a legacy Win32 DLL, whose exported functions require the use of pointers. That's where unsafe code comes to your rescue.

Now that we have gone through the distinctions, let's get coding... unarguably the best part, what do you think?

Inside unsafe code

Writing unsafe code requires the use of two special keywords: unsafe and fixed. Recall the three basic pointer operators familiar from C/C++:

*
&
->

Any statement, block of code, or function that uses any of the above pointer operators must be marked with the unsafe keyword, and the program must be compiled with the /unsafe compiler option. A function is marked as shown below:

public unsafe void Triple(int *pInt)
{
    // Dereference the pointer, triple the value, and store it back.
    *pInt = (*pInt) * 3;
}

All the above function does is triple the value passed to it. But notice that the address of the variable, containing the value to be tripled, is passed to the function. The function then does its work. Since the function is using the "*" pointer operator, the function is marked as unsafe, since the memory is being directly manipulated.
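As a rough illustration, the following self-contained sketch calls a static version of Triple and exercises all three operators. It deliberately uses only local variables, which is the simplest case. The `Point` struct and the `OperatorDemo`, `Scale`, and `TripleOf` names are assumptions introduced for this example, and the program needs the /unsafe compiler option (or `AllowUnsafeBlocks` in the project file).

```csharp
using System;

struct Point
{
    public int X;
    public int Y;
}

class OperatorDemo
{
    // Same idea as Triple above, made static so it can be called without an instance.
    public static unsafe void Triple(int *pInt)
    {
        *pInt = (*pInt) * 3;      // * reads and writes the value being pointed to
    }

    // Doubles both coordinates through a pointer to the struct.
    public static unsafe void Scale(Point *p)
    {
        p->X *= 2;                // -> reaches a member through the pointer
        (*p).Y *= 2;              // equivalent form: dereference, then use the dot operator
    }

    // Wraps Triple so that safe code can use it with a plain int.
    public static unsafe int TripleOf(int value)
    {
        int local = value;
        Triple(&local);           // & takes the address of the local variable
        return local;
    }

    static unsafe void Main()
    {
        Console.WriteLine(TripleOf(5));

        Point pt = new Point();
        pt.X = 1;
        pt.Y = 2;
        Scale(&pt);
        Console.WriteLine("{0}, {1}", pt.X, pt.Y);
    }
}
```

Running it prints 15 followed by "2, 4". Wrapping the pointer work in a method such as `TripleOf` is one way to keep the unsafe part small, so that callers never see a pointer.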

However, there is a problem. As discussed above, unsafe code is still managed code, and hence executes under the CLR's supervision. The CLR's garbage collector is free to move objects around in memory, for example to reduce memory fragmentation. In doing so, transparently to the programmer, it could relocate the variable being pointed to.

So, if pInt pointed to a variable at address 1001, and the CLR relocated that variable to reduce fragmentation, the variable could afterwards be stored at address 2003. This is a catastrophe: the pointer is now invalid, because the variable is no longer at address 1001! That is probably one of the reasons pointers have been made to keep a low profile in .NET. What do you think?

Fixing the pointers

Enter the fixed keyword. When applied to a block of statements, it tells the CLR that the object in question must not be relocated while the block executes; in other words, it pins the object. Thus, when pointers are used in C#, the fixed keyword is used pretty often to prevent invalid pointers at runtime. Let's have a look at how it works:

using System;
class CData
{
    public int x;
}

class CProgram
{
    unsafe static void SetVal(int *pInt)
    {
        *pInt=1979;
    }
    
    public unsafe static void Main()
    {
        CData d = new CData();
        
        Console.WriteLine("Previous value: {0}", d.x);
        
        // Pin d so that the address of d.x stays valid inside the block.
        fixed(int *p=&d.x)
        {
            SetVal(p);
        }
        
        Console.WriteLine("New value: {0}", d.x);
    }
}

All we do here is assign the address of field x of the CData object d to the integer pointer p, within the fixed block. While the statements within the fixed block are executing, the pointer continues to point to the same memory location, because the CLR has been instructed to pin the object until the fixed block finishes. Once the fixed block is done, the object can again be relocated in memory by the CLR.
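The same pattern applies to arrays, which also live on the managed heap. The sketch below is an illustration rather than part of the original sample: it pins an int array and walks it with a pointer. The `FixedArrayDemo` and `Sum` names are hypothetical, and the /unsafe compiler option is again assumed.

```csharp
using System;

class FixedArrayDemo
{
    // Sums an int array by walking it with a pointer.
    public static unsafe int Sum(int[] values)
    {
        int total = 0;

        // The array is a heap object, so it is pinned for the duration of the block.
        fixed (int *pFirst = values)
        {
            int *p = pFirst;      // pFirst itself is read-only; copy it to step through the array
            for (int i = 0; i < values.Length; i++)
            {
                total += *p;      // read the current element
                p++;              // advance by one int, that is, sizeof(int) bytes
            }
        }

        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new int[] { 1, 2, 3, 4 }));
    }
}
```

This prints 10. The pointer is only used inside the fixed block; once the block ends, the array may move again and any copy of the pointer kept around would be invalid.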

That covers the basics of programming using pointers in C#. Just make sure that the code is marked unsafe and that any movable object being pointed to is fixed. And you are ready to leverage your knowledge of pointers in C# too!