Unsafe programming in C#






May 30, 2002
Discusses the concept of using pointers in C#
Introduction
There's always one particular topic that catches the fancy of most C/C++ programmers, and is considered quite complicated and difficult to understand: pointers!
Yet whenever C# is discussed, most people I have come across hold the opinion (a pretty strong one, if I may add) that C# has no concept of pointers and has done away with them altogether. In fact, unsafe code is the part of C# that is all about programming with pointers. Despite its name, there is nothing inherently unsafe about programming with pointers.
It is called unsafe because, unlike conventional .NET development, it requires the programmer to make certain assumptions that the runtime cannot verify. In this article, I shall start off by differentiating two frequently confused terms, unsafe code and unmanaged code. This will be followed by a discussion of how to write unsafe code, that is, how to use pointers in C#.
Unsafe or unmanaged? That is the question
Managed code is that code which executes under the supervision of the CLR. The CLR is responsible for various housekeeping tasks, like:
- managing memory for the objects
- performing type verification
- doing garbage collection
to name just a few. The programmer is completely isolated from how these tasks are carried out and does not get to manipulate memory directly, because that is done by the CLR.
On the other hand, unmanaged code is code that executes outside the context of the CLR. The best examples are traditional Win32 DLLs like kernel32.dll and user32.dll, and the COM components installed on our system. How they allocate memory, how that memory is released, and how (if at all) type verification takes place are all tasks they handle on their own. A typical C++ program that allocates memory for a character pointer is another example of unmanaged code, because you as the programmer are responsible for:
- calling the memory allocation function
- making sure that the casting is done right
- making sure that the memory is released when the work is done
In managed code, as explained above, all of this housekeeping is done by the CLR, relieving the programmer of the burden.
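To make the contrast concrete, here is a minimal sketch that sums three integers twice: once from a managed array, and once from a block of unmanaged memory. The class and method names are hypothetical, and Marshal.AllocHGlobal is used here only as a convenient stand-in for an unmanaged allocator; the article itself does not prescribe it.

```csharp
using System;
using System.Runtime.InteropServices;

class MemoryDemo
{
    // Managed: the CLR allocates the array, verifies types, and reclaims it later.
    public static int ManagedSum()
    {
        int[] values = new int[] { 1, 2, 3 };
        int sum = 0;
        foreach (int v in values)
        {
            sum += v;
        }
        return sum; // no explicit release; the garbage collector handles it
    }

    // Unmanaged: we request raw memory, decide its layout, and free it ourselves.
    public static int UnmanagedSum()
    {
        IntPtr block = Marshal.AllocHGlobal(3 * sizeof(int)); // allocate
        try
        {
            for (int i = 0; i < 3; i++)
            {
                // We choose the byte offset and the type; nothing verifies either.
                Marshal.WriteInt32(block, i * sizeof(int), i + 1);
            }

            int sum = 0;
            for (int i = 0; i < 3; i++)
            {
                sum += Marshal.ReadInt32(block, i * sizeof(int));
            }
            return sum;
        }
        finally
        {
            Marshal.FreeHGlobal(block); // release, or the memory leaks
        }
    }

    static void Main()
    {
        Console.WriteLine(ManagedSum());   // prints 6
        Console.WriteLine(UnmanagedSum()); // prints 6
    }
}
```

Both methods return the same result, but only the second one makes the programmer responsible for allocation, layout, and release.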
Unsafe code is a kind of cross between managed and unmanaged code.
It executes under the supervision of the CLR, just like managed code, but lets you address memory directly through pointers, as is done in unmanaged code. Thus, you get the best of both worlds. You might be writing a .NET application that uses the functionality of a legacy Win32 DLL whose exported functions require the use of pointers. That's where unsafe code comes to your rescue.
Now that we have gone through the distinctions, let's get coding... unarguably the best part, what do you think?
Inside unsafe code
Writing unsafe code requires the use of two special keywords: unsafe and fixed.
If we recall, there are three kinds of pointer operators:
- * (indirection: accesses the value a pointer points to)
- & (address-of: takes the address of a variable)
- -> (member access through a pointer)
Any statement, block of code, or function that uses any of the above pointer operators is marked as unsafe through the use of the unsafe keyword, as shown below:
public unsafe void Triple(int *pInt)
{
    *pInt = (*pInt) * 3;
}
All the above function does is triple the value passed to it. But notice that it is the address of the variable containing the value to be tripled that is passed to the function, which then modifies the value in place. Because the function uses the "*" pointer operator to manipulate memory directly, it is marked as unsafe.
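The Triple function above can be exercised with a small program like the following sketch, which also touches the other two operators. TripleDemo, TripleOf, Point and SumViaArrow are hypothetical names introduced only for illustration, and Triple has been made static so that it can be called without an instance. Note that the C# compiler accepts unsafe code only when invoked with the /unsafe option (AllowUnsafeBlocks in a project file).

```csharp
using System;

// Hypothetical struct, used only to demonstrate the -> operator.
struct Point
{
    public int X;
    public int Y;
}

class TripleDemo
{
    // Same body as the article's Triple, made static for convenience.
    public static unsafe void Triple(int* pInt)
    {
        *pInt = (*pInt) * 3;   // * reads and writes the pointed-to value
    }

    // Hypothetical wrapper so that safe callers need no pointer syntax.
    public static unsafe int TripleOf(int value)
    {
        int local = value;     // a local lives on the stack and is never relocated
        Triple(&local);        // & takes its address; no pinning is needed here
        return local;
    }

    public static unsafe int SumViaArrow()
    {
        Point pt;
        pt.X = 2;
        pt.Y = 3;
        Point* p = &pt;        // pointer to a stack-allocated struct
        return p->X + p->Y;    // -> accesses a member through the pointer
    }

    static void Main()
    {
        Console.WriteLine(TripleOf(5));    // prints 15
        Console.WriteLine(SumViaArrow());  // prints 5
    }
}
```

Taking the address of a local variable works without any further ceremony because locals are not subject to relocation by the garbage collector.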
However, there is a problem. If you recall from the discussion above, unsafe code is still managed code, and hence executes under the CLR's supervision. Now, the CLR's garbage collector is free to move objects around on the managed heap, for example to reduce memory fragmentation. In doing so, transparently and without the programmer's knowledge, it could relocate the variable being pointed to.
So, if pInt pointed to a variable at address 1001, and the CLR relocated memory to reduce fragmentation, the variable could afterwards be stored at address 2003. This is a catastrophe: the pointer is now invalid, because the variable is no longer at address 1001! That is probably one of the reasons pointers have been made to keep a low profile in .NET. What do you think?
Fixing the pointers
Enter the fixed keyword. When applied to a block of statements, it tells the CLR that the object in question must not be relocated while the block executes, and thus ends up pinning the object. When pointers are used in C#, the fixed keyword therefore appears quite often, to prevent invalid pointers at runtime. Let's have a look at how it works:
using System;

class CData
{
    public int x;
}

class CProgram
{
    // Writes a new value through the pointer.
    unsafe static void SetVal(int *pInt)
    {
        *pInt = 1979;
    }

    public unsafe static void Main()
    {
        CData d = new CData();   // d lives on the managed heap
        Console.WriteLine("Previous value: {0}", d.x);

        // Pin d so that the address of d.x stays valid inside the block.
        fixed (int *p = &d.x)
        {
            SetVal(p);
        }

        Console.WriteLine("New value: {0}", d.x);
    }
}
All we do here is assign the address of field x of the CData object d to the integer pointer p in the fixed statement. While the statements within the fixed block are executing, the pointer continues to point to the same memory location, because the CLR has been instructed to pin the object until the fixed block finishes. Once the fixed block is done, the CLR can again relocate the object in memory.
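The same pattern applies to arrays, which also live on the managed heap. The following is a minimal sketch, assuming a hypothetical helper named TripleAll that pins an int array and triples each element through a pointer; it is an illustration only, not part of the original sample.

```csharp
using System;

class PinDemo
{
    // Hypothetical helper: triples every element in place through a raw pointer.
    public static unsafe void TripleAll(int[] data)
    {
        if (data == null || data.Length == 0)
        {
            return; // nothing to pin
        }

        // Pin the array for the duration of the block; p points at data[0].
        fixed (int* p = data)
        {
            for (int i = 0; i < data.Length; i++)
            {
                // p + i advances by i elements (i * sizeof(int) bytes).
                *(p + i) = *(p + i) * 3;
            }
        }
        // The array is unpinned here, and p is out of scope.
    }

    static void Main()
    {
        int[] numbers = { 1, 2, 3 };
        TripleAll(numbers);
        Console.WriteLine("{0} {1} {2}", numbers[0], numbers[1], numbers[2]); // prints 3 6 9
    }
}
```

Because the pointer is declared by the fixed statement itself, it cannot be used once the block ends, which is exactly when the array becomes free to move again.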
That's all there is to programming with pointers in C#. Just make sure that the block is marked unsafe, that any heap object being pointed to is fixed, and that the code is compiled with the /unsafe compiler option. And you are ready to leverage your knowledge of pointers in C# too!