In this article, I will try to clarify certain points about pointers and their usage. This is for beginners.
Every so often on the web, I run across discussions about pointers in languages such as C or C++ where heated debates erupt over whether it is worth it, or even wise, to use pointers at all. The question seems to persist even today, so in the hope of clarifying a few things, I decided to write this article. I am not going to be an overly technical geek about the subject, nor will I religiously demand that each new must have a corresponding delete, or each malloc a corresponding free.
This article applies to C and C++, where you have no choice but to deal with raw pointers. Other languages like Java, C#, etc. hide all the voodoo magic behind the scenes. It is also for beginners who do not code in ASM, C, or C++ but are mystified by the subject.
I will try to cover all of the cases without going too deep into the technical details. If you want very deep technical knowledge of pointers, there is plenty of information available, from Wikipedia to numerous blogs. The cplusplus.com website has a fantastic article about pointers, so I am not going to duplicate it here. I am going to cover the following scenarios, given that you are developing in C or C++:
- When it is impossible to implement something without the use of pointers
- Advantages of using pointers
- Passing variables by value, reference, or pointer
- Dangling pointers
A pointer is an integer variable that holds an address where a value of a specific width (float, double, int, struct, class, etc.) is stored in the computer's memory. As such, a pointer is the ‘de facto’ object that the computer understands natively at the metal level. A pointer is always an unsigned integer of 8, 16, 32, 64, 128, etc. bits in width. This is largely dictated by the width of the CPU's main registers, and also by the operating system's runtime bit alignment. It is entirely possible to run a 16 bit OS on a 64 bit CPU, but not vice versa. However, on a 16 bit OS, you’ll be limited to the 16 bit address space even if the CPU is 64 bits wide.
Sidetrack: Since 64 bit registers are rather conveniently wide enough for the aforementioned 16 bit OS, it is possible to write a special segmented memory manager that can peek beyond the 64Kb limit, but only if the physical register is wide enough (oh, the old 16 bit Windows). The subject of 16 bit memory segments and offsets is largely obsolete today. In our current discussion, I am addressing flat memory only.
The width of a pointer directly determines how much memory that pointer can address: two to the power of its width, minus one.
By looking at this table, you can see what the past looked like and what the future will bring. If you were to wake up in some sci-fi novel realm, the first thing you would want to check is the addressable pointer memory on the computers they use, to get an idea of what you are dealing with.
| Width in bits | Maximum value (2^n - 1) | Top of addressable memory (bytes) | Hex representation |
| --- | --- | --- | --- |
| 8 (ancient history) | 2^8 - 1 | 255 | 0xFF |
| 16 (DOS, Win 3.0) | 2^16 - 1 | 65,535 (64Kb) | 0xFFFF |
| 32 | 2^32 - 1 | 4,294,967,295 (4Gb) | 0xFFFFFFFF |
| 64 | 2^64 - 1 | 18,446,744,073,709,551,615 (about 18 exabytes, or 18 million terabytes) | 0xFFFFFFFFFFFFFFFF |
| 128 | 2^128 - 1 | No such system exists yet | 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF |
128-bit registers do exist in today’s 64-bit processors, and they are usually used for SIMD (Single Instruction, Multiple Data) opcodes. Long story short, they allow you to load, say, four 32 bit integers into the same register side by side and perform the same operation on all of them with a single instruction. Worthy perspective: one exabyte equals one million terabytes.
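If you want to see these numbers for the machine you are actually running on, here is a minimal sketch that prints the pointer width and the top of the addressable range; the latter is exactly the 2^width - 1 value from the table above, conveniently available as UINTPTR_MAX:
#include <cstdint>
#include <cstdio>

int main()
{
    // Width of a pointer for this particular build, in bits
    std::printf("pointer width : %zu bits\n", sizeof(void*) * 8);

    // Largest address that width can represent: 2^width - 1
    std::printf("top address   : %#jx\n", static_cast<uintmax_t>(UINTPTR_MAX));
    return 0;
}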
A pointer to a pointer is an integer variable that holds the address of a pointer. It is mostly used to return a pointer through a function parameter. It is also used in places where the danger of dangling pointers may arise.
SomeClass* ptr = nullptr;
SomeFuncReturnsPtr(&ptr);   // the function fills in the address through the pointer to a pointer
if (ptr)
    ptr->DoStuff();         // a typed pointer is needed here; a void* cannot be dereferenced like this
Pointers that are passed to different functions as a straight pointer (i.e., a copy of the pointer) face the danger that if one function deletes the object, the copies of that pointer held elsewhere are not nullified. They still hold the address of the wiped-out memory and thus become dangling pointers, because their state seems to be valid (not null). Any usage of such a pointer results in undefined runtime behavior. This is a real danger in the unmanaged world that needs to be confronted, preferably during the coding phase, as it is notoriously difficult to find at runtime.
The following code defensively deals with the possibility of the dangling pointers:
// Example 1: the pointer is passed by value, so NukeA() cannot null the caller's copy.
void NukeA(A* p)
{
    p->DoThis();
    p->DoThat();
    delete p;
    p = nullptr;                 // only the local copy is nulled; the caller's pointer now dangles
}

int main()
{
    A* ptr = new A;
    NukeA(ptr);
    assert(ptr == nullptr);      // fires: ptr still holds the address of the deleted object
    delete ptr;                  // double delete: undefined behavior
}

// Example 2: a pointer to a pointer lets the callee nullify the caller's pointer as well.
void NukeSafelyA(A** p)
{
    (*p)->DoThis();
    (*p)->DoThat();
    delete *p;
    *p = nullptr;
}

int main()
{
    A* ptr = new A;
    NukeSafelyA(&ptr);
    assert(ptr == nullptr);      // holds: the pointer was nullified where it lives
}
As a matter of fact, if you are using pointers to dynamically allocated memory, do not ever pass them to another function as a plain pointer, especially if that function can or may delete the object. Pass them only as a pointer to a pointer, and then your pointer will become null if it was, in fact, deleted and nullified in the other function. This is an incredibly simplified example, but it does demonstrate when this can happen.
The computer metal does not care about, or even know, what a variable is or what it is named. All it understands is CPU registers and memory addresses, i.e., pointers. That said, a variable declaration is an illusion created by the compiler; under the hood, the name is mapped to a register and to the memory address the value is loaded from.
All local variables are declared on, and reside in, a function's stack frame. A pointer to dynamically allocated memory also resides in a stack frame, but it points into the program’s global heap memory. In an unmanaged world, it is your program’s responsibility to manage that heap; any leaked memory will eventually starve your program of resources and crash it sooner or later.
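A minimal sketch of that split (the names here are only for illustration): the pointer itself lives in the function's stack frame, while the object it points to lives in the global heap and must be released explicitly.
#include <string>

void demo()
{
    int local = 42;                               // lives in this function's stack frame
    std::string* s = new std::string("heap");     // 's' lives on the stack, *s lives on the heap

    // ... use 'local' and '*s' ...

    delete s;       // our responsibility; forgetting this leaks heap memory
}                   // 'local' and the pointer variable 's' vanish here automatically

int main()
{
    demo();
    return 0;
}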
I can list several such cases, but let me address something before we venture any further. Every program has multiple function stack frames and one global heap. The global heap is basically OS-managed virtual memory. A stack is a LIFO (Last In, First Out) data structure of limited size. As you can imagine, any overflow onto neighboring stack memory will effectively wipe out saved data and crash your program.
Remember this: no code looks and behaves as disastrously badly as code riddled with smart pointers written by a person who has no clue what a pointer actually is or what it represents. Please read all the cases below before you ever attempt to indulge your program with smart pointers. Also, go back up and reread the section “Dangling Pointers”. Smart pointers are notorious for creating exactly such situations when the attachment/detachment of raw pointers to smart pointers is handled thoughtlessly instead of with actual brain cells.
A stack is a LIFO data structure; in fact, the execution model of a compiled program is often described as a “stack machine”. Each process and each thread gets its own stack, and that stack is subdivided into call-stack frames, one for every function call currently in progress. Each call-stack frame is a smaller chunk of the stack. The total stack size has a limit, usually 1Mb. On UNIX systems, it is typically controlled with a shell resource limit (ulimit). The Visual C++ compiler allows you to alter the stack size with the /F flag.

Stack overflow checking is best characterized by the following header comment from the _chkstk() routine (Microsoft C runtime):
;***
;_chkstk - check stack upon procedure entry
;
;Purpose:
; Provide stack checking on procedure entry. Method is to simply probe
; each page of memory required for the stack in descending order. This
; causes the necessary pages of memory to be allocated via the guard
; page scheme, if possible. In the event of failure, the OS raises the
; _XCPT_UNABLE_TO_GROW_STACK exception.
;
; NOTE: Currently, the (EAX < _PAGESIZE_) code path falls through
; to the "lastpage" label of the (EAX >= _PAGESIZE_) code path. This
; is small; a minor speed optimization would be to special case
; this up top. This would avoid the painful save/restore of
; ecx and would shorten the code path by 4-6 instructions.
;
;Entry:
; EAX = size of local frame
;
;Exit:
; ESP = new stackframe, if successful
;
;Uses:
; EAX
;
;Exceptions:
; _XCPT_GUARD_PAGE_VIOLATION - May be raised on a page probe. NEVER TRAP
; THIS!!!! It is used by the OS to grow the
; stack on demand.
; _XCPT_UNABLE_TO_GROW_STACK - The stack cannot be grown. More precisely,
; the attempt by the OS memory manager to
; allocate another guard page in response
; to a _XCPT_GUARD_PAGE_VIOLATION has
; failed.
;
;*******************************************************************************
Because the stack size is limited, any spill over into neighboring memory is watched for, and if it happens, an exception is thrown. A _PAGESIZE_ is typically 4Kb (it can vary by platform), so if a local variable is larger than the page size, it is probed page by page, but that does not necessarily result in a stack overflow.
The alloca() function pulls memory off the local stack at runtime.
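A minimal sketch of alloca() in use (shown with the MSVC spelling _alloca from <malloc.h>; on most UNIX systems it is alloca() from <alloca.h>): the buffer comes straight off the current stack frame and vanishes when the function returns.
#include <cstdio>
#include <malloc.h>    // _alloca() on MSVC; use <alloca.h> and alloca() on most UNIX systems

void print_padded(const char* text, int width)
{
    // Stack allocation: no delete/free, but it counts against the (usually 1 MB) stack limit
    char* buf = static_cast<char*>(_alloca(width + 1));
    std::snprintf(buf, width + 1, "%-*s", width, text);
    std::printf("[%s]\n", buf);
}   // the stack frame, and 'buf' with it, is gone here

int main()
{
    print_padded("pointer", 12);
    return 0;
}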
Case 1. The object size exceeds the function stack size. This limitation requires the use of the global heap, and thus pointers, if the object is just too big or can grow big over time. Everyone has run into a stack overflow exception or segmentation fault at least once.
What can it be? To name a few:
- A character array, declared on the stack, that reads in a > 1Mb file (or anything greater than the default stack size):
char file_readin[2000000];   // ~2 MB local array on a stack that is typically 1 MB
...
- A C++ object with nested classes and arrays whose cumulative sizeof is > 1Mb (or greater than the default stack size):
class CGiganticClass { /* … members whose total size exceeds the stack limit … */ };
CGiganticClass a;            // constructed directly on the function's stack frame
- A recursive function that calls itself several thousand levels deep. Well, in that case, even pointers are not going to save you, but they may delay the inevitable:
void recursive_function(int value)
{
    char local_buffer[200];          // 200 bytes consumed per call, per stack frame
    for (int i = 0; i < 100000; i++)
    {
        recursive_function(i);       // each nested call eats another frame until the stack runs out
    }
}
- A collection implementation that stores its elements by value:
template<class T>
class futile_array
{
    T arr[1000];                // 1000 elements stored by value, directly inside the object
    size_t avail_index = 0;

public:
    void add(const T val)
    {
        if (avail_index >= 1000)
            return;
        arr[avail_index++] = val;
    }
};

int main()
{
    futile_array<CGiganticClass> a;   // sizeof(a) is 1000 * sizeof(CGiganticClass): far beyond the stack
}
- The subsystem used is not available at compile time and exists only at runtime, thus returning pointers only (Windows API, DirectX API, Linux syscalls, OpenGL, etc.):
int main()
{
    void* ptr = ::SomeOperatingSysAPI();   // the subsystem hands back an opaque pointer at runtime
    SomeStruct* p = (SomeStruct*)ptr;      // the program then works with it through a typed pointer
    return 0;
}
- Any other sort of runtime creation or destruction.
If you are on an embedded system, things are far worse. There, you’ll be lucky to get a 4Kb stack, or perhaps even less.
Case 2. Controlled creation. Declaring a pointer creates no object, and the pointer itself occupies only 4 bytes (a 32-bit pointer) or 8 bytes (a 64-bit pointer) in memory, even if it points to a much larger object. You may want to create that object only when it is absolutely necessary at runtime, under certain conditions, and not otherwise, thereby controlling your program’s memory consumption. A minimal sketch of this follows.
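In this sketch, CHugeCache and the condition are hypothetical stand-ins; the point is that the pointer costs an address-sized variable up front, and the heavy object is created only if the runtime condition is met.
struct CHugeCache
{
    char blob[64 * 1024 * 1024];   // 64 MB: something you only want to pay for when needed
};

void process(bool need_cache)
{
    CHugeCache* cache = nullptr;   // no object yet; just a 4 or 8 byte variable

    if (need_cache)
        cache = new CHugeCache;    // the memory cost is paid only on this path

    // ... do the work, using 'cache' only if it is non-null ...

    delete cache;                  // deleting a null pointer is a safe no-op
}

int main()
{
    process(false);                // costs essentially nothing
    process(true);                 // allocates and frees the 64 MB cache
    return 0;
}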
Case 3. Controlled destruction, the opposite of Case 2. Automatic variables are destroyed after the function exits, in the opposite order to their declaration; this applies to the main function as well. Sometimes it is necessary to do cleanup before the function’s end: you may free some resources, unload dynamic libraries, and so on. Imagine, for example, a class that contains pointers to objects from a DLL, and this class is declared as a stack variable. You unload that DLL before the function returns, i.e., before your class destructor runs. If that destructor happens to do cleanup duties against the same DLL, your program will crash. Such an object must be destroyed before the function returns and before the DLL is freed, so you need to control the moment of that object’s destruction. A sketch of this ordering follows.
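Here is a minimal sketch of that ordering using the Windows LoadLibrary/FreeLibrary calls; the DLL name and the CPluginUser class are hypothetical. Because the object is created and destroyed through a pointer, its destructor is guaranteed to run while the library is still loaded.
#include <windows.h>

class CPluginUser
{
public:
    // In a real program, this would hold function pointers obtained from the DLL
    ~CPluginUser() { /* calls back into the DLL to release its resources */ }
};

int main()
{
    HMODULE hDll = ::LoadLibraryA("plugin.dll");   // hypothetical plugin library
    CPluginUser* user = new CPluginUser;           // heap object, lifetime under our control

    // ... work with the plugin through 'user' ...

    delete user;             // destructor runs while the DLL code is still mapped
    if (hDll)
        ::FreeLibrary(hDll); // only now is it safe to unload the library
    return 0;
}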
I know y’all are smart, and you would argue that it is possible to achieve this with nested braces within the function itself to control the destruction of an automatic variable. Yes, but only if that variable is not needed outside of those braces, and besides, it just looks ugly.
Whatever the underlying case is, the most important item to remember is that you can physically control the death of an object either in time or under specific circumstances of your choosing.
Sidetrack: BTW, smart pointers do not afford you this “controlled destruction” power. They effectively wrap your pointer in a stack variable that is reference counted with each copy and copied all over the place. You can control the pointee's lifetime only if the smart pointer provides a call to delete the underlying pointer, but that is even uglier and stealthier than a call to the delete operator itself. My opinion, anyway.
Also, weak pointers are observers of strong smart pointers, and the strong pointer's internal bookkeeping must track them somehow so they can learn of the underlying object's demise. With std::shared_ptr/std::weak_ptr, for instance, the two share a heap-allocated control block with a weak reference count; every weak_ptr copy bumps that count and keeps the control block alive even after the object itself is gone. All of this adds overhead that is far worse, performance wise, than a plain raw pointer.
I do not advocate against them, but you must know what the cost is and whether it is worth it.
Case 4. A collection of C or C++ objects whose individual size is so big that any unnecessary copying or movement of them will fragment the heap so badly that consecutive calls to operator new (or malloc) will eventually result in an out-of-memory error. What can this possibly be? A database engine implementation, or a game that holds 50,000 scene objects each carrying mesh, texture, and other data, to name a few. In this case, after these objects are allocated, it is wise to avoid any copying or movement of them. They must remain where they were originally created, and any operation on them, such as passing to a function, sorting, etc., must be done via the pointer only, not the object itself. For example, you can sort object pointers by some object criterion without reshuffling the objects themselves in memory. Such objects are stored in collections by pointer, so a std::vector<CGiganticClass*> stores pointers and is sorted by pointer instead of by the actual class. This may be a bit confusing, so let an example clear it up. Any industrial-strength collection such as std::vector internally uses the heap to store its elements, but it allocates precisely the space for ‘a type’: either the object itself or a pointer to the object. When you declare a vector of pointers, no objects are ever created, only the pointers to them, so you can allocate the objects later. Or, if they were already created, adding a pointer to the collection will not copy or move the object itself, only the pointer, which is just 4 or 8 bytes wide. A sketch of sorting by pointer follows.
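A minimal sketch of sorting by pointer (CGiganticClass here is a stand-in with a sort key and a large payload): std::sort shuffles 4 or 8 byte addresses around, while the megabyte-sized objects never move.
#include <algorithm>
#include <vector>

struct CGiganticClass
{
    int  key;
    char payload[1024 * 1024];   // imagine a heavy mesh/texture/record here
};

int main()
{
    std::vector<CGiganticClass*> items;                     // a vector of pointers, not of objects
    for (int i = 0; i < 3; ++i)
        items.push_back(new CGiganticClass{ 3 - i });       // objects stay where new put them

    // Only the pointers are rearranged; the heap objects themselves never move
    std::sort(items.begin(), items.end(),
              [](const CGiganticClass* a, const CGiganticClass* b) { return a->key < b->key; });

    for (CGiganticClass* p : items)
        delete p;
    return 0;
}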
Case 5. As mentioned above in Case 4, storing an object by pointer affords yet another fantastic advantage: not only can you store the pointers in an array, you can simultaneously store those very same pointers in a std::map, a std::unordered_map, a std::list, and so on, without ever creating unnecessary copies of the objects. You may want to access an object by index, by key, or by some other efficient search scheme and get a pointer to what you seek while the object itself remains static in heap memory. This makes even data-heavy programs incredibly fast and responsive, as the sketch below shows.
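A minimal sketch (the record type and field names are hypothetical): the same raw pointers sit in a vector for access by position and in a map for lookup by name, while each object stays put on the heap.
#include <map>
#include <string>
#include <vector>

struct CRecord
{
    std::string name;
    // ... plenty of other heavy data ...
};

int main()
{
    std::vector<CRecord*>           by_index;   // access by position
    std::map<std::string, CRecord*> by_name;    // access by key

    CRecord* r = new CRecord{ "alpha" };
    by_index.push_back(r);      // stores 4 or 8 bytes, not a copy of the record
    by_name[r->name] = r;       // the very same pointer, now reachable by key

    // Either lookup lands on the single heap object; nothing was ever copied

    delete r;                   // one owner, one delete; the containers merely observed it
    return 0;
}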
Asynchronous execution parameters, better known as threading parameters, that are passed by pointer but are actually stack (automatic) variables inside the invoking function. Any attempt to pass a stack variable into an independent thread by address is a recipe for inevitable disaster: when the calling function exits, the variable is destroyed, and the corresponding pointer in the thread will be looking at, and accessing, no man’s land. This also applies to bad use of smart pointers that delete the object after the function exits, leaving the thread’s pointer dangling. The sketch below shows the hazard and a safe alternative.
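A minimal sketch using std::thread (the sleep is only there to make the race obvious): the bad version hands the thread the address of a local that no longer exists, the good one hands it a copy.
#include <chrono>
#include <cstdio>
#include <thread>

void start_worker_bad()
{
    int param = 42;                                   // lives on this stack frame
    std::thread t([p = &param]() {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        std::printf("%d\n", *p);                      // dangling: the frame is long gone
    });
    t.detach();
}                                                     // 'param' is destroyed right here

void start_worker_good()
{
    int param = 42;
    std::thread t([p = param]() {                     // captured by value: the thread owns its copy
        std::printf("%d\n", p);
    });
    t.join();
}

int main()
{
    start_worker_good();      // the bad version is shown only to illustrate the trap
    return 0;
}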
A pointer is a stateful object. Not only can it access the object itself, it also tells you whether that object exists (is non-null), all at the same time.
Declaration conventions. Object pointers point into data memory (stack, heap, or data segment), while function pointers point into the code segment. Also, function pointers never need to be dynamically allocated or deleted, as in the sketch below.
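A minimal sketch of a function pointer: it refers to code rather than data, so there is nothing to new or delete.
#include <cstdio>

int add(int a, int b) { return a + b; }

int main()
{
    int (*op)(int, int) = &add;        // the address of code in the code segment
    std::printf("%d\n", op(2, 3));     // calling through the pointer prints 5
    return 0;
}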
You say, what? Everyone knows that! Why even bother to cover this topic? Well, go and check the diagram of the stack section for parameters. Every parameter is pushed onto the stack (or loaded into registers) before the jump to the function’s label, and that can be a lot of PUSH assembly opcodes, depending on the composition of the object. This is also yet another reason why C arrays are always passed as a pointer, even when you don’t explicitly say so; if you pass something like a std::vector by value, it will be copied. How many CPU cycles that function call takes depends on how large the object is, because a complete and independent copy is made, and that copy can itself cause a stack overflow. Sometimes you have no choice but to pass by value. But when you don’t have to, passing by reference or by pointer is hundreds of times faster, because it is a single 4 or 8 byte push instead of hundreds of pushes.
Passing by reference lets you pass a huge object into a function with one PUSH opcode and access it as if it were not a pointer. But that comes at a cost. First, you can accidentally pass in a dereferenced null pointer, and your function will crash. Second, there is absolutely no way to check whether the reference is any good, because a reference is by its nature stateless. Which brings up an interesting point: if the object is a pointer to dynamically allocated memory, just don’t pass it by reference (dereferenced); pass it by pointer instead. The sketch below contrasts the three options.
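A minimal sketch contrasting the three options for a large object (CBigThing is a hypothetical stand-in): by value copies the whole thing, by reference and by pointer copy a single address, and only the pointer version can, and therefore must, be checked for null.
#include <cstdio>
#include <vector>

struct CBigThing
{
    std::vector<char> data = std::vector<char>(1024 * 1024);   // ~1 MB payload
};

void by_value(CBigThing v)            { std::printf("%zu\n", v.data.size()); }              // full copy of the object
void by_reference(const CBigThing& v) { std::printf("%zu\n", v.data.size()); }              // one address; can never be null
void by_pointer(const CBigThing* v)   { if (v) std::printf("%zu\n", v->data.size()); }      // one address; may be null

int main()
{
    CBigThing big;
    by_value(big);        // copies roughly a megabyte for the call
    by_reference(big);    // copies 4 or 8 bytes
    by_pointer(&big);     // copies 4 or 8 bytes, and the callee checks for null
    return 0;
}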
One thing worth knowing is that a 32-bit operating system gives each process a full 4 GB of virtual address space, even if your computer has only 1 GB of physical memory installed. This kind of magic is performed via the disk file system: whatever does not fit into physical RAM is written onto the disk, or “paged” (making it somewhat dependent on the disk read/write speed). In fact, each process gets its very own individual 4 GB (virtual) address space regardless of your hardware RAM capacity, up to a point. Not only that, but each process also gets its very own virtual processor: the operating system saves and restores the processor register values when it switches the process/thread context. This matters because the processor registers then hold pointers into the current process’s memory and not into a neighboring process’s memory, so a crash in one process cannot harm any other process. This was not the case in the old 16 bit systems like MS-DOS or Windows 1.0, 2.0, 3.0, and 3.1; those operating systems ran on the physical hardware metal itself, and one crashed program could effectively take down every other program and the OS itself.
- 16th January, 2020: Initial article
- 18th February, 2020: Changed wording 'dynamically allocated pointer' to 'pointer to dynamically allocated'