The call stack is a rich source of information for programmers, accessible in debugging mode in development environments like Eclipse or Visual Studio. There is a lot of information available online on how to access the call stack in different IDEs, but not much has been written on how to actually make sense of it. In this article I will present a strategy for using the call stack for debugging. I will give an example from my pet project written in C++ and developed in Visual Studio 2012, but I will try to offer a general strategy and conclusions applicable to other programming languages and IDEs too. The information provided here may look trivial to experienced programmers, but it can be vital, and hard to find, for anyone who was not introduced to this technique by more senior co-workers.
If you don't want to go through the details of the example code, feel free to jump directly to the Conclusions at the end of this article, which outline this debugging strategy. For a short description of debugging, check my article on systematic debugging.
Step 1. Hit by a bug -> find where it surfaced
When running a test on my project, the following exception was thrown:
Figure 1. The exception.
The exception clearly states that it was thrown in line 116 in the function getPixelFromMiddle() of class ColorBox, to be found in ColorBox.cpp. In the first step it is an easy job to find the line of code throwing the exception. After locating this line, put a breakpoint on it. This is the line of code where the bug surfaced, so we will need to use the call stack to dig deeper in the code to find the conflict of state causing it.
If a program crashes without clearly stating where something unexpected happened, use breakpoints and move them over several cycles of test runs to find the line of code where the crash occurred.
Figure 2. The code section where the bug surfaced (exception thrown).
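How an exception comes to report its own location depends on the project. As a minimal sketch, assuming a hand-written helper (makeLocatedError, THROW_LOCATED and getChecked are hypothetical names, not taken from my project), the standard __FILE__, __LINE__ and __func__ facilities can be folded into the message like this:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: builds an exception whose message records
// the function, file and line it was thrown from.
inline std::runtime_error makeLocatedError(const std::string& what,
                                           const char* file, int line,
                                           const char* function) {
    return std::runtime_error(what + " [" + function + " at " + file + ":" +
                              std::to_string(line) + "]");
}

// __FILE__ and __LINE__ are standard macros; __func__ is a standard
// predefined identifier holding the enclosing function's name.
#define THROW_LOCATED(what) \
    throw makeLocatedError((what), __FILE__, __LINE__, __func__)

// Hypothetical stand-in for a function that guards its input with a range check.
int getChecked(int index, int size) {
    if (index < 0 || index >= size) {
        THROW_LOCATED("index out of range: " + std::to_string(index));
    }
    return index;
}
```

With a message like this, the line to put the first breakpoint on can be read straight from the exception text.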
Step 2. View the call stack
After setting the breakpoint at the right location, run the program again in debug mode. Execution is stopped at the breakpoint. Navigate to the call stack window (in VS in the menu Debug -> Window -> Call Stack) and you will see the stacked list of functions having called each other. Our function ColorBox::getPixelFromMiddle() is at the top of the stack because this is the last function called when program execution was stopped. The function below ColorBox::getPixelFromMiddle() is its caller, and VS marks the line of code where program execution will resume after returning from it.
Figure 3. The call stack when stopping execution on the line throwing the exception.
The call stack in our example has the following entries:
ColorBox::getPixelFromMiddle() is called by
Schlieren::colorFinalImageAt(), which is called by
Schlieren::generateFinalImage(), which is called by
By accessing the call stack, you can inspect program execution frozen at a point in time, with all its inner details.
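To make the ordering of the entries concrete, here is a rough sketch that mimics a call stack by hand. It is only an illustration: ScopedFrame, snapshotTopFirst and the global vectors are hypothetical, and the three free functions are empty stand-ins that merely borrow the names from the example above. A real debugger reads this information from the actual stack frames.

```cpp
#include <string>
#include <vector>

// Hypothetical bookkeeping that mimics a call stack: the innermost call is last.
static std::vector<std::string> g_frames;

// RAII guard: pushes a frame name on function entry and pops it on exit.
struct ScopedFrame {
    explicit ScopedFrame(const std::string& name) { g_frames.push_back(name); }
    ~ScopedFrame() { g_frames.pop_back(); }
};

// Snapshot in the order a debugger shows it: most recent call first.
std::vector<std::string> snapshotTopFirst() {
    return std::vector<std::string>(g_frames.rbegin(), g_frames.rend());
}

static std::vector<std::string> g_captured;

// Stand-ins for the functions on the example's call stack.
void getPixelFromMiddle() {
    ScopedFrame frame("ColorBox::getPixelFromMiddle");
    g_captured = snapshotTopFirst();  // the "breakpoint": freeze the stack here
}

void colorFinalImageAt() {
    ScopedFrame frame("Schlieren::colorFinalImageAt");
    getPixelFromMiddle();
}

void generateFinalImage() {
    ScopedFrame frame("Schlieren::generateFinalImage");
    colorFinalImageAt();
}
```

After calling generateFinalImage(), g_captured lists ColorBox::getPixelFromMiddle first and Schlieren::generateFinalImage last, which is the same top-down order as in the call stack window.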
Step 3. Examine the call stack
Now you can use the power of call stack debugging. There are two different types of information available while debugging with the call stack. First there is the program execution sequence, where you can check the actual business logic of the application against what you intended it to do (when debugging your own code). The second piece of vital information is the program state, that is, the values held by all variables and objects alive at the moment when program execution has reached the breakpoint. Use your favourite techniques available in your editor to check the value of variables of interest (in VS by hovering with the mouse above them or using the Autos, Locals or Watch window, among others).
Now you have all the information to do some quick and easy investigation and find the conflict of state (the bug), which caused the exception to trigger.
In our example the exception was thrown because the width (w) variable held a value of 65531. w is the current width value when processing an image, so it should be between 0 and the image width. It is of type uint16_t, so the maximum value is 65535. A value of 65531 is very close to the maximum value of the uint16_t type, and this suggests that we tried to assign the negative value of -5 to this unsigned variable, resulting in 65531 (65536-5). Width should be a non-negative number, so our conflict of state is at the line of code where the value of -5 was assigned to w.
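The arithmetic behind this guess can be checked in a few lines. The helper names below (toUnsigned16, asSignedGuess) are hypothetical and exist only for this sketch. Converting a negative int to uint16_t is well defined in C++ and reduces the value modulo 65536, so the reverse reading is a cheap sanity check whenever an unsigned variable holds a value suspiciously close to its maximum.

```cpp
#include <cstdint>

// Converting a negative int to uint16_t reduces it modulo 2^16 (65536).
constexpr std::uint16_t toUnsigned16(int value) {
    return static_cast<std::uint16_t>(value);
}

// Debugging aid: reads a suspicious uint16_t back as the signed value that
// would wrap to it. Values above 32767 are read as value - 65536.
constexpr int asSignedGuess(std::uint16_t value) {
    return value > 32767 ? static_cast<int>(value) - 65536
                         : static_cast<int>(value);
}

// Checked at compile time: -5 wraps to 65531, and 65531 reads back as -5.
static_assert(toUnsigned16(-5) == 65531, "-5 wraps to 65536 - 5");
static_assert(asSignedGuess(65531) == -5, "65531 reads back as -5");
```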
Use the call stack to step back to the previous function, Schlieren::colorFinalImageAt() (click on it in the stack, under ColorBox::getPixelFromMiddle()).
When hovering the mouse pointer over wOffset, the variable used to call getPixelFromMiddle(), we can indeed see that it has the value of -5. We have identified the bug. In the calling code (colorFinalImageAt()) we call getPixelFromMiddle(wOffset = -5, hOffset = 4), whereas its signature is getPixelFromMiddle(uint16_t w, uint16_t h). Passing a negative value to an unsigned integer parameter resulted in a conflict of state.
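The difference between the two stack frames can be reproduced in isolation. In this sketch calleeSees and callerPasses are hypothetical stand-ins for getPixelFromMiddle() and colorFinalImageAt(); only the parameter types mirror the signatures discussed above.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical callee with the unsigned parameter types from the example;
// it only reports the values it actually received.
std::pair<int, int> calleeSees(std::uint16_t w, std::uint16_t h) {
    return std::pair<int, int>(w, h);
}

// Hypothetical caller frame: the offsets are signed here.
std::pair<int, int> callerPasses(int wOffset, int hOffset) {
    // The implicit int -> uint16_t conversion happens at this call,
    // without any error and, by default, often without a compiler warning.
    return calleeSees(wOffset, hOffset);
}
```

Calling callerPasses(-5, 4) returns the pair (65531, 4): the caller's frame holds -5, while the callee's frame holds 65531 for the very same argument. This is the mismatch that stepping between the two entries of the call stack reveals.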
Note: A bug can be much further away from where it surfaces than in our example. We used this simple but real-life example to show the strategy of call stack debugging without getting too deep into the technicalities of a particular body of code.
Step 4. Solve the conflict of state
Finding the logical inconsistency in the code which resulted in the conflict of state will yield a solution almost automatically. The class ColorBox holds a square RGB image (member variable m_colorData) of size m_size. The member function getPixelFromMiddle(uint16_t w, uint16_t h) receives a width and a height value, both given as offsets from the middle of the image. They can therefore be negative, so they should be held in a signed integer type. Besides changing the type, there is a more subtle change to make: the two arguments of getPixelFromMiddle() should be renamed to say that they are offsets, as wOffset and hOffset do in the calling code. Had such names been used from the start, it would have been obvious that they should be of a signed integer type.
The code after the bugfix looks as follows:
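The sketch below is not the project's actual listing. It only illustrates the kind of change described above, under stated assumptions: the Pixel type, the row-major storage layout, the range check and the constructor are all hypothetical, and only the names ColorBox, m_colorData, m_size and getPixelFromMiddle() come from the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Pixel { std::uint8_t r, g, b; };  // hypothetical pixel type

class ColorBox {
public:
    // Assumes size > 0; the image is size x size pixels, stored row by row.
    explicit ColorBox(int size)
        : m_size(size),
          m_colorData(static_cast<std::size_t>(size) * size) {}

    // Signed offsets relative to the middle of the square image;
    // the parameter names now say what the values mean.
    Pixel getPixelFromMiddle(int wOffset, int hOffset) const {
        const int w = m_size / 2 + wOffset;
        const int h = m_size / 2 + hOffset;
        if (w < 0 || w >= m_size || h < 0 || h >= m_size) {
            throw std::out_of_range("offset outside the image");
        }
        return m_colorData[static_cast<std::size_t>(h) * m_size + w];
    }

private:
    int m_size;                      // edge length of the square image
    std::vector<Pixel> m_colorData;  // m_size * m_size pixels
};
```

With signed parameters, a call such as getPixelFromMiddle(-5, 4) on an 11 x 11 box resolves to a valid pixel, and an offset that really is out of range is reported with its true (negative) value instead of a wrapped one.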
How to use the call stack to debug a program crash
The program may also crash unexpectedly due to a bug somewhere in the code without throwing an exception. In this case we do not have the luxury of stopping execution where we like, and the crash often occurs inside the standard library, which gives little information about the bug. We need to view the call stack in this case too. In Visual Studio click on Ignore (several times if necessary) until you can see the call stack pane. Then click on Break to stop program execution. There will be some functions of the standard library on top, but below those you will find the function calls of the user code, and you can follow the debugging steps presented in this article.
Conclusions
To use the call stack for debugging an exception or a program crash, stop the execution of the code right on the line where the exception is thrown. For a program crash, simply hit Ignore on the crash dialog, then Break the program execution. You can also use breakpoints in several iterations to find the line of code where the bug surfaces.
Access the call stack in your IDE. Check what the sequence of function calls in the call stack reveals about the business logic of the program. The bug can be caused by an issue in the program architecture, which shows up in the stack of function calls. If the business logic is OK, check the program state (the values of variables and objects) in the function where the bug surfaced. Step back through the list of calling functions in the call stack and check the program state in the calling functions too. Also check the logic by which the program state was created (how the variables got their values). You should quickly find the cause of the conflict of state.