Posted 16 Oct 2013

# Analyzing C/C++ matrix in the gdb debugger with Python and Numpy

Using the gdb debugger's Python API to analyze and visualize C/C++ arrays in a debugging session.

## Introduction

When debugging code written in Matlab or Python, one can stop at a breakpoint, manipulate local vector and matrix variables, and plot the results. In this article, I will show how to use the Python API of the gdb debugger to plot and manipulate C/C++ arrays and vectors while debugging. For example, suppose we want to plot the eigenvalues of a two-dimensional array `mat` during a debugging session:

`float mat[10][10] = ....;`

Then using the accompanying code, we can create a `numpy` array in python and plot its eigenvalues:

```
(gdb) py
> import gdb_numpy
> import numpy as np
> import matplotlib.pyplot as plt
> mat = gdb_numpy.to_array("mat") #Creates a numpy array that corresponds to the variable mat.
> print(mat.shape) #Prints (10, 10) once the block is ended.
> y = np.linalg.eigvalsh(mat) #eigvalsh assumes that mat is symmetric.
> plt.plot(y)
> plt.show() #This is needed to show the figure, see notes below.
> end
```

This displays a plot of the eigenvalues of `mat`.
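Note that `np.linalg.eigvalsh` assumes a symmetric (or Hermitian) matrix and reads only one triangle of its input. For a general square matrix, `np.linalg.eigvals` is the usual choice. The following standalone sketch uses small made-up matrices, not data from a debugging session, to illustrate the difference:

```python
import numpy as np

# A symmetric matrix: eigvalsh is appropriate and returns real
# eigenvalues in ascending order.
sym = np.array([[2.0, 1.0],
                [1.0, 2.0]])
sym_eigs = np.linalg.eigvalsh(sym)

# A non-symmetric matrix: eigvalsh would silently ignore one triangle,
# so eigvals is used instead (the result may be complex).
gen = np.array([[0.0, 1.0],
                [-1.0, 0.0]])
gen_eigs = np.linalg.eigvals(gen)

print(sym_eigs)
print(gen_eigs)
```

If the array taken from the debugged program is not known to be symmetric, `eigvals` is the safer function to plot.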

### Notes on using matplotlib in gdb

Before proceeding, it is worth pointing out that when using either `matplotlib.pyplot` or `matplotlib.pylab` inside `gdb`, the `show` function has to be called to display a figure. Moreover, `gdb` will not respond to any command until the figure is closed.

## Using the code

The accompanying code depends on the python package `numpy` and can create `numpy` arrays from C/C++ pointers, arrays and STL vectors, as well as their nested types, out of the box. For information about how to install and use `numpy`, please visit their website. In this article, we will assume some basic knowledge of the `numpy` package. (Matlab users may be interested in this link).

To install the accompanying code, run the `setup.py` script in its folder with the argument `install`. (Type the following in a Linux shell or a Windows command prompt.)

`python setup.py install`

When using the code, import the module `gdb_numpy` in the gdb console:

`(gdb) py import gdb_numpy`

To create a `numpy` array from a C/C++ pointer/array/vector type, pass its name as a string to the function `to_array` in the `gdb_numpy` module:

`(gdb) py vec = gdb_numpy.to_array("vec") #vec is now a numpy array.`

If `vec` is an STL vector or a built-in array, this will create a `numpy` array of the appropriate shape. However, if `vec` is a pointer, then the user must supply a second argument giving its dimensions. For example, if we have:

`float** mat = ...;`

Then the dimensions must be supplied as a `tuple`.

```
(gdb) py mat = gdb_numpy.to_array("mat", (10,10))
(gdb) py mat = gdb_numpy.to_array("mat") #error: sizes are not provided.
(gdb) py mat = gdb_numpy.to_array("mat", (10,)) #error: Not all sizes are provided.
```

Note that even if there is only one dimension, it still has to be passed as a `tuple`. In Python, a one-element tuple needs a trailing comma, since `(10)` is just the integer `10`:

```
float* vec = ...;
(gdb) py vec = gdb_numpy.to_array("vec", (10,))
(gdb) py vec = gdb_numpy.to_array("vec", 10) #error: Dimensions must be passed as tuple.
```
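The difference between `(10)` and `(10,)` can be checked in plain Python, without gdb. The variable names below are only illustrative:

```python
# Parentheses alone do not create a tuple; the trailing comma does.
not_a_tuple = (10)
one_dim = (10,)
two_dims = (10, 10)

print(type(not_a_tuple).__name__)  # int
print(type(one_dim).__name__)      # tuple
print(len(two_dims))               # 2
```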

The method also supports nested types, e.g.:

```
std::vector<std::vector<double> > mat = ...;
(gdb) py mat = gdb_numpy.to_array("mat") #mat is a 2D numpy array
```

## Background

We will now take a very brief look at the parts of the gdb Python API (available since `gdb 7`) that are used in our code. Full details of the Python API in `gdb` can be found in the gdb documentation. Within the gdb console, the Python interpreter can be accessed with the command `python` (or `py`), followed by a Python command:

```
(gdb) py print(1 + 2)
3
```

If no argument is provided to the command `python`, then the multi-line mode will be entered:

```
(gdb) py
> x = 1 + 2
> print(x)
> end
3
```

Variables in the C/C++ program that we are debugging can be accessed in python using the `parse_and_eval` method of the `gdb` module, which is imported automatically when the python interpreter is accessed through `gdb`.

`(gdb) py my_var = gdb.parse_and_eval("my_var")`

The `parse_and_eval` function returns an instance of the `gdb.Value` type, which contains information about the C/C++ variable. For example, the C/C++ type can be accessed through the `type` attribute:

```
(gdb) py
> my_array = gdb.parse_and_eval("my_array")
> print(my_array.type)
> end
double[10]
```

Class members can be accessed through the index operator:

```
(gdb) py
> my_class = gdb.parse_and_eval("my_class")
> my_data = my_class['data'] #Gives my_class.data
> end
```

If the variable is of pointer type, then the indexer can be used to dereference it:

`(gdb) py print(my_data[10])`
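Taken together, indexing and numeric conversion are enough to copy an array out of the debugged program element by element. The sketch below is not the accompanying module's implementation. `value_to_nested_list` is a hypothetical helper that assumes the elements are numeric and that the caller supplies the shape. It relies only on `val[i]` and `float(val)`, both of which `gdb.Value` supports, so it can be tried outside gdb with ordinary lists:

```python
def value_to_nested_list(val, shape):
    """Copy an indexable value into nested Python lists of floats.

    `val` may be a gdb.Value (array or pointer) or, for testing outside
    gdb, any nested sequence. `shape` is a tuple of dimensions.
    """
    if len(shape) == 0:
        # Scalar element: a numeric gdb.Value converts to a Python float.
        return float(val)
    return [value_to_nested_list(val[i], shape[1:]) for i in range(shape[0])]


# Stand-in for a `double[2][3]` variable; inside gdb one would pass
# gdb.parse_and_eval("my_array") instead.
fake_array = [[1, 2, 3], [4, 5, 6]]
nested = value_to_nested_list(fake_array, (2, 3))
print(nested)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

The nested list can then be passed to `numpy.array` to obtain an array of the same shape.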

This covers what we need to extend the accompanying code.

## Extending the code

The module can be extended to accommodate custom container types. This involves deriving from the class `DeRefBase` in the module `deref`. Suppose we have a user-defined matrix type and we want to extend the module to work with it:

C++
```
template <typename T>
class MyMatrix
{
public:
    ....
    //Index operator.
    T& operator()(int i, int j){ return data[i*columns+j]; }
    const T& operator()(int i, int j) const { return data[i*columns+j]; }
private:
    //Underlying data
    T* data;
    //Number of rows
    int rows;
    //Number of columns
    int columns;
};
```

The type stores its underlying data in the member `data` so that if `M` is an instance of `MyMatrix`, then `M(i,j)` is given by `*(M.data+i*M.columns+j)`.
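This is the usual row-major layout. As a plain Python illustration (the buffer, sizes and function names below are made up for this sketch), the mapping between `(i, j)` and the flat offset, and its inverse, look like this:

```python
rows, columns = 2, 3
data = [10, 11, 12, 20, 21, 22]  # flat buffer, stored row by row


def flat_offset(i, j, columns):
    """Offset of element (i, j) in a row-major buffer."""
    return i * columns + j


def unflatten(offset, columns):
    """Inverse mapping: recover (i, j) from a flat offset."""
    return divmod(offset, columns)


print(data[flat_offset(1, 2, columns)])  # 22
print(unflatten(5, columns))             # (1, 2)
```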

First we need to override the `deref` method in `DeRefBase`, which is used for dereferencing the container. This is done by dereferencing the member `data` of the `MyMatrix` instance.

```
#Converts a MyMatrix instance named Mat to a gdb.Value instance in python.
(gdb) py Mat = gdb.parse_and_eval("Mat")
#Gets a gdb.Value instance that corresponds to Mat.data. (Even though data is a private member)
(gdb) py
> data = Mat['data']
> columns = int(Mat['columns']) #Gets the number of columns and casts it to an integer
> print(data[i*columns+j]) #Gives Mat(i,j) for Python integers i and j
> end
```

The `deref` function is then:

```
def deref(self, val, indices):
    data = val['data']
    columns = int(val['columns'])
    return data[indices[0] * columns + indices[1]]
```

So for example, the following dereferences our matrix:

```
#derefMyMat is an instance of the appropriate DeRef class
(gdb) py print(derefMyMat.deref(Mat,(i,j))) #Gives Mat(i,j)
```

Note that as with `gdb_numpy.to_array`, the method expects a `tuple` or `list`.
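Because `deref` only indexes its argument, it can be exercised outside gdb with a stand-in object. In the sketch below, a plain dictionary plays the role of the `gdb.Value` for a 2x3 `MyMatrix`. The dictionary and its contents are assumptions made for illustration, and `deref` is written as a free function rather than a method:

```python
def deref(val, indices):
    """Same logic as the deref method above, without `self`."""
    data = val['data']
    columns = int(val['columns'])
    return data[indices[0] * columns + indices[1]]


# Stand-in for gdb.parse_and_eval("Mat"): a 2x3 matrix stored row by row.
fake_mat = {'data': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 'rows': 2, 'columns': 3}

# Tuples and lists are both accepted as indices.
from_tuple = deref(fake_mat, (1, 0))
from_list = deref(fake_mat, [1, 0])

# Visiting every valid index pair recovers the full matrix.
rows, columns = fake_mat['rows'], fake_mat['columns']
elements = [[deref(fake_mat, (i, j)) for j in range(columns)]
            for i in range(rows)]
print(elements)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```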

Next we need to update and initialize some members of the class. The member `bounds` stores the dimensions of the matrix.

```
#Constructor expected to take 3 variables:
#Mat: gdb.Value instance that represents the matrix
#shape_ind: An integer for internal bookkeeping purpose.
#shape: A tuple or list, for internal use.
def __init__(self, Mat, shape_ind, shape):
    ...
    self.bounds=[Mat['rows'], Mat['columns']]
    ...
```

Here the dimensions of the matrix are obtained from the matrix instance.

If, on the other hand, the dimensions are provided by the user, as in the case of pointers, then the `_get_range_from_shape` method should be used to extract the dimensions from the argument `shape`.

`self._get_range_from_shape(2) #'2' here is the number of dimensions to extract.`

This will correctly initialize the members `shape_ind` and `bounds`.

The other class member that needs updating is `val`. This should be a `gdb.Value` instance that corresponds to an object after dereferencing.

`self.val = self.deref(Mat,(0,0))`

As the value of `self.val` will not be used, it does not matter which indices we pass to the `deref` method, as long as they are valid. For example, we can also use:

```
self.val = self.deref(Mat, (self.bounds[0]-1,
                            self.bounds[1]-1)) #Works as long as the indices are valid.
```

Finally, we need to provide a regular expression to identify our class. This should be something that matches the type name of our class, which can be accessed through the `type` member of the corresponding `gdb.Value` instance.

```
(gdb) py my_mat = gdb.parse_and_eval("my_mat")
(gdb) py print(my_mat.type)
MyMatrix
```

So in our case, the pattern can be `^MyMatrix`.

```
class DeRefMyMatrix(DeRefBase):
    pattern = re.compile('^MyMatrix')
    ....
```
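Since `^MyMatrix` only anchors the start of the name, it also matches unrelated types whose names merely begin with `MyMatrix`. If that matters, a stricter pattern can require the template argument list. The sketch below compares the two on a few type names. The names, including `MyMatrixView`, are hypothetical examples, and the stricter pattern assumes that template instantiations print as `MyMatrix<...>`:

```python
import re

loose = re.compile('^MyMatrix')
strict = re.compile(r'^MyMatrix<.*>$')

type_names = [
    'MyMatrix<double>',
    'MyMatrix<std::vector<double> >',
    'MyMatrixView<double>',  # hypothetical unrelated type
    'std::vector<double>',
]

# For each name, record whether the loose and strict patterns match.
results = {name: (bool(loose.match(name)), bool(strict.match(name)))
           for name in type_names}

for name in type_names:
    print(name, results[name])
```

Only the loose pattern accepts `MyMatrixView<double>`. Neither pattern accepts `std::vector<double>`.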

Summarizing, the python class that we need to write is:

```
import re #Needed for re.compile below, if not already imported.

class DeRefMyMatrix(DeRefBase):

    pattern = re.compile('^MyMatrix')

    def __init__(self, Mat, shape_ind, shape):
        super(DeRefMyMatrix, self).__init__(Mat, shape_ind, shape)
        self.val = self.deref(Mat, [0,0]) #Updates to a dereferenced type
        self.bounds=[Mat['rows'], Mat['columns']] #The dimensions of the matrix

    def deref(self, val, indices):
        data = val['data']
        columns = int(val['columns'])
        return data[indices[0] * columns + indices[1]]
```

To use this class in the `gdb_numpy` module, we need to register it by adding it to the `_container_list` variable in the module.

`_container_list = [... ,deref.DeRefMyMatrix]`

The `gdb_numpy.to_array` method can now be used with our `MyMatrix` class. It also supports nested types automatically. For example, the following variables will all work with `gdb_numpy.to_array`:

C++
```
MyMatrix<MyMatrix<double> > tensor4D = ...;
std::vector<MyMatrix<double> > tensor3D = ...;
MyMatrix<std::vector<double> > anotherTensor3D = ...;
```

## History

• Initial submission: 13/10/13.

Written By
United Kingdom
The author works in a manufacturing company specializing in machine learning algorithms development. At work, I use Python and Matlab for testing and developing algorithms and use C/C++ for product integration.