An Image Processing Application in C++

Darryl Bryk

Jun 24, 2014

CPOL

Code is described for a multi-document interface (MDI) image processing application utilizing the CImage class in C++

Download source - 255.1 KB

Introduction

Basic image processing functions manipulate pixels either with filters or with histogram-based functions that modify the pixel distribution. These can enhance the image’s display in various ways or remove noise. This article describes C++ code that was developed for an MFC multi-document interface (MDI) image processing application.

Background

The application, called Imagr (spelled without an “e” for historical reasons), is built as an MFC multi-document interface application. It uses the Microsoft ATL CImage class, which has built-in support for opening and saving images in the most popular formats (bmp, jpg, tif, gif, and png). Imagr can also open pcx images (thanks to Roger Evans for the original code), certain ASCII text images, and certain “raw” binary images. The member function CImage::Load reads the image file into a bottom-up device-independent bitmap (DIB), with the origin at the lower-left corner of the image. This is inconvenient for accessing pixels, so the image is converted into a top-down DIB (origin at the top-left corner) with 32 bits per pixel, since most display adapters are now true color. See the Convert32Bit() code.

Once in top-down mode, pixel manipulations are made much simpler. CImage::GetBits() can be used to return a pointer to the first pixel (top-left pixel) and then subsequent pixels are accessed by simply incrementing the pointer in a single for-loop. Images are categorized into one of three pixel types: grey scale, color, and integer (sometimes called raw, these were derived from a special data acquisition process). Images that are only grey scale are handled faster than color images which have three color channels. Raw integer images can have more bit depth for processing, but must be reduced to 8 bit (values 0…255) to be displayed.
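The single-loop access pattern described above can be sketched without any Windows dependencies. In this minimal sketch, Image32 and InvertInPlace are hypothetical names standing in for the buffer returned by CImage::GetBits(); they are not part of Imagr. It assumes 32-bit pixels stored top-down with no padding between rows, which holds for 32-bit DIBs because every row is already a multiple of four bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a top-down, 32-bit-per-pixel image buffer.
struct Image32 {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;  // width * height entries, row after row
};

// Invert the three color bytes of every pixel using one loop over the buffer.
void InvertInPlace(Image32& img)
{
    assert(img.pixels.size() ==
           static_cast<std::size_t>(img.width) * static_cast<std::size_t>(img.height));

    std::uint32_t* p = img.pixels.data();       // First (top-left) pixel
    for (std::size_t n = img.pixels.size(); n > 0; --n, ++p) {
        // Keep the top byte, flip the blue, green, and red bytes.
        *p = (*p & 0xFF000000u) | (~*p & 0x00FFFFFFu);
    }
}
```

Because the rows are contiguous, no row or column bookkeeping is needed for operations that treat every pixel independently.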

An important point is that the pixel bytes in a CImage bitmap are ordered differently from what the usual Windows color macros expect. Instead of RGB, the order is actually BGR (blue, green, red), so if you use the typical GetRValue() macro to get the red bits, for example, it will return the blue bits instead. Likewise, if you store the red, green, and blue values with the RGB(red, green, blue) macro, the red and blue will be swapped on display. The green bits are unaffected; only the blue and red are flipped. So in the code (see ImagrDoc.h), the following definitions are used to help keep the colors straight.

#define RED(rgb)    (LOBYTE((rgb) >> 16))
#define GRN(rgb)    (LOBYTE(((WORD)(rgb)) >> 8))
#define BLU(rgb)    (LOBYTE(rgb))
#define BGR(b,g,r)  RGB(b,g,r)

So, for example, RED(p) will return the red bits as expected from the passed pixel p, and BGR(b, g, r) stores the red, green, and blue bytes into the 32 bit pixel for correct display.
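For readers without the Windows headers at hand, the byte layout these macros rely on can be sketched with portable functions. The names Red, Green, Blue, PackBGR, and CheckRoundTrip below are hypothetical equivalents written for illustration; they are not part of ImagrDoc.h.

```cpp
#include <cassert>
#include <cstdint>

// Blue sits in bits 0..7, green in bits 8..15, red in bits 16..23.
constexpr std::uint8_t Red(std::uint32_t p)   { return static_cast<std::uint8_t>(p >> 16); }
constexpr std::uint8_t Green(std::uint32_t p) { return static_cast<std::uint8_t>(p >> 8); }
constexpr std::uint8_t Blue(std::uint32_t p)  { return static_cast<std::uint8_t>(p); }

// Pack three bytes into one 32-bit pixel in the BGR layout.
constexpr std::uint32_t PackBGR(std::uint8_t b, std::uint8_t g, std::uint8_t r)
{
    return static_cast<std::uint32_t>(b)
         | (static_cast<std::uint32_t>(g) << 8)
         | (static_cast<std::uint32_t>(r) << 16);
}

static_assert(Red(PackBGR(10, 20, 30)) == 30, "red is read back from bits 16..23");

// Runtime round-trip check: unpacking must return what was packed.
inline void CheckRoundTrip(std::uint8_t b, std::uint8_t g, std::uint8_t r)
{
    const std::uint32_t p = PackBGR(b, g, r);
    assert(Blue(p) == b && Green(p) == g && Red(p) == r);
}
```

For example, PackBGR(0x11, 0x22, 0x33) yields 0x00332211, so the red byte 0x33 ends up in the third byte of the pixel.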

MDI Advantages

Designing Imagr as an MDI application makes it possible to compare images or do two-image operations (as explained below). In addition, it’s important to be able to make a copy of an image, so that, for example, you can save the state of a filter operation before trying other filters, or do a two-image operation with a copy of the image or a processed copy of the image. The MDI framework also handles some important background chores. With a call to SetModifiedFlag(), the image’s modified state is maintained, so that if the image is closed, the framework will automatically prompt the user to save it first. The framework also enables files to be dragged and dropped into the application, and tracks which image becomes the active one when the user clicks on it.

Using the Code

The application’s code is included as a complete project for use with Microsoft Visual Studio 2008 or 2010.

Histogram Functions

In the context of the current application, a histogram is a graphical representation of the distribution of pixel values in an image. A count of the number of pixels at each value is maintained in a 256-element array (or 256 x 3 for color images) and graphed in a dialog window. (Note: In the Imagr code, you will see that pixels < 0 and > 255 are also shown in the histogram in order to handle the raw type of full integer imagery.) A grey scale image and its associated histogram are shown below. As can be seen in the histogram, the majority of pixels in this image lie in the region 32…112.
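The counting step behind such a histogram can be sketched as follows. This is a minimal illustration, not the Imagr code: ChannelHistogram is a hypothetical name, the pixels are assumed to be 32-bit values in the BGR layout described earlier, and the handling of values outside 0…255 is left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Count how many pixels take each 8-bit value in one channel.
// channel: 0 = blue, 1 = green, 2 = red (for grey pixels all three are equal).
std::array<unsigned long, 256> ChannelHistogram(const std::vector<std::uint32_t>& pixels,
                                                int channel)
{
    assert(channel >= 0 && channel < 3);

    std::array<unsigned long, 256> hist{};          // All bins start at zero
    for (std::uint32_t p : pixels)
        ++hist[(p >> (8 * channel)) & 0xFFu];       // Isolate the channel byte
    return hist;
}
```

Calling the function once per channel gives the 256 x 3 table used for color images; a grey scale image needs only one call.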

Fig. 1 - Grey scale image with histogram

The histogram of a color image (below) shows the red, green, and blue channel distributions.

Fig. 2 - Full color image histogram

With a normalization function (also known as contrast stretching), the histogram curve can be stretched or compressed to the desired range. Usually, this is done to expand the pixel range to the full range of intensities (0 to 255), giving a more evenly contrasted image. The figure below shows an example of a low contrast image before and after normalizing, with the associated histograms. Normalization effectively rescales the histogram along the intensity axis without appreciably altering the shape of the histogram curve.

Fig. 3 - Image with histograms before (top) and after normalizing (bottom)

The normalize algorithm that operates on every pixel in the image is:

pixel = (pixel - min)*(nmax - nmin) / (max - min) + nmin

where max and min are the starting maximum and minimum pixel values in the image, and nmax and nmin are the new maximum and new minimum pixel values chosen to normalize to.
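As a worked example of the formula, the per-pixel computation can be sketched on its own. NormalizePixel is a hypothetical helper written for illustration; it assumes max > min and rounds to the nearest integer by adding 0.5 before truncating.

```cpp
#include <cassert>

// pixel' = (pixel - min) * (nmax - nmin) / (max - min) + nmin, rounded to nearest.
// Assumes max > min and pixel >= min, so the value being truncated is non-negative.
int NormalizePixel(int pixel, int min, int max, int nmin, int nmax)
{
    assert(max > min);
    const double factor = static_cast<double>(nmax - nmin) / (max - min);
    return static_cast<int>((pixel - min) * factor + nmin + 0.5);
}
```

Stretching an image whose pixels span 32 to 112 onto the full range 0 to 255 gives a factor of 255 / 80 = 3.1875. A pixel of 32 maps to 0, a pixel of 112 maps to 255, and a mid-range pixel of 72 maps to 40 * 3.1875 = 127.5, which rounds to 128.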

The normalization code is shown below for handling the three types of image pixels (grey scale, color, and “raw” integers). The RED() macro is called in the grey scale code to isolate a single byte of the integer pixel, which is not really red in this case since all three color bytes of a grey pixel are equal (for clarity, I should probably have made a GREY() macro which does the same thing). This function is called from a dual slider class (thanks to includeh10, CodeProject article “A Slider with Two Buttons”, 9 Aug 2006). The normalize function is tied to the dual sliders so the image can be normalized in near real-time as you adjust the sliders (depending on the image size and system speed). The image is copied to an “undo” buffer before each slider adjustment so that the function operates on the same starting image each time the slider is changed. The sliders control the nmin and nmax variables passed to the function.

/*----------------------------------------------------------------------
  This function normalizes the bitmap to the passed range. 
  (For calling interactively from the slider dialog.)
----------------------------------------------------------------------*/
void CImagrDoc::Nrmlz(int nmin, int nmax)
{
    int d;
    float factor;
    byte r = 0, g = 0, b = 0;

    OnDo();        // Save prev. image for Undo

    int *min = &(m_image.minmax.min);
    int *max = &(m_image.minmax.max);

    if (*max - *min == 0)
        factor = 32767.;    // Avoid div. by 0
    else
        factor = (float)((float)(nmax - nmin) / (*max - *min));
    
    int *p = (int *) m_image.GetBits();    // Ptr to bitmap
    unsigned long n = GetImageSize();

    switch (m_image.ptype) {
        case GREY:    // Grey scale pixels
            for ( ; n > 0; n--, p++) {
                r = RED(*p);
                d = (int)((float)(r - *min) * factor + nmin + 0.5);
                r = (byte)THRESH(d);
                *p = BGR(r, r, r);
            }
            break;
        case cRGB:    // RGB color pixels
            for ( ; n > 0; n--, p++) {
                r = RED(*p);
                d = (int)((float)(r - *min) * factor + nmin + 0.5);
                r = (byte)THRESH(d);

                g = GRN(*p);
                d = (int)((float)(g - *min) * factor + nmin + 0.5);
                g = (byte)THRESH(d);

                b = BLU(*p);
                d = (int)((float)(b - *min) * factor + nmin + 0.5);
                b = (byte)THRESH(d);
                
                *p = BGR(b, g, r);
            }
            break;
        default:    // INTG
            for ( ; n > 0; n--, p++) {
                d = (int)((float)(*p - *min) * factor + nmin + 0.5);
                r = (byte)THRESH(d);    // Limit to byte range, as in the other cases
                *p = BGR(r, r, r);
            }
            m_image.ptype = GREY;    // Changed type
            break;
    }

    *min = nmin;
    *max = nmax;
    UpdateAllViews(NULL, ID_SBR_IMAGEMINMAX);    
}

Note that this normalization is restricted to operating on the minimum to maximum pixel range. Sometimes, one may want to expand or contract a narrow range of the histogram. Imagr includes this functionality in the NrmlzRange() function with the following algorithm:

pixel = (pixel - rmin)*255 / (rmax - rmin)

where rmin to rmax is the chosen range of pixel values. A dual slider is also used to adjust the rmin and rmax variables. This allows one to select a range of the histogram to be normalized to the range of 0 to 255. This process may force pixel values to less than 0 or greater than 255, which is outside the displayable range. Pixels with values outside the 0 to 255 range are clamped to that range so they can be displayed: any pixels < 0 are set equal to 0, and any pixels > 255 are set equal to 255. The image and histogram below demonstrate this. The image has higher contrast over its full range, at the cost of the clamping done at the outer ends of the histogram.
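The range formula and the limiting step can be sketched together. ClampToByte and NormalizeRangePixel are hypothetical names for illustration and are not the NrmlzRange() code; rounding to the nearest integer is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Limit a value to the displayable range 0..255.
int ClampToByte(int v)
{
    return std::min(255, std::max(0, v));
}

// pixel' = (pixel - rmin) * 255 / (rmax - rmin), rounded, then limited to 0..255.
int NormalizeRangePixel(int pixel, int rmin, int rmax)
{
    assert(rmax > rmin);
    const double scaled = (pixel - rmin) * 255.0 / (rmax - rmin);
    // floor(x + 0.5) rounds to nearest and also behaves for negative values.
    return ClampToByte(static_cast<int>(std::floor(scaled + 0.5)));
}
```

With rmin = 64 and rmax = 192, a pixel of 128 maps to 128, while a pixel of 32 would come out at about -64 and is set to 0, and a pixel of 250 would come out at about 371 and is set to 255.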

Fig. 4 - Range normalization

Another popular histogram function “equalizes” or flattens the pixel distribution giving more equal contrast among the entire range of pixels as shown below (thanks to Frank Hoogterp and Steven Caito for the original Fortran code). As can be seen, equalizing may make the image “blotchy” and effectively lose resolution, but may be useful for some images. Equalize uses the histogram array to redistribute the pixels. (See Eqliz() code for details).
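For readers unfamiliar with equalization, the following is a sketch of the common textbook approach based on the cumulative histogram. It is not the Eqliz() code, which may differ in its details; EqualizeGrey is a hypothetical name and the input is assumed to be a plain array of 8-bit grey levels.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Textbook histogram equalization of 8-bit grey levels, done in place.
void EqualizeGrey(std::vector<std::uint8_t>& grey)
{
    if (grey.empty())
        return;

    std::array<unsigned long, 256> hist{};
    for (std::uint8_t v : grey)
        ++hist[v];

    // Cumulative counts; cdfMin is the cumulative count at the first occupied bin.
    std::array<unsigned long, 256> cdf{};
    unsigned long running = 0, cdfMin = 0;
    for (int i = 0; i < 256; ++i) {
        running += hist[i];
        cdf[i] = running;
        if (cdfMin == 0 && running != 0)
            cdfMin = running;
    }

    const unsigned long total = static_cast<unsigned long>(grey.size());
    assert(total >= cdfMin);
    if (total == cdfMin)
        return;                 // Only one grey level present, nothing to spread

    // Map each level to its scaled cumulative rank, rounded to nearest.
    for (std::uint8_t& v : grey) {
        v = static_cast<std::uint8_t>(
            (cdf[v] - cdfMin) * 255.0 / (total - cdfMin) + 0.5);
    }
}
```

For the four pixels 10, 10, 20, 30 the cumulative counts are 2, 3, and 4, so the levels become 0, 0, 128, and 255. Levels that were close together are pushed apart, which is also why equalized images can look blotchy.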

Fig. 5 - Equalization

Imagr also has a threshold menu function tied to the dual slider so it can be used to limit the range of pixels in the histogram and observe the results in near real-time. This can be useful for selectively cutting off specified intensities in an image.

Image Processing Filters

There are many filters that can be applied for different image processing functions. The convolve function (Convl.cpp) is used to apply 3 x 3 kernel (matrix) filters to the image. Imagr currently has menu options for many types of filters including: low pass, high pass, Sobel, Prewitt, Frei-Chen, various edge enhancing and Laplacian filters, emboss filters, and a kernel input dialog so the user can experiment with their own 3 x 3 kernel. Details of what each filter does are widely available on the Internet, so they are not discussed here. Some of these filters were applied to the image from Fig. 1, as shown below.

Fig. 6 - High pass

Fig. 7 - Sobel

Fig. 8 - Edge enhance

Fig. 9 - Emboss

Fig. 10 - Laplacian sharp

The convolve equation looks like this:

$$P_5 = \frac{\sum_{i=1}^{9} K_i P_i}{\sum_{i=1}^{9} K_i}$$

where P₅ = center of a 3 x 3 pixel region, P_i = each of the nine pixels, and K_i = each of the nine kernel values. The middle pixel is replaced by the sum of the products of each pixel in its 3 x 3 neighborhood (including itself) and the respective kernel value, divided by the sum of the kernel values. If the kernel values sum to zero, the code divides by 1 instead to avoid a division by zero. This operation is applied to every pixel in the image.

The high pass kernel looks like this:

-1.0, -1.0, -1.0
-1.0,  9.0, -1.0
-1.0, -1.0, -1.0

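This basically sums the negated values of the eight surrounding pixels and adds the center pixel weighted by 9. Because the kernel values sum to 1, flat regions are left unchanged while differences between a pixel and its neighbors are amplified. See ImagrDoc.h for the other kernels used in Imagr.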
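To make the arithmetic concrete, here is a sketch of the convolve equation applied to a single 3 x 3 neighborhood. ConvolvePixel is a hypothetical helper for illustration only; both arrays are assumed to be in row order, and a kernel sum of zero is replaced by 1, as in the Convl() code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Weighted sum of a 3 x 3 neighborhood divided by the sum of the kernel values.
float ConvolvePixel(const std::array<float, 9>& kernel,
                    const std::array<float, 9>& pixels)
{
    float sum = 0.0f, ksum = 0.0f;
    for (std::size_t i = 0; i < 9; ++i) {
        sum  += kernel[i] * pixels[i];
        ksum += kernel[i];
    }
    return (ksum == 0.0f) ? sum : sum / ksum;   // Avoid dividing by zero
}
```

With the high pass kernel, a flat neighborhood of nine pixels at 100 gives -800 + 900 = 100, so nothing changes. If the center pixel is 120 and its eight neighbors are 100, the result is -800 + 1080 = 280, which is then limited to 255 for display.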

A portion of the convolution filter code is shown below, but with just the grey scale section for brevity (see Convl.cpp for full code). The 3 x 3 kernels are passed to the function.

/*----------------------------------------------------------------------
  This function performs a 3 x 3 convolution on the active image. The 
  kernel array is passed externally. Edges are added (doubly weighted)
  for the computation. 
----------------------------------------------------------------------*/
void CImagrDoc::Convl(float k1, float k2, float k3,
                       float k4, float k5, float k6,
                       float k7, float k8, float k9)
{
    int *p;                        /* Image ptr */
    unsigned long i, j, nx, ny;
    int *m1, *m2, *m3;            // Pointers to buffers to free()
    int *old_r1, *r1, *r2, *r3; /* Cycling pointers to rows */
    float s, fsum;
    int t;
    byte r, g, b;

    nx = m_image.GetWidth();
    ny = m_image.GetHeight();
    p = (int *) m_image.GetBits();    // Ptr to bitmap

    /* Allocate row buffers */
    if (!(m1 = (int *) malloc((nx+2) * sizeof(*m1)))) {
        fMessageBox("Error - " __FUNCTION__, MB_ICONERROR, "malloc() error m1");
        return;
    }
    if (!(m2 = (int *) malloc((nx+2) * sizeof(*m2)))) {
        fMessageBox("Error - " __FUNCTION__, MB_ICONERROR, "malloc() error m2");
        free(m1);
        return;
    }
    if (!(m3 = (int *) malloc((nx+2) * sizeof(*m3)))) {
        fMessageBox("Error - " __FUNCTION__, MB_ICONERROR, "malloc() error m3");
        free(m1);
        free(m2);
        return;
    }
    r1 = m1;
    r2 = m2;
    r3 = m3;

    // Initialize rows
    memcpy_s(&r1[1], nx * sizeof(int), p, nx * sizeof(int));
    r1[0] = r1[1];                      /* Doubly weight edges */
    r1[nx+1] = r1[nx];

    /* Start r2 = r1 (doubly weight 1st row) */
    memcpy_s(r2, (nx+2) * sizeof(int), r1, (nx+2) * sizeof(int));

    // Calc. sum of kernel
    fsum = k1 + k2 + k3 + k4 + k5 + k6 + k7 + k8 + k9;
    if (fsum == 0) 
        fsum = 1;            // Avoid div. by 0
    else
        fsum = 1/fsum;        // Invert so can mult. 

    OnDo();        // Save image for Undo

    BeginWaitCursor(); 
    switch (m_image.ptype) {
        case GREY:
            for (j = 1; j <= ny; j++, p += nx) {
                if (j == ny) {                /* Last row */
                    r3 = r2;                /* Last row doubly weighted */
                }
                else {     /* Read next row (into the 3rd row) */
                    memcpy_s(&r3[1], nx * sizeof(int), p + nx, nx * sizeof(int));
                    r3[0] = r3[1];            /* Doubly weight edges */
                    r3[nx+1] = r3[nx];
                }

                for (i = 0; i < nx; i++) {
                    s = k1 * (float)RED(r1[i]) 
                      + k2 * (float)RED(r1[i+1])
                      + k3 * (float)RED(r1[i+2]) 
                      + k4 * (float)RED(r2[i])
                      + k5 * (float)RED(r2[i+1])
                      + k6 * (float)RED(r2[i+2])
                      + k7 * (float)RED(r3[i])
                      + k8 * (float)RED(r3[i+1])
                      + k9 * (float)RED(r3[i+2]);

                    t = NINT(s * fsum);
                    r = (byte)THRESH(t);

                    p[i] = BGR(r, r, r);
                }

                /* Cycle row pointers */
                old_r1 = r1;    // To save addr. for r3
                r1 = r2;
                r2 = r3;
                r3 = old_r1;
            }
            break;
    }
    EndWaitCursor();

    free(m1);                   
    free(m2);
    free(m3);                

    ChkData();                // Re-check range
    SetModifiedFlag(true);    // Set flag
    UpdateAllViews(NULL);    // Still needed even though called by ChkData()
}

The convolution function code (thanks again to Frank Hoogterp and Steven Caito for the original Fortran code) accesses three rows of the image at a time by storing them in three arrays. Pointers to the row arrays are maintained for ease in shifting rows up as each new row is loaded in from the image. For example, when moving on to a new row of the image, the pointers r1 and r2 are set to point to the arrays previously referenced by r2 and r3, respectively. Only the third row then needs to be loaded with pixels from the image; it is read into the array that r1 previously pointed to, which becomes the new r3.

The edge rows and columns are handled by doubly weighting them. For example, when operating on a pixel in the 1st row, the 1st row is copied into the 2nd row array in order to still have 3 rows (the 3rd row array contains the 2nd row) for processing. So it’s as if the 1st row was copied above the actual 1st row. Vertical edges (columns) are handled similarly by replicating an additional pixel at the beginning and end of the row.
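The same edge treatment can be sketched in a simpler, if less memory-efficient, way. Instead of three cycling row buffers, the hypothetical Convolve3x3 function below writes to a separate output image and clamps the neighbor coordinates, which has the same effect as replicating the border rows and columns. It assumes a plain array of 8-bit grey pixels rather than the 32-bit pixels used in Imagr.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// 3 x 3 convolution of an nx-by-ny grey image with replicated edges.
std::vector<std::uint8_t> Convolve3x3(const std::vector<std::uint8_t>& src,
                                      int nx, int ny,
                                      const std::array<float, 9>& k)
{
    assert(nx > 0 && ny > 0);
    assert(src.size() == static_cast<std::size_t>(nx) * static_cast<std::size_t>(ny));

    float ksum = 0.0f;
    for (float v : k)
        ksum += v;
    const float scale = (ksum == 0.0f) ? 1.0f : 1.0f / ksum;   // Avoid div. by 0

    std::vector<std::uint8_t> dst(src.size());
    for (int y = 0; y < ny; ++y) {
        for (int x = 0; x < nx; ++x) {
            float s = 0.0f;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    // Clamping the coordinates replicates the border pixels.
                    const int yy = std::min(ny - 1, std::max(0, y + dy));
                    const int xx = std::min(nx - 1, std::max(0, x + dx));
                    s += k[(dy + 1) * 3 + (dx + 1)]
                       * src[static_cast<std::size_t>(yy) * nx + xx];
                }
            }
            const int t = static_cast<int>(std::lround(s * scale));
            dst[static_cast<std::size_t>(y) * nx + x] =
                static_cast<std::uint8_t>(std::min(255, std::max(0, t)));
        }
    }
    return dst;
}
```

The row-buffer approach in Convl() needs only three extra rows of memory and works in place, which matters for large images; the version here trades that for shorter code.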

Two-Image Processing

The two-image functions take two images as input and produce a third image. Some of the operations are: add, subtract, multiply, divide, average, minimum, maximum, and the logical bit-wise functions OR, AND, and XOR. For example, the add function takes a pixel at position (x, y) from image A, adds it to the corresponding pixel (x, y) in image B, and stores the sum in the corresponding pixel (x, y) in image C. This operation is done on every pixel in the image. The current version of Imagr only allows same size images for these operations. The subtraction operation yields an image made up of only the differences between the two input images. Subtraction is sometimes done after an edge enhancement to show edge effects overlaid onto the original image. The two figures below show the effects of addition and subtraction on the same pair of images.
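A pixel-by-pixel two-image operation can be sketched as below. TwoImageOp and Combine are hypothetical names, only some of the operations are shown, and limiting the result to 0…255 is an assumption of this sketch; Imagr itself may keep out-of-range results as integer pixels instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class TwoImageOp { Add, Subtract, Average, Minimum, Maximum, And, Or, Xor };

// Combine images A and B (same size, 8-bit grey) into a new image C.
std::vector<std::uint8_t> Combine(const std::vector<std::uint8_t>& a,
                                  const std::vector<std::uint8_t>& b,
                                  TwoImageOp op)
{
    assert(a.size() == b.size());       // Only same size images are supported

    std::vector<std::uint8_t> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int pa = a[i], pb = b[i];
        int r = 0;
        switch (op) {
            case TwoImageOp::Add:      r = pa + pb;           break;
            case TwoImageOp::Subtract: r = pa - pb;           break;
            case TwoImageOp::Average:  r = (pa + pb) / 2;     break;
            case TwoImageOp::Minimum:  r = std::min(pa, pb);  break;
            case TwoImageOp::Maximum:  r = std::max(pa, pb);  break;
            case TwoImageOp::And:      r = pa & pb;           break;
            case TwoImageOp::Or:       r = pa | pb;           break;
            case TwoImageOp::Xor:      r = pa ^ pb;           break;
        }
        c[i] = static_cast<std::uint8_t>(std::min(255, std::max(0, r)));
    }
    return c;
}
```

Because the images are stored as flat arrays of the same size, the pixel at (x, y) in A lines up with the pixel at the same index in B, so a single loop covers the whole image.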

Fig. 11 - Image addition

Fig. 12 - Image subtraction

Undo Stack

As mentioned briefly above, Imagr also has undo capability. This is very important for an image processing application, since many functions may do undesirable things to an image and it’s important to have an easy way to return to the previous state and try other things. The OnDo and UnDo functions (in file Undo.cpp) push and pop, respectively, the image from a memory stack, implemented by a linked list. When a change is going to be made to an image, OnDo is called first to save its current state.

The Undo data structure looks like this:

struct Undo_type {        // Linked list of image buffers for undo
    int *p;                // Ptr to pixel buffer
    BOOL mod;            // Doc. modified status
    int ptype;            // Pixel type needed in case changed
    char hint[80];        // OnDo() caller's hint for OnUndo()
    Undo_type *next;    // Ptr to next node
};

The variable “mod” maintains the image’s saved state, so when popping the image off the stack, a call to SetModifiedFlag() will restore the “saved” state. The “hint” string’s purpose is to keep track of certain operations that may be reversed without requiring a complete memory save of the image. For example, if an image is rotated 90 degrees, a hint could be used by the UnDo function so that it knows just to do a rotate of -90 degrees to restore the image. In that case, the image pixels don’t all need to be saved in memory. Although this hint-based undo isn’t currently implemented in Imagr, it could provide faster operation and save memory.
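The push and pop behavior can be sketched in a few lines. This sketch deliberately swaps the hand-built linked list for a std::vector used as a stack, and Snapshot and UndoStack are hypothetical names; it is not the code in Undo.cpp, and the hint string is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One saved image state, mirroring the fields of Undo_type (minus the hint).
struct Snapshot {
    std::vector<std::uint32_t> pixels;  // Copy of the pixel buffer
    bool modified;                      // Doc. modified status at save time
    int ptype;                          // Pixel type, in case it changes
};

class UndoStack {
public:
    // Save the current state before an operation changes the image.
    void Push(const std::vector<std::uint32_t>& pixels, bool modified, int ptype)
    {
        stack_.push_back(Snapshot{pixels, modified, ptype});
    }

    bool Empty() const { return stack_.empty(); }

    // Return the most recently saved state and remove it from the stack.
    Snapshot Pop()
    {
        assert(!stack_.empty());
        Snapshot top = std::move(stack_.back());
        stack_.pop_back();
        return top;
    }

private:
    std::vector<Snapshot> stack_;
};
```

A caller would push before each modification and, on undo, pop the snapshot, copy its pixels back into the image, and pass its modified value to SetModifiedFlag().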

As mentioned above, the undo capability is used by some functions that utilize the slider dialogs to quickly OnDo (push) and UnDo (pop) the image as the sliders are manipulated. This gives an interactive capability to see in near real-time what the slider operation does to the image.

Conclusion

In conclusion, some basic image processing code for histograms, convolution filters, and two-image operations was discussed. Although there are a lot of image processing applications out there, developing your own can be very worthwhile to have custom functionality. Imagr contains other image processing functions which have not been discussed here. See my previous article in CodeProject “Drawing an Image as a 3-D Surface”, July 2011, which discusses Imagr’s 3D graphing capabilities.