Optical Mark Recognition with DotImage

Lou Franco

May 12, 2009

CPOL

Introduction

Standardized tests use bubble sheets to make grading large numbers of tests easy. Typically, grading is done with a specialized machine that can scan, recognize, and grade the sheets, but with Optical Mark Recognition (OMR) techniques, we can simulate this with a normal scanner and software. This article will take you through the steps of recognizing the marks on standard forms using the image processing functionality in DotImage.

There are five basic steps to OMR.

Design the form
Create a template for the marked areas
Perform 2D transforms on the scanned image to align and size it correctly
Use image processing to accentuate the marks
Find the marks using the template

Design the form

There are many pre-printed OMR forms on the market for you to start from, but the principles are easy if you want to design your own. You have to design a form that is:

Easy for software to quickly align, move and scale so that it can be read
Easy for software to remove uninteresting parts for easier processing

Here’s a simple example (or view/download the template images):

The main thing to do in designing the form is to make it easy to process later. Once we scan the form, we will have to make sure that it is properly aligned and scaled so that the template that we make will match the scanned image.

To help with this, I have placed a half-inch black square in the top margin in the exact center. This will be easy to find later, and all of my coordinates in the template will be based on the location of this square. Other common techniques are to use barcodes or OCR. If you have access to barcode reading software, this is probably the best option since barcodes are designed to be easily recognized and barcode reading software will give you the exact position and size to use in adjusting the document. Barcodes are also more resilient when you cannot guarantee that the scans will be high quality (incidentally, barcode reading is available as an add-on to DotImage at http://www.atalasoft.com/products/dotimage/barcode/).

However, for this example, it’s more instructive to show how you’d go about recognizing the marker yourself – since most preprinted forms use a marker rather than a barcode.

I’ve also used a drop-out color (red) for the bubbles. This will make it easy to completely remove them from the image later, thus making the marks easier to find.

Create a template for the marked areas

The template is simply the locations of areas where you want to find marks. You have to do a couple of things to get this right.

Choose the DPI at which you will scan the image. The actual scans can use a different resolution (you will scale them), but you need to pick one DPI in order to record the pixel coordinates. I will use 150 DPI.
Choose an origin that you can easily locate later. I am going to use the center of my black square.

At 150 DPI, the center of the square is at (637, 110) and the size of the square is 75 X 75.

Each red circle is about 35 X 35, and the first one is at (155, 374). To get to the next bubble to the right, add 66.5 pixels, and to get to the next bubble down, add 40.75 pixels (rounding in both cases).

This template can be represented as a list of locations, but since it is so regular, we can also represent the template with this code:

static Point _markerStandardCenter = new Point(637, 110);
static Size _markerStandardSize = new Size(75, 75);
static Point _firstAnswerStandardLocation = new Point(155, 374);
static Size _firstAnswerStandardSize = new Size(35, 35);
static float _answerXDistance = 66.5f;
static float _answerYDistance = 40.75f;

Rectangle GetAnswerBubbleRect(int row, int col, Point markerCenter, float scale)
{
    // calculate the location of the first answer on the scaled image
    // first scale the standard offset from the standard center, 
    // then add it to the actual center on this image
    PointF firstAnswerPtScaled =
        new PointF(
         scale * (_firstAnswerStandardLocation.X - _markerStandardCenter.X) + 
           markerCenter.X,
         scale * (_firstAnswerStandardLocation.Y - _markerStandardCenter.Y) + 
           markerCenter.Y);

    // the answer bubble that we want is found by using the distance between 
    // the answers, scaled to this image size. The size of the bubble is the 
    // standard size multiplied by the scale.
    return new Rectangle(
        (int)(firstAnswerPtScaled.X + col * _answerXDistance * scale),
        (int)(firstAnswerPtScaled.Y + row * _answerYDistance * scale), 
        (int)(_firstAnswerStandardSize.Width * scale),
        (int)(_firstAnswerStandardSize.Height * scale)
    );
}

When we find the marker, we will determine its center and its scale relative to our standard. Given those two values, this function will tell you the rectangle (on your image) of a given answer bubble.
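
If you would rather store the template as an explicit list of locations, a minimal sketch follows. It assumes the standard 150 DPI coordinates given above and the 26-row by 15-column grid that this sample form uses. TemplateList and Build are names made up for illustration, and the code uses only System.Drawing types, not DotImage.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

static class TemplateList
{
    // Standard (150 DPI) template values taken from the article.
    static readonly Point FirstAnswer = new Point(155, 374);
    static readonly Size BubbleSize = new Size(35, 35);
    const float XDistance = 66.5f;
    const float YDistance = 40.75f;

    // Builds the explicit list of bubble rectangles, row by row.
    // The entry for (row, col) ends up at index row * cols + col.
    public static List<Rectangle> Build(int rows, int cols)
    {
        var bubbles = new List<Rectangle>(rows * cols);
        for (int r = 0; r < rows; ++r)
        {
            for (int c = 0; c < cols; ++c)
            {
                bubbles.Add(new Rectangle(
                    (int)(FirstAnswer.X + c * XDistance),
                    (int)(FirstAnswer.Y + r * YDistance),
                    BubbleSize.Width,
                    BubbleSize.Height));
            }
        }
        return bubbles;
    }
}
```

A list like this is handy when the form is not a regular grid, since each entry can be edited by hand. For a regular grid, the computed version is shorter and scales more easily.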

Perform 2D transforms on the scanned image to align and size it correctly

There are three 2D transformations that we need to determine in order to match up our image to the standard one that the template is based on: rotation, translation and scale. The easiest way to find the rotation is to use a deskew algorithm. A deskew algorithm looks at the image, assumes that most of the lines are meant to be at 0 and 90 degrees, and returns the angle by which the image is skewed from this.
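
To make the idea concrete, here is a rough sketch of one common way to estimate skew, a projection profile search. This is not necessarily what DotImage does internally. The class name, the bool[,] input, the maxAngle and step parameters, and the sign convention are all assumptions made for illustration.

```csharp
using System;

static class SkewSketch
{
    // Tries candidate angles and keeps the one whose row projection is the
    // most sharply peaked (largest sum of squared row counts).
    // dark[y, x] is true for dark pixels. Intended for small angles only.
    public static double EstimateSkew(bool[,] dark, double maxAngle = 5.0, double step = 0.5)
    {
        int height = dark.GetLength(0);
        int width = dark.GetLength(1);
        int steps = (int)Math.Round(maxAngle / step);
        double bestAngle = 0.0;
        long bestScore = -1;

        for (int i = -steps; i <= steps; ++i)
        {
            double angle = i * step;
            double rad = angle * Math.PI / 180.0;
            double sin = Math.Sin(rad), cos = Math.Cos(rad);

            // The rotated row index lies between -width and height + width,
            // so shift it by width to get a valid array index.
            var rowCounts = new int[height + 2 * width];
            for (int y = 0; y < height; ++y)
                for (int x = 0; x < width; ++x)
                    if (dark[y, x])
                        rowCounts[(int)Math.Round(y * cos - x * sin) + width]++;

            // Straight lines that line up with the rows pile into a few bins,
            // which makes the sum of squares large.
            long score = 0;
            foreach (int n in rowCounts)
                score += (long)n * n;

            if (score > bestScore)
            {
                bestScore = score;
                bestAngle = angle;
            }
        }
        return bestAngle;
    }
}
```

In this sketch a positive result means the lines drop as you move to the right (with y increasing downward). A real deskew command may use the opposite sign, so check the convention before rotating.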

In DotImage, we can find the angle of skew with this code:

double GetSkewAngle(AtalaImage img)
{
    AutoDeskewCommand cmd = new AutoDeskewCommand();
    cmd.ApplyToAnyPixelFormat = true;
    AutoDeskewResults res = (AutoDeskewResults)cmd.Apply(img);
    return res.SkewAngle;
}

And rotate an image with this code.

AtalaImage RotateImage(AtalaImage img, double angle)
{
    RotateCommand cmd = new RotateCommand(angle, Color.White);
    return cmd.Apply(img).Image;
}

To deskew, get the angle and rotate the image in the opposite direction:

private AtalaImage Deskew(AtalaImage img)
{
    double angle = GetSkewAngle(img);
    img = RotateImage(img, -angle);
    return img;
}

The next thing you need to do is find the marker. This is an admittedly simplified example – in a real-world scenario, you would want to make it more robust depending on your images. If you couldn’t depend on relatively clean scans, then it would be better to use a barcode or OCR to figure out the true scale and translation of the document.

Here is the code to find a horizontal line segment on a row of an image:

private bool IsDark(Color pixel)
{
    return pixel.GetBrightness() < .05;
}

private bool FindLine(AtalaImage img, int y, int markerSizeThreshold, 
        ref int left, ref int right)
{
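    // reset the outputs so that a dark run left over from an earlier call
    // (for example, one that touched the right edge of the previous row)
    // cannot be mistaken for a line on this row
    left = -1;
    right = -1;

    // loop through each pixel in the row looking 
    // for a line longer than 'markerSizeThreshold'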
    for (int x = 0; x < img.Width; ++x)
    {
        Color pixel = img.GetPixelColor(x, y);
        if (IsDark(pixel))
        {
            if (left == -1)
                left = x;
            right = x;
        }
        else
        {
            if (left != -1 && right - left > markerSizeThreshold)
            {
                return true;
            }
            else
            {
                // wasn't it
                left = -1;
                right = -1;
            }
        }
    }
    return false;
}

This code will be resilient to small specks that are less than markerSizeThreshold long.
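
To see the speck filtering in isolation, here is a small sketch of the same run-finding idea on a plain array of booleans, with no DotImage types involved. RunSketch and FindFirstLongRun are hypothetical names. Unlike FindLine(), this version also reports a run that touches the end of the row.

```csharp
using System;

static class RunSketch
{
    // Returns the first run of consecutive dark pixels whose span
    // (right - left) exceeds minLength, as inclusive indices,
    // or (-1, -1) if the row has no such run.
    public static (int Left, int Right) FindFirstLongRun(bool[] darkRow, int minLength)
    {
        int left = -1;
        // Go one step past the end so a run touching the edge is closed too.
        for (int x = 0; x <= darkRow.Length; ++x)
        {
            bool dark = x < darkRow.Length && darkRow[x];
            if (dark)
            {
                if (left == -1) left = x;
            }
            else if (left != -1)
            {
                int right = x - 1;
                if (right - left > minLength) return (left, right);
                left = -1;   // too short: a speck, keep looking
            }
        }
        return (-1, -1);
    }
}
```

For example, in a 40-pixel row with a two-pixel speck at positions 3 and 4 and a solid run from 20 to 29, a threshold of 5 skips the speck and returns (20, 29).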

Using this, we can find our box by first finding a line-segment and then seeing how many rows have a line-segment in basically the same spot.

private Rectangle FindMarker(AtalaImage img)
{
    int numRowsToSearch = img.Height / 8; // marker is within this area
    int markerSizeThreshold = img.Width / 25; // marker is at least this big
    // eventual marker position
    int top = -1;
    int left = -1;
    int right = -1;
    int bottom = -1;
    
    // try to find the top of the marker, by looping through each row
    for (int y = 0; y < numRowsToSearch; ++y)
    {
        if (FindLine(img, y, markerSizeThreshold, ref left, ref right))
        {
            top = y;
            break;
        }
    }
    if (top == -1)
    {
        throw new Exception("Didn't find marker");
    }

    // find marker extents
    int expectedBottom = top + (right - left) + 10;
    bottom = expectedBottom;
    for (int y = top + 1; y < expectedBottom; ++y)
    {
        int l=-1, r=-1;
        if (FindLine(img, y, markerSizeThreshold, ref l, ref r))
        {
            if (l > right+5 || r < left-5)
            {
                throw new Exception("Marker not found");
            }
            if (l < left) left = l;
            if (r > right) right = r;
        }
        else if (y - top < markerSizeThreshold)
        {
            throw new Exception("Marker not big enough");
        }
        else
        {
            bottom = y;
            break;
        }
    }

    return new Rectangle(left, top, right - left, bottom - top);
}

The rectangle returned by this function gives us the information we need to perform the translation and scale transformations. Since we know the expected size and location of the marker, the ratio of the actual size to the expected size tells us the scale, and the offset from the expected location tells us how to offset our template.

The code to find the scale is:

private float GetImageScale(Rectangle markerActualLocation)
{
    return (((float)markerActualLocation.Width) / 
             ((float)_markerStandardSize.Width) +
           ((float)markerActualLocation.Height) / 
             ((float)_markerStandardSize.Height)) / 2.0f;
}

And the code to find the center of the marker is:

private Point GetMarkerCenter(Rectangle markerActualLocation)
{
    return new Point(
        markerActualLocation.X + markerActualLocation.Width / 2,
        markerActualLocation.Y + markerActualLocation.Height / 2
    );
}

With the scale and position of the center, we now call the GetAnswerBubbleRect() function we wrote earlier.
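
As a worked example, suppose the form was scanned at 300 DPI and the marker was found at (1199, 145) with a size of 150 X 150. These numbers are made up for illustration. The scale comes out to 2.0, the marker center is (1274, 220), and the first bubble lands at (310, 748). The sketch below repeats that arithmetic with System.Drawing types only; ScaleSketch and its members are hypothetical names.

```csharp
using System;
using System.Drawing;

static class ScaleSketch
{
    // Standard (150 DPI) template values taken from the article.
    static readonly Point StandardCenter = new Point(637, 110);
    static readonly Size StandardSize = new Size(75, 75);
    static readonly Point FirstAnswer = new Point(155, 374);

    // Average of the horizontal and vertical size ratios, as in GetImageScale().
    public static float Scale(Rectangle marker) =>
        ((float)marker.Width / StandardSize.Width +
         (float)marker.Height / StandardSize.Height) / 2.0f;

    // Center of the marker as found on the scanned image.
    public static Point Center(Rectangle marker) =>
        new Point(marker.X + marker.Width / 2, marker.Y + marker.Height / 2);

    // Top-left corner of the first bubble (row 0, column 0) on the scanned image:
    // scale the standard offset from the marker center, then add the actual center.
    public static Point FirstBubble(Rectangle marker)
    {
        float s = Scale(marker);
        Point c = Center(marker);
        return new Point(
            (int)(s * (FirstAnswer.X - StandardCenter.X) + c.X),
            (int)(s * (FirstAnswer.Y - StandardCenter.Y) + c.Y));
    }
}
```

Averaging the two ratios smooths out a marker that was measured a pixel or two off in one direction.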

Use image processing to accentuate the marks

When we designed the form, we deliberately used a drop-out color for the answer bubbles. The reason we did that was so that we could easily find and remove them from our image. Here’s the code:

private AtalaImage DropOut(AtalaImage img, Color color)
{
    ReplaceColorCommand cmdDropOutColor = 
        new ReplaceColorCommand(color, Color.White, .2);
    img = cmdDropOutColor.Apply(img).Image;

    ReplaceColorCommand cmdDropOutNearWhite = new 
        ReplaceColorCommand(Color.White, Color.White, .2);
    img = cmdDropOutNearWhite.Apply(img).Image;

    return img;
}

This function gets rid of any pixels that are near the color passed in or nearly white. In DotImage, the ReplaceColorCommand object automatically replaces one color with another – the third argument to the constructor is a tolerance (between 0 and 1) that indicates how close to the color it needs to be to be replaced.
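
To get a feel for what a tolerance of .2 lets through, here is a sketch of a tolerance-based replacement. It assumes that closeness is measured as Euclidean distance in RGB, normalized so that black to white is 1.0. How DotImage actually measures closeness is not stated here, so treat the metric and the names ToleranceSketch, Distance and Replace as assumptions.

```csharp
using System;
using System.Drawing;

static class ToleranceSketch
{
    // Distance between two colors, normalized so that black-to-white is 1.0.
    // ASSUMPTION: Euclidean RGB distance; DotImage may measure closeness differently.
    public static double Distance(Color a, Color b)
    {
        double dr = a.R - b.R, dg = a.G - b.G, db = a.B - b.B;
        return Math.Sqrt(dr * dr + dg * dg + db * db) / Math.Sqrt(3 * 255.0 * 255.0);
    }

    // Returns the replacement color if the pixel is within tolerance of the
    // target, otherwise returns the pixel unchanged.
    public static Color Replace(Color pixel, Color target, Color replacement, double tolerance) =>
        Distance(pixel, target) <= tolerance ? replacement : pixel;
}
```

Under this assumed metric, a slightly off shade of the bubble red and a light gray background both drop out at .2, while a dark pencil gray such as (60, 60, 60) is far from both the red and white and is left alone.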

Find the marks using the template

To find out which bubbles are filled in, we need to loop through all of the bubbles in a column, figure out where they are in the image, and then look at the pixels in that location to see if the bubble looks filled in. Since we have dropped out the red in the area, filled-in bubbles will be easy to find. First we need to look at the rectangle over the answer bubble and count up the number of dark pixels:

private bool IsFilledIn(AtalaImage img, Rectangle rect)
{
    // find the number of pixels at each brightness in an area
    Histogram hist = new Histogram(img, rect);
    int[] histResults = hist.GetBrightnessHistogram();

    // count the dark ones
    int numDark = 0;
    for (int h = 0; h < histResults.Length; ++h)
    {
        if (IsDark(Color.FromArgb(h, h, h))) {
            numDark += histResults[h]; 
        }
    }

    // if over a third are dark, then this bubble is filled in
    if (numDark > (rect.Width * rect.Height / 3))
        return true;

    return false;
}

A Histogram object can be used to get statistical information about the pixels in an area. In this case, we are getting a brightness histogram which returns an array with the number of pixels at each level of brightness (0-255) in the rectangular area we pass in. We can use the same IsDark() function we wrote to find the marker. If the number of dark pixels in the answer area is over a third of the total area, we return true to indicate that the bubble is filled in.
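
It helps to know how strict this test is. With the .05 cutoff in IsDark(), only gray levels 0 through 12 out of 256 count as dark. The sketch below shows the same counting logic on a plain byte array so that you can experiment with the cutoff without DotImage. FillSketch is a hypothetical name, and binning pixels by their gray value is an assumption about how the brightness histogram is built.

```csharp
using System;
using System.Drawing;

static class FillSketch
{
    // Same cutoff as the article's IsDark(): brightness below 5%.
    static bool IsDark(Color c) => c.GetBrightness() < .05;

    // Number of gray levels (0-255) that IsDark() accepts.
    public static int DarkLevelCount()
    {
        int n = 0;
        for (int h = 0; h < 256; ++h)
            if (IsDark(Color.FromArgb(h, h, h))) ++n;
        return n;
    }

    // gray[y, x] holds 0-255 gray values for the bubble's rectangle.
    public static bool IsFilledIn(byte[,] gray)
    {
        // build the histogram: number of pixels at each gray level
        var hist = new int[256];
        foreach (byte g in gray) hist[g]++;

        // count the pixels that fall in the dark levels
        int numDark = 0;
        for (int h = 0; h < 256; ++h)
            if (IsDark(Color.FromArgb(h, h, h))) numDark += hist[h];

        // filled in if more than a third of the pixels are dark
        return numDark > gray.Length / 3;
    }
}
```

If your scans come out lighter than that (soft pencil, low contrast), you may need to raise the cutoff or lower the one-third fraction. Both values are worth tuning against real samples.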

To read the answer sheet, we simply need to loop through each column and row and look for filled-in bubbles:

private String ReadAnswerBubbles(AtalaImage img, float scale, Point markerCenter)
{
    String name = "";

    // loop through each column, trying to find the letter that is filled in
    int numCols = 15;
    int numRows = 26;
    for (int c = 0; c < numCols; ++c)
    {
        for (int r = 0; r < numRows; ++r)
        {
            Rectangle rect = GetAnswerBubbleRect(r, c, markerCenter, scale);
            if (IsFilledIn(img, rect))
            {
                name += (char)('A' + r);
                break;
            }
        }
    }
    return name;
}
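
Two behaviors of this loop are worth seeing on plain data: the first mark found in a column wins, and a column with no mark contributes nothing, so the result can be shorter than the number of columns. The sketch below applies the same decoding to a grid of booleans. DecodeSketch and Decode are hypothetical names.

```csharp
using System;
using System.Text;

static class DecodeSketch
{
    // filled[row, col] is true where a bubble was detected as filled in.
    // Row 0 maps to 'A', row 1 to 'B', and so on; at most one letter per column.
    public static string Decode(bool[,] filled)
    {
        var name = new StringBuilder();
        for (int c = 0; c < filled.GetLength(1); ++c)
        {
            for (int r = 0; r < filled.GetLength(0); ++r)
            {
                if (filled[r, c])
                {
                    name.Append((char)('A' + r));
                    break;   // first mark in the column wins
                }
            }
        }
        return name.ToString();
    }
}
```

If you need to preserve blanks between words, or to flag columns with more than one mark, you would have to extend this loop rather than break on the first hit.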

To put all of these steps together, use the following function:

private string GetAnswer(AtalaImage img)
{
    // Deskew the image
    img = Deskew(img);

    // find the marker so that we can scale and position the template
    Rectangle markerActualLocation = FindMarker(img);
    float scale = GetImageScale(markerActualLocation);
    Point markerCenter = GetMarkerCenter(markerActualLocation);

    // remove the answer bubbles (that are this shade of red: #D99694)
    img = DropOut(img, Color.FromArgb(0xD9, 0x96, 0x94));

    // read the answer bubbles
    return ReadAnswerBubbles(img, scale, markerCenter);
}

So, now if I take this image:

And call GetAnswer() on it, it will return “ATALASOFT”. I’ve included this image and the template so that you can play around. To get a free evaluation of DotImage, go to http://www.atalasoft.com/products/dotimage.