Screen Scraper in Managed Code

By Chris Gorecki, 7 Apr 2008
 

ScreenScraper

Introduction

Finding a sub-image within a large image, such as the screen, can be a time consuming task. The screen on my laptop contains over a million pixels, and to ensure that an image is found within the screen, it is necessary to examine most of them, possibly more than once.

My goal when writing this article was to write a program that could efficiently find all the occurrences of a sub-image within the screen image, using only the .NET Framework and managed code. Operational speed was one of my primary criteria, so in the end I resorted to using unsafe pointers in C# to boost performance.

Background

Screen scraping is an idea that has been around for quite some time. Originally, I believe it was used to refer to the idea of extracting data from web pages by examining the HTML code. A more recent idea is using technologies such as OCR and image processing to extract useful information from other forms of media. The problem of sub-image location that is solved here could have many applications in the areas of search, gaming, AI, and others.

Using the code

The algorithm I adopted first finds a pixel, within the sub-image that is being searched for, that has an infrequently occurring ARGB value. This pixel value, since it is relatively rare, should be more indicative of an occurrence of the sub-image.

private void Init(Bitmap bmImage)
{
    image = new int[bmImage.Width, bmImage.Height];
    Hashtable repeats = new Hashtable();

    for (int x = 0; x < bmImage.Width; x++)
    {
        for (int y = 0; y < bmImage.Height; y++)
        {
            image[x, y] = bmImage.GetPixel(x, y).ToArgb();

            // The pixel value has not been found before
            if (!repeats.ContainsKey(image[x, y]))
            {
                // a = {number of times found, x location, y location}
                int[] a = { 1, x, y };
                repeats.Add(image[x, y], a);
            }
            else
            {
                // Increment the number of times the value has been found
                ((int[])repeats[image[x, y]])[0]++;
            }
        }
    }

    // Find the pixel value that has been found the least number of times
    int min = int.MaxValue, ix = -1, iy = -1;
    foreach (DictionaryEntry de in repeats)
    {
        int[] a = (int[])de.Value;
        if (a[0] < min)
        {
            min = a[0];
            ix = a[1];
            iy = a[2];
        }
    }

    pixels.Add(image[ix, iy], new Point(ix, iy));
}

To find the least frequent pixel value, a hash table is populated with each pixel value that occurs in the image, along with the number of times it occurs and the location where it was first seen. Then, it is a simple matter of iterating through the table to find the value with the minimum number of occurrences. Two class-level data structures are used here. The first is image, a 2D array containing the pixel values of the sub-image. The second is the Hashtable pixels, which, in the last line of code, is populated with the least frequently occurring pixel value as the key and the location of that pixel as the value.
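The counting step described above can also be written with a generic Dictionary, which avoids the casts that Hashtable requires. This is only a minimal sketch: the class name RarePixel and the method name FindRarest are made up for illustration and are not part of the article's source.

```csharp
using System.Collections.Generic;

public static class RarePixel
{
    // Returns {x, y} of the first occurrence of the least frequent value in
    // image[x, y], or null for an empty image.
    // RarePixel and FindRarest are illustrative names, not the article's code.
    public static int[] FindRarest(int[,] image)
    {
        // pixel value -> {count, x of first occurrence, y of first occurrence}
        var counts = new Dictionary<int, int[]>();

        for (int x = 0; x < image.GetLength(0); x++)
        {
            for (int y = 0; y < image.GetLength(1); y++)
            {
                int[] entry;
                if (counts.TryGetValue(image[x, y], out entry))
                    entry[0]++;                                  // seen before
                else
                    counts.Add(image[x, y], new[] { 1, x, y });  // first sighting
            }
        }

        // Pick the entry with the smallest count
        int[] best = null;
        foreach (int[] entry in counts.Values)
        {
            if (best == null || entry[0] < best[0])
                best = entry;
        }

        return best == null ? null : new[] { best[1], best[2] };
    }
}
```

When several values share the minimum count, which one is returned depends on the dictionary's enumeration order, so the result should be treated as "a rare pixel" rather than a specific one.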

The first challenge encountered when actually writing the code was capturing the screen image without falling back on the Win32 API. To do this, I resorted to using the SendKeys.SendWait method to simulate pressing the Print Screen key, and then grabbing the resulting screen image from the Clipboard. Of course, this has the downside of clearing whatever was on the Clipboard before. When trying to fix the undesirable clearing behavior, I managed to obtain a disconnected context error in relation to COM objects, which was a first for me.

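private static Bitmap getDesktopBitmap()
{
    // Simulate Ctrl+PrintScreen, which places a screen image on the Clipboard
    SendKeys.SendWait("^{PRTSC}");
    Image screen = Clipboard.GetImage();
    if (screen == null)
        throw new InvalidOperationException("No screen image was found on the Clipboard.");

    Bitmap bm = new Bitmap(screen);
    Clipboard.Clear();
    return bm;
}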
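If losing the Clipboard contents is a concern, one possible alternative is Graphics.CopyFromScreen, which is part of System.Drawing and copies screen pixels straight into a bitmap. This is a different technique from the Print Screen approach used in this article, and the sketch below is untested against the rest of the code. The class name ScreenCapture and the method name CapturePrimaryScreen are made up for illustration.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;

public static class ScreenCapture
{
    // Hypothetical stand-in for getDesktopBitmap that leaves the Clipboard alone.
    // Assumption: only the primary screen needs to be searched.
    public static Bitmap CapturePrimaryScreen()
    {
        Rectangle bounds = Screen.PrimaryScreen.Bounds;
        Bitmap bm = new Bitmap(bounds.Width, bounds.Height,
                               PixelFormat.Format32bppArgb);

        using (Graphics g = Graphics.FromImage(bm))
        {
            // Copy the screen region into the bitmap, starting at its top-left corner
            g.CopyFromScreen(bounds.Location, Point.Empty, bounds.Size);
        }

        return bm;
    }
}
```

The caller owns the returned bitmap and should dispose of it when the search is finished.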

Once the screen image has been captured, all that remains to be done is to find if there are any occurrences of the sub-image within the screen image.

public unsafe List<Point> findImages()
{
    Bitmap bm = getDesktopBitmap();
    BitmapData bmd = bm.LockBits(new Rectangle(0, 0, bm.Width, bm.Height),
                                ImageLockMode.ReadOnly, bm.PixelFormat);
    List<Point> results = new List<Point>();
    foundRects = new List<Rectangle>();

    try
    {
        for (int y = 0; y < bmd.Height; y++)
        {
            // Pointer to the first byte of row y
            byte* scanline = (byte*)bmd.Scan0 + (y * bmd.Stride);

            for (int x = 0; x < bmd.Width; x++)
            {
                int xo = x * PIXLESIZE;
                // Three color bytes plus an opaque alpha byte
                byte[] buff = { scanline[xo], scanline[xo + 1], 
                                scanline[xo + 2], 0xff };
                int val = BitConverter.ToInt32(buff, 0);

                // Pixel value from the sub-image found in the desktop image
                if (pixels.ContainsKey(val) && notFound(x, y))
                {
                    Point loc = (Point)pixels[val];

                    // Top-left corner the sub-image would have on the desktop
                    int sx = x - loc.X; 
                    int sy = y - loc.Y;
                    // Sub-image occurs in desktop image 
                    if (imageThere(bmd, sx, sy))
                    {
                        results.Add(new Point(sx, sy));
                        // Remember the area covered by this occurrence
                        foundRects.Add(new Rectangle(sx, sy, bmImage.Width, 
                                                             bmImage.Height));
                    }
                }
            }
        }
    }
    finally
    {
        // Always release the locked bits
        bm.UnlockBits(bmd);
    }

    return results;
}

private unsafe bool imageThere(BitmapData bmd, int sx, int sy)
{
    // A candidate that extends past the edges of the desktop image cannot match,
    // and reading outside the locked bits would touch protected memory
    if (sx < 0 || sy < 0 ||
        sx + bmImage.Width > bmd.Width || sy + bmImage.Height > bmd.Height)
        return false;

    for (int iy = 0; iy < bmImage.Height; iy++)
    {
        // Horizontal line of pixels in the bitmap data
        byte* scanline = (byte*)bmd.Scan0 + ((sy + iy) * bmd.Stride);

        for (int ix = 0; ix < bmImage.Width; ix++)
        {
            // Offset into the scan line
            int xo = (sx + ix) * PIXLESIZE;
            // Append an opaque alpha byte to the three color bytes
            // to form a PixelFormat.Format32bppArgb value
            byte[] buff = { scanline[xo], scanline[xo + 1], 
                            scanline[xo + 2], 0xff };
            // Pixel value
            int val = BitConverter.ToInt32(buff, 0);

            if (val != image[ix, iy])
                return false;
        }
    }

    return true;
}

private bool notFound(int x, int y)
{
    Point p = new Point(x, y);
    foreach (Rectangle r in foundRects)
    {
        if (r.Contains(p))
            return false;
    }

    return true;
}

The first step in the process is to lock the bits in the bitmap to obtain the bitmap data. This allows us to use a pointer into the bitmap to access the pixel values directly, instead of calling bm.GetPixel(x, y).ToArgb() for every pixel. This is where the necessary performance increase comes from.

To obtain a particular pixel value from the bitmap data, a scan line is first determined. A scan line can be thought of as a single horizontal row in the bitmap. As seen in the line:

scanline = (byte*)bmd.Scan0 + (y * bmd.Stride)

the scan line can be determined by taking the address of the first pixel in the bitmap (Scan0), and adding to it the y position of the scan line (the number of lines it is from the top of the image) multiplied by the number of bytes in each scan line (Stride). We now have a pointer to the row of bytes that contains the pixel we are trying to find the value of. The x offset into the scan line is simply the number of bytes per pixel times the x position of the pixel within the scan line.

However, there is a little trouble here. It turns out that using the Print Screen method of capturing the desktop returns a bitmap that uses 32 bits for the RGB values of a pixel, with the last 8 bits being 0xff. Since the image array is populated with the ARGB values of the sub-image bitmap, we must convert from one format to the other. This is achieved by the following lines of code:

int xo = x * PIXLESIZE;
byte[] buff = {scanline[xo],scanline[xo + 1],scanline[xo + 2], 0xff};
int val = BitConverter.ToInt32(buff, 0);

Taken together, this is functionally equivalent to val = bm.GetPixel(x, y).ToArgb(), with the alpha byte forced to 0xff.
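The same offset arithmetic can be tried out without pointers by working on a plain byte array laid out like the locked bitmap data. This is a minimal sketch rather than the article's code: PixelReader and ReadArgb are hypothetical names, and the buffer is assumed to store each pixel as blue, green, red, optionally followed by a fourth byte that is ignored.

```csharp
public static class PixelReader
{
    // Reads the pixel at (x, y) from a raw buffer organised as rows of
    // `stride` bytes, with `bytesPerPixel` bytes per pixel in B, G, R order.
    // The alpha byte is forced to 0xff, as in the article's conversion.
    // PixelReader and ReadArgb are illustrative names only.
    public static int ReadArgb(byte[] data, int stride, int bytesPerPixel,
                               int x, int y)
    {
        // Row offset plus offset within the row
        int offset = y * stride + x * bytesPerPixel;

        byte b = data[offset];
        byte g = data[offset + 1];
        byte r = data[offset + 2];

        // Assemble A, R, G, B into one 32-bit value with shifts,
        // which avoids allocating a temporary byte array per pixel
        return unchecked((int)0xff000000) | (r << 16) | (g << 8) | b;
    }
}
```

A buffer like this could be filled from the locked bitmap with Marshal.Copy, at the cost of one extra copy of the screen image. The shifts produce the same value as the BitConverter call above on little-endian machines.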

So, now, all that is left to do is find if the value of the screen image pixel is the same as the rare sub-image pixel value that we placed in the Hashtable earlier. But, first, we check to see if the x, y location of the screen pixel we are examining is contained within an area of a sub-image we have previously located. If it is, we just move on, to avoid finding the same image more than once. The list of rectangles, foundRects, is used for this purpose, as it contains a rectangle of the same dimensions and location as each sub-image that has been found.

To determine if the sub-image occurs in the screen image, imageThere does a pixel-by-pixel examination, and returns true if all the pixels match up. A single different pixel is taken to mean that the sub-image does not occur, and thus false is returned.
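The pixel-by-pixel check can be illustrated on two plain integer arrays, which makes it easy to experiment with apart from the bitmap code. The sketch below is not the article's implementation: SubImageMatch and MatchesAt are made-up names, both arrays are assumed to be indexed as [x, y], and the explicit bounds check is an addition that rejects candidates hanging over the edge of the screen.

```csharp
public static class SubImageMatch
{
    // True when sub[ix, iy] equals screen[sx + ix, sy + iy] for every pixel
    // of the sub-image. SubImageMatch and MatchesAt are illustrative names.
    public static bool MatchesAt(int[,] screen, int[,] sub, int sx, int sy)
    {
        int subWidth = sub.GetLength(0);
        int subHeight = sub.GetLength(1);

        // A candidate that does not fit inside the screen cannot match
        if (sx < 0 || sy < 0 ||
            sx + subWidth > screen.GetLength(0) ||
            sy + subHeight > screen.GetLength(1))
            return false;

        for (int iy = 0; iy < subHeight; iy++)
        {
            for (int ix = 0; ix < subWidth; ix++)
            {
                // A single differing pixel rules this location out
                if (screen[sx + ix, sy + iy] != sub[ix, iy])
                    return false;
            }
        }

        return true;
    }
}
```

Returning on the first mismatch is what keeps the search fast in the common case, since most candidate locations fail within the first few pixels.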

Points of interest

There are a couple of things to keep in mind when using this program:

  • The images loaded must be in 32-bit ARGB format
  • The time it takes to run depends on the existence of a good unique pixel value
    • On a 2 GHz Athlon with a 15.4'' screen, about 0.5 sec. for most images
    • About 30 sec. for small white images on a white screen background
  • The Print Screen functionality was only tested on Windows XP, and may not work the same on Vista, etc.
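For the first point, an image that is stored in another format can be converted before it is handed to the search. The sketch below uses Bitmap.Clone with an explicit pixel format. The helper names ImageFormatHelper and ToArgb32 are made up for illustration, and I have not verified this against every source format.

```csharp
using System.Drawing;
using System.Drawing.Imaging;

public static class ImageFormatHelper
{
    // Returns a new 32-bit ARGB copy of the given bitmap.
    // ImageFormatHelper and ToArgb32 are illustrative names only.
    public static Bitmap ToArgb32(Bitmap source)
    {
        Rectangle all = new Rectangle(0, 0, source.Width, source.Height);

        // Clone the whole image into the requested pixel format
        return source.Clone(all, PixelFormat.Format32bppArgb);
    }
}
```

The original bitmap is left untouched, so the caller is responsible for disposing of both the source and the converted copy.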

History

  • 4/8/2008 - Original article.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)

About the Author

Chris Gorecki
United States
I am currently a software engineer for a company in Seattle, Washington.


Comments and Discussions

General: Translating to VB.NET (member Cbadboy, 24 Nov '10 - 5:48)
How do I translate this line of code?
byte* scanline = (byte*)bmd.Scan0 + (y * bmd.Stride);
 
I tried
 
Dim scanline(bmd.Width * PIXLESIZE) As Byte
Runtime.InteropServices.Marshal.Copy(bmd.Scan0, scanline, (y * bmd.Stride), bmd.Width * PIXLESIZE)
 
but it fails when y = 1 (2nd line) with an out-of-bounds error
 
although the online converters give this back
Dim scanline As Pointer(Of Byte) = CType(bmd.Scan0, Pointer(Of Byte)) + (y * bmd.Stride)
it's not possible in vb.net

General: Re: Translating to VB.NET (member Cbadboy, 24 Nov '10 - 9:03)
Solved it. For anyone who is porting this to VB.NET: I've finished my port and it works perfectly!
 
here is how I represented those pointers
 
        byte* scanline = (byte*)bmd.Scan0 + ((sy + iy) * bmd.Stride);
 
to
 
Dim scanline((bmd.Width * PIXLESIZE) - 1) As Byte
'ToInt64 keeps the address intact in a 64-bit process.
Dim scanlinea As IntPtr = New IntPtr(bmd.Scan0.ToInt64() + ((sy + iy) * bmd.Stride))
'Copy the RGB values into the array.
Runtime.InteropServices.Marshal.Copy(scanlinea, scanline, 0, bmd.Width * PIXLESIZE)
 
and
 
byte* scanline = (byte*)bmd.Scan0 + (y * bmd.Stride);
 
to
Dim scanline((bmd.Width * PIXLESIZE) - 1) As Byte
'ToInt64 keeps the address intact in a 64-bit process.
Dim scanlinea As IntPtr = New IntPtr(bmd.Scan0.ToInt64() + (y * bmd.Stride))
'Copy the RGB values into the array.
Runtime.InteropServices.Marshal.Copy(scanlinea, scanline, 0, bmd.Width * PIXLESIZE)

General: Village Idiot can't get the program working (member Emile Fraser, 11 Jan '09 - 22:25)
Hi there,
 
Looks like an awesome app this! I am busy teaching myself C# at the moment, but I am only about 2 weeks into it, so please bear with me. My understanding of this program is that you should load the image of the table (screen.png or top.png) and the app should be able to identify certain sub-images in the bigger image (e.g. ID the flop cards in screen.png). But when I load it, it says no image found. Any help would be appreciated, please?

General: Re: Village Idiot can't get the program working (member Chris Gorecki, 14 Jan '09 - 15:09)
Hello,
 
You actually want to load the image that you are looking for in the screen. The application captures whatever is on the screen and examines it for the image you have loaded. For example, if you go to a web page and download an image it is displaying, then you should be able to load it into the application to find the coordinates of that picture on your screen. Please let me know if you have any questions or problems. Thanks!

General: Re: Village Idiot can't get the program working (member Emile Fraser, 14 Jan '09 - 23:04)
Hi man,
 
In the infamous words of Homer Simpson: "DOH!" :-D
 
Ok I got it to work, absolutely awesome app!

General: Error: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (member charltonw, 2 Oct '08 - 17:55)
Hi, I'm getting this error for some images I want to search for on my desktop. Some images work, but for some reason, others do not (I'm getting all of the image pieces from PrintScreen snapshots). I'm not sure why. I think it's going out of bounds, but I'm not certain of the reason. It seems like xo is out of bounds of scanline, but I'm not sure.
 
It occurs in the imageThere function,
 
byte[] buff = { scanline[xo], scanline[xo + 1], scanline[xo + 2], 0xff };
 
Otherwise, the code is pretty good and it's so fast with the other images! Thanks.
 
Kind Regards,
 
Charlton

General: Re: Error: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (member Chris Gorecki, 7 Oct '08 - 9:42)
Hi, I'm glad you're using the code. That's odd that xo would be out of bounds. Have you determined the cause?
 
-Chris

General: Re: Error: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (member MilosNikolic, 4 Dec '08 - 23:43)
To fix this error in Release mode, you must disable code optimization (Project Properties > Build > Optimize code).
 
Milos

Question: Doesn't seem to work (member elitewisdom, 18 Aug '08 - 0:42)
I like the concept, but it doesn't seem to work for me. I've screen-dumped my screen, cropped out a certain part and fed it into your app, and it doesn't find anything. Am I doing something wrong? Did you test this with a certain test case? If so, can you make it available? I'd love to see this working, as I'm interested in taking this idea to the next level on the basis that it actually works.

Answer: Re: Doesn't seem to work (member Chris Gorecki, 20 Aug '08 - 15:02)
You'll probably have to convert your image to ARGB from the RGB format that print screen usually generates.

Last Updated 7 Apr 2008
Article Copyright 2008 by Chris Gorecki
Everything else Copyright © CodeProject, 1999-2013