Screen Captures, Window Captures and Window Icon Captures with Spy++ style Window Finder!

Mark Belles

Mar 3, 2005

Screen capturing that features multiple monitor support, including methods for capturing windows and window icons. Includes Spy++ style window finder!

Sample Image - Screen_Capturing3.jpg

Sample Image - Screen_Capturing1.jpg

Sample Image - Screen_Capturing2.jpg

Introduction

Ever since I can remember, I've been fascinated by the print screen keyboard command. Just exactly how did it capture an image? What strange and powerful API was I missing? Well, after a few months of coding and getting familiar with the Windows platform, I realized that I could easily code my own screen capture. This article will provide you with the code and knowledge to implement screen capturing in code, without relying upon the print screen keyboard command.

In addition to wondering about screen captures, I noticed there were other programs that could capture an image of a particular window. That, I thought, was quite cool, so I worked that out too. So, while we're capturing images of things on our desktop, I'll also show you how to capture images of any visible window.

One last thing, what about the Windows Task Manager? How does it snag the icon from a window, and then display it for us? Well, I know how, and I'm prepared to show you how easy it is. Interested? Keep reading.

Well, I guess it's not really the last thing. The last thing is that I've updated the source to include a Spy++ style window finder tool that will allow you to hover over a window to snag its handle, complete with highlighting, and then to capture that window as an image. A few guys were nice enough to motivate me to do this, and to demonstrate how easy it is!

Getting Started

I'm going to cut straight to the heart of the material in this article and hand you the code, so I'm going to assume a few things first. One, that you understand C#, or at least can translate it to another CLR language. Two, that you are at least familiar with the fact that there's a whole lot more behind the scenes called the Win32 APIs; if not, I'll show you the way. Three, that you will have a lot of fun playing around with the demo and code, and hopefully enjoy this article!

Capturing the Desktop as an Image

First up, the desktop. Obviously it can be captured, but how'd they do that? Easy. Using just a few API calls and some GDI+, we can recreate the functionality that the print screen command offers, with one advantage: we're not going to touch the Windows Clipboard. Too many times, I have seen improper usage of the clipboard, and in my opinion, slapping a big fat image onto it, erasing whatever you had there before with no warning, is just a bad idea.

The command is 'print screen', not 'erase my clipboard data'. So, why do it? Would you really want to write code that has to use the user's workspace to accomplish its task? I don't think so. We can capture the image without storing it on the clipboard and making you paste it into your favorite image editor.

So, a little background. Let's start with device contexts. If you are familiar, skip ahead. If not, read on. Device contexts are just generic ways to represent devices, or things you can draw on or interact with graphically. This is hugely oversimplified, and I'm going to get flamed, but for the sake of brevity, let's just go with that for now. You can get device contexts, or DCs as they are commonly referred to, from images, windows, monitors, and even printers. Every window has them, and you can use them to draw with. Everything you see on your screen is being drawn upon a device context: the desktop, every window, your taskbar, and anything else you see. You can draw upon them, or copy what has been drawn upon them. If you can get a hold of them, you can pretty much draw or steal whatever you want, graphically speaking. Working with device contexts is fast, and both GDI and GDI+ are based on them.

What's all that mean? Well, in the case of capturing the screen, we know that somewhere Windows is drawing upon a device context, so that we can see it. In fact, there's one for each monitor you have attached to your system, and that desktop that you are seeing on it, is being drawn on that monitor's device context. All we have to do is grab a hold of that device context, create another one of our own, and copy the screen's device context image data to our own, and we've got a screen capture.

Using our knowledge of the .NET framework, and some additional knowledge of the Windows APIs, we can pretty effectively duplicate how the print screen command works. Let's take a look at some code. Here's a method that is used in the example, that will enumerate all of the monitors attached to the system, figure out just how big the combined desktop is, and then copy what each monitor has displayed on it, into a final bitmap image. Instant screen capture. Here's the code.

public static Bitmap GetDesktopWindowCaptureAsBitmap()
{
    Rectangle rcScreen = Rectangle.Empty;
    Screen[] screens = Screen.AllScreens;

    // Create a rectangle encompassing all screens...
    foreach (Screen screen in screens)
        rcScreen = Rectangle.Union(rcScreen, screen.Bounds);

    // Create a composite bitmap of the size of all screens...
    Bitmap finalBitmap = new Bitmap(rcScreen.Width, rcScreen.Height);

    // Get a graphics object for the composite bitmap and initialize it...
    Graphics g = Graphics.FromImage(finalBitmap);
    g.CompositingQuality = System.Drawing.Drawing2D.CompositingQuality.HighSpeed;

    // Fill the whole bitmap first, so any area that no monitor covers
    // shows the desktop color instead of uninitialized pixels...
    g.FillRectangle(
        SystemBrushes.Desktop,
        0,
        0,
        rcScreen.Width,
        rcScreen.Height);

    // Get an HDC for the composite area...
    IntPtr hdcDestination = g.GetHdc();

    // Now, loop through screens,
    // Blting each to the composite HDC created above...
    foreach (Screen screen in screens)
    {
        // Create DC for each source monitor...
        IntPtr hdcSource = Win32.CreateDC(
            IntPtr.Zero,
            screen.DeviceName,
            IntPtr.Zero,
            IntPtr.Zero);

        // Blt the source directly to the composite destination,
        // offset by the monitor's position within the combined desktop...
        int xDest = screen.Bounds.X - rcScreen.X;
        int yDest = screen.Bounds.Y - rcScreen.Y;

        bool success = Win32.StretchBlt(
            hdcDestination,
            xDest,
            yDest,
            screen.Bounds.Width,
            screen.Bounds.Height,
            hdcSource,
            0,
            0,
            screen.Bounds.Width,
            screen.Bounds.Height,
            (int)Win32.TernaryRasterOperations.SRCCOPY);

        // Log the last Win32 error if the blt failed...
        if (!success)
        {
            System.ComponentModel.Win32Exception win32Exception =
                new System.ComponentModel.Win32Exception();
            System.Diagnostics.Trace.WriteLine(win32Exception);
        }

        // Cleanup source HDC...
        Win32.DeleteDC(hdcSource);
    }

    // Cleanup destination HDC and Graphics...
    g.ReleaseHdc(hdcDestination);
    g.Dispose();

    // Return composite bitmap which will become our Form's PictureBox's image...
    return finalBitmap;
}

Looking at the code, the first thing you'll see is that I'm using a mixture of GDI and GDI+. This is due largely to the fact that there is a bug present in GDI+ and the BitBlt API. I have spent many hours on the phone with Microsoft Developer Support to confirm this. This issue only manifests itself on systems with multiple monitors, and if I remember correctly, the system had to have an NVIDIA display adapter on the non-primary monitor, and of course, our old friend Windows 98 running as the OS. What happens is the primary monitor captures fine, but the secondary (or any non-primary) monitor stands a chance of returning garbage for an image. It looks like a cable channel with no signal. Call it snow, call it ant races, I call it a pain in the butt. But that's life, and here's the workaround.

Instead of relying on purely managed code to copy the images, or falling back on the BitBlt API, we use its somewhat slower cousin, StretchBlt. That made me angry when I heard it, but supposedly a fix is in store for the 2.0 framework, so I'll just take my free phone call and wait to see if that's true. In the meantime, I need screen captures. You don't have to take my word for it. Code it up yourself, and when you find out I was right, just remember that I told you so.

Back to the code. First up, we just grab all of the monitors using the Screen class' AllScreens property. This does two things for us. First, it allows us to figure out how big the entire desktop is, and create an image just big enough to hold all of the screens inside. Second, it allows us to figure out just where each monitor is positioned in relation to the others. Remember, with multiple monitor support you can "arrange" your monitors in different ways, and with different resolutions, so don't think in terms of a pure rectangle when you think of how your monitors are positioned.

Once we have those screens, it's a trivial matter to calculate the size of the entire bitmap by using the Rectangle.Union method to build up the size of the overall image. After we've figured out the size of the final image, we'll grab a Graphics object from the image. The GDI+ Graphics object is just the .NET wrapper around a device context, so we can use it to draw on the bitmap.

Next, we'll enumerate through each monitor, and draw what that monitor has on its device context upon the image we just created that will hold the final screen shot. We'll draw it using its coordinates, so that if the monitors have different resolutions or positioning, we'll see them as the Display Control Panel applet sees them. Go check it out if you have multiple monitors and you didn't know you could move them. Chances are that if you have multiple monitors, you know this already, but if not, no harm, no foul. Open the Settings tab and drag one of the monitors around, and you'll see you can reposition it in relation to the other monitors.
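The layout arithmetic can be tried out without a second monitor attached. Here's a minimal sketch, assuming two made-up monitor rectangles; the names DesktopLayout, GetVirtualBounds and GetDestination are mine for illustration and aren't part of the sample.

```csharp
using System;
using System.Drawing;

public static class DesktopLayout
{
    // Union of all monitor bounds, built up the same way the capture method does it.
    public static Rectangle GetVirtualBounds(Rectangle[] monitors)
    {
        Rectangle all = Rectangle.Empty;
        foreach (Rectangle monitor in monitors)
            all = Rectangle.Union(all, monitor);
        return all;
    }

    // Where a monitor's top-left corner lands inside the composite bitmap.
    public static Point GetDestination(Rectangle monitor, Rectangle virtualBounds)
    {
        return new Point(monitor.X - virtualBounds.X, monitor.Y - virtualBounds.Y);
    }
}
```

With a 1920x1080 primary at (0, 0) and a 1280x1024 monitor placed to its left at (-1280, 0), the combined bounds come out as X = -1280, Y = 0, Width = 3200, Height = 1080. The primary monitor is then drawn at (1280, 0) in the bitmap and the left-hand monitor at (0, 0), which is why the capture code subtracts rcScreen.X and rcScreen.Y from each screen's position.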

For each monitor, we'll simply use the StretchBlt API to copy that monitor's device context contents to the bitmap that will serve as the screen capture of the desktop. Notice that I'm creating a device context each time. This gives us access to that monitor's device context so that we can copy from it. Keep in mind that if we create it, we must destroy it, so we delete the device context when we are finished with it. If you don't, you'll leak GDI resources, so keep a watchful eye on your DCs and make sure to release or destroy them. A simple rule is, if you "acquire" it, you're required to "release" it. And if you "create" it, then you must "destroy" it. I quote those because if you look at the GDI APIs with that in mind, you'll find the necessary APIs to do exactly what you want.
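One way to make the acquire/release rule hard to forget is to tie the cleanup call to a using block. This is only a sketch of that idea, not something the sample does; ScopedHandle is a hypothetical helper name, and the cleanup call is passed in as a delegate so the same class works for DeleteDC and ReleaseDC alike.

```csharp
using System;

// Pairs a native handle with the call that gives it back, so the cleanup
// runs even if the code in between throws.
public sealed class ScopedHandle : IDisposable
{
    private readonly Action<IntPtr> release;

    public IntPtr Handle { get; private set; }

    public ScopedHandle(IntPtr handle, Action<IntPtr> release)
    {
        if (release == null) throw new ArgumentNullException("release");
        Handle = handle;
        this.release = release;
    }

    public void Dispose()
    {
        if (Handle == IntPtr.Zero) return;   // nothing acquired, or already released
        release(Handle);
        Handle = IntPtr.Zero;
    }
}

// Possible usage with the sample's Win32 class (assumed signatures):
//
//   using (ScopedHandle dc = new ScopedHandle(
//       Win32.CreateDC(IntPtr.Zero, screen.DeviceName, IntPtr.Zero, IntPtr.Zero),
//       delegate(IntPtr h) { Win32.DeleteDC(h); }))
//   {
//       // blt from dc.Handle here
//   }
```

The .NET framework's own SafeHandle class covers the same ground more thoroughly; the small class above just keeps the "create it, destroy it" pairing visible at the call site.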

Finally, after copying the contents of each device context to that bitmap we created, we'll release the device context we acquired from the Graphics object, and then dispose the Graphics object itself. That's the proper way to clean up a graphics object if you've acquired a device context from it.

That's it, now we've got a bitmap that any .NET language can use, and we did it without faking keyboard commands, or trashing the contents of the user's clipboard in the process. Try it out, use the print screen command on the keyboard, open an image editor, and paste it into the app. The image was stored on the clipboard. This is ok, but not for us. We're too slick for that.

Capturing a Window as an Image

Ok, so we did the desktop, and we can handle multiple monitors, but what about capturing images from a single window? You've seen the apps that use some sort of zoom tool to identify a window and then let you capture just that window. How's that get done?

Using device contexts of course, and a little guy known as a Window Handle. Every window in the system is identified with a unique number, known as a handle. Windows likes to identify things with "handles", so why should a window be any different, right? Right. Using a window handle, we can figure out the size of any window, create a bitmap of that size, snag the window's device context using the same handle, and then copy the window's device context contents to a bitmap. Here's how we can accomplish such a feat.

public static Bitmap GetWindowCaptureAsBitmap(int handle)
{
    IntPtr hWnd = new IntPtr(handle);
    Win32.Rect rc = new Win32.Rect();
    if (!Win32.GetWindowRect(hWnd, ref rc))
        return null;

    // create a bitmap the size of the window's bounding rectangle
    Bitmap bitmap = new Bitmap(rc.Width, rc.Height);

    // create a graphics object from the bitmap
    Graphics gfxBitmap = Graphics.FromImage(bitmap);

    // get a device context for the bitmap
    IntPtr hdcBitmap = gfxBitmap.GetHdc();

    // get a device context for the window
    IntPtr hdcWindow = Win32.GetWindowDC(hWnd);

    // bitblt the window to the bitmap
    Win32.BitBlt(hdcBitmap, 0, 0, rc.Width, rc.Height,
        hdcWindow, 0, 0, (int)Win32.TernaryRasterOperations.SRCCOPY);

    // release the bitmap's device context
    gfxBitmap.ReleaseHdc(hdcBitmap);

    // release the window's device context
    Win32.ReleaseDC(hWnd, hdcWindow);

    // dispose of the bitmap's graphics object
    gfxBitmap.Dispose();

    // return the bitmap of the window
    return bitmap;
}

Well, that wasn't so bad, was it? Nah, I've done things more complicated with for loop statements. Let's see what's going on here. First, let me note that the method takes an int as a parameter type, and then I create an IntPtr from it. This stems from the fact that IntPtrs are serializable, and this source code was taken from a larger project that just so happened to be communicating over a network connection, which required me to make objects that were serializable. Well, I got lazy and made a property that was an int, and a method that took that int property, so you're stuck with it.

Once you have the handle to the window you want to capture, we'll first figure out how big it is. You can do this using the GetWindowRect API. Using that rectangle, we can create a bitmap just large enough to hold the window's image, using a straight one-to-one copy. We'll not rely on StretchBlt this time, but the old standard BitBlt. It's faster, and for this straight copy, it's all we need.

From the bitmap we'll create a graphics object. We'll use that graphics object's device context, and one we can acquire from the window using its window handle, to copy the contents of the window's device context directly to the bitmap. Once we've done that, a few lines to release the device contexts that we acquired, and another to dispose of that graphics object, and we're home free. We've just created an image of that window, stored in the bitmap. Pretty slick, eh? Yeah, it's been done a thousand times, and I'll probably get more heat for acting like I came up with this. Obviously I didn't, I just wanted to show you how to do it.

Capturing a Window's Icon as an Image

Finally, the last little fun topic. How does the Windows Task Manager display the icon a window displays in its title bar? This one is pretty easy. We'll send the window the WM_GETICON message, and wait for it to hand a handle to its icon back to us. If that doesn't work, I've delved deep into the heart of Litestep, and implemented their technique for backup icon retrieval. Let's look at the code for snagging a window's icon as an image.

public static Bitmap GetWindowSmallIconAsBitmap(int handle)
{
    try
    {
        int result;
        IntPtr hWnd = new IntPtr(handle);

        // first, ask the window itself for its small icon
        Win32.SendMessageTimeout(
            hWnd,
            Win32.WM_GETICON,
            Win32.ICON_SMALL,
            0,
            Win32.SMTO_ABORTIFHUNG,
            1000, out result);

        IntPtr hIcon = new IntPtr(result);

        // second, fall back to the small icon stored in the window's class data
        if (hIcon == IntPtr.Zero)
        {
            result = Win32.GetClassLong(hWnd, Win32.GCL_HICONSM);
            hIcon = new IntPtr(result);
        }

        // third, ask the window for the icon it would show while being dragged
        if (hIcon == IntPtr.Zero)
        {
            Win32.SendMessageTimeout(
                hWnd,
                Win32.WM_QUERYDRAGICON,
                0,
                0,
                Win32.SMTO_ABORTIFHUNG,
                1000, out result);
            hIcon = new IntPtr(result);
        }

        if (hIcon == IntPtr.Zero)
            return null;
        else
            return Bitmap.FromHicon(hIcon);
    }
    catch (Exception)
    {
        // any failure is treated as "no icon available"
    }
    return null;
}

First thing to note is the use of the SendMessageTimeout API. You want to be careful here, as the window might not respond with an answer. It might be locked up for whatever reason, and then you are screwed. Your app, in turn, will be screwed. So we'll send the message and specify a timeout to save us if the window doesn't respond in that amount of time. If that fails, we'll try snagging a handle to the icon from the window's class information. Each window stores handles to things in its class data. You can query that with the GetClassLong API. And finally, if both fail, we'll try asking the window for the icon it'd display if it were being dragged about. Again, for the last two methods, credit must go to the Litestep development team. They are super smart, and thought that one up. I just ported it to C#.
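The method above stores handles in an int, which was fine on the 32-bit systems it was written for. Here's a hedged sketch of the same three-step fallback using pointer-sized values, for anyone running in a 64-bit process. The class name WindowIcons and the helpers FirstNonZero, Ask and ClassIcon are my own inventions, and the P/Invoke declarations are my assumptions rather than the ones in the sample's Win32.cs. Unlike the original, this version also checks SendMessageTimeout's return value, which is zero on failure or timeout.

```csharp
using System;
using System.Drawing;
using System.Runtime.InteropServices;

public static class WindowIcons
{
    const uint WM_GETICON = 0x007F;
    const uint WM_QUERYDRAGICON = 0x0037;
    const uint SMTO_ABORTIFHUNG = 0x0002;
    const int ICON_SMALL = 0;
    const int GCL_HICONSM = -34;

    [DllImport("user32.dll", EntryPoint = "SendMessageTimeoutW")]
    static extern IntPtr SendMessageTimeout(IntPtr hWnd, uint msg, IntPtr wParam,
        IntPtr lParam, uint flags, uint timeoutMs, out IntPtr result);

    [DllImport("user32.dll", EntryPoint = "GetClassLongW")]
    static extern uint GetClassLong32(IntPtr hWnd, int index);

    // Only exported by the 64-bit user32.dll.
    [DllImport("user32.dll", EntryPoint = "GetClassLongPtrW")]
    static extern IntPtr GetClassLongPtr64(IntPtr hWnd, int index);

    // Returns the first non-zero handle produced by the sources, tried in order.
    public static IntPtr FirstNonZero(params Func<IntPtr>[] sources)
    {
        foreach (Func<IntPtr> source in sources)
        {
            IntPtr value = source();
            if (value != IntPtr.Zero) return value;
        }
        return IntPtr.Zero;
    }

    static IntPtr Ask(IntPtr hWnd, uint msg, int wParam)
    {
        IntPtr result;
        IntPtr ok = SendMessageTimeout(hWnd, msg, new IntPtr(wParam), IntPtr.Zero,
            SMTO_ABORTIFHUNG, 1000, out result);
        return ok == IntPtr.Zero ? IntPtr.Zero : result;   // zero: failed or timed out
    }

    static IntPtr ClassIcon(IntPtr hWnd)
    {
        return IntPtr.Size == 8
            ? GetClassLongPtr64(hWnd, GCL_HICONSM)
            : new IntPtr(unchecked((int)GetClassLong32(hWnd, GCL_HICONSM)));
    }

    public static Bitmap GetSmallIcon(IntPtr hWnd)
    {
        IntPtr hIcon = FirstNonZero(
            () => Ask(hWnd, WM_GETICON, ICON_SMALL),      // 1. ask the window
            () => ClassIcon(hWnd),                        // 2. look in the class data
            () => Ask(hWnd, WM_QUERYDRAGICON, 0));        // 3. ask for the drag icon
        return hIcon == IntPtr.Zero ? null : Bitmap.FromHicon(hIcon);
    }
}
```

Writing the fallbacks as a list of delegates keeps the order of attempts in one place, and later sources are never called once an earlier one has produced a handle.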

Again, this method takes a handle to a window, and then returns an image containing that window's icon. If you wanted to snag all of the icons for every top level window in the system, try using the EnumWindows API. Creating a task manager is beyond the scope of this article, so I just passed the main form's window handle to this method, and retrieved my own icon. Try different window handles, break out Spy++ to find them, or code in some other retrieval methods like FindWindow or FindWindowEx if you are after a single window. If you want advice on that, I'll be happy to help, just give me a yell. I've written a task manager and a shell replacement, so I know my way around when it comes to getting what I want from a window handle.
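For the EnumWindows route, something along these lines should get you started. It's a sketch under my own assumptions: the class name TopLevelWindows and the choice to filter with IsWindowVisible are mine, and the declarations are written out here rather than taken from the sample's Win32.cs.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class TopLevelWindows
{
    delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

    [DllImport("user32.dll")]
    static extern bool EnumWindows(EnumWindowsProc callback, IntPtr lParam);

    [DllImport("user32.dll")]
    static extern bool IsWindowVisible(IntPtr hWnd);

    // Collects the handles of all visible top-level windows.
    public static List<IntPtr> GetVisible()
    {
        List<IntPtr> handles = new List<IntPtr>();
        EnumWindowsProc callback = delegate(IntPtr hWnd, IntPtr lParam)
        {
            if (IsWindowVisible(hWnd)) handles.Add(hWnd);
            return true;   // returning true keeps the enumeration going
        };
        EnumWindows(callback, IntPtr.Zero);
        GC.KeepAlive(callback);   // keep the delegate alive for the whole native call
        return handles;
    }
}
```

Each handle in the returned list can then be handed to GetWindowSmallIconAsBitmap (via ToInt32, since that method takes an int) to build up a task-manager style list of icons.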

Spy++ Style Window Finder

Ok, this isn't the first time you've seen it, and it probably won't be the last. I didn't think it up, so I really don't deserve a lot of credit. This is just also cool. A few comments from the first posting of this article motivated me to include a window finder tool, just like Spy++.

So anyway, here's what's up. Just open the spy window, and left click and drag out the finder tool over any window. If you haven't figured it out by now, everything you see is a window. Whether it's a button or a ListView, it's got a window handle. I guess that's why it's called Windows, and not GUIWidgets or something, eh?

The code is pretty simple. It just uses the SetCapture/ReleaseCapture APIs to snag the mouse movements in or out of the spy window, as long as one of the mouse buttons is down. I went with the standard left-click-drag-over-a-window technique most of us are familiar with because of Spy++. I tried to mimic it as closely as I could, but hey, this was a quick hack in the last two nights, so don't flame the crap out of me because it's not perfect.
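To give a feel for the drag-to-find technique, here's a rough Windows Forms sketch. It isn't the sample's implementation: FinderTool and its two events are hypothetical names, I'm using the Control.Capture property (the Windows Forms wrapper over SetCapture/ReleaseCapture) instead of calling those APIs directly, and WindowFromPoint is my assumption for how to look up the window under the cursor.

```csharp
using System;
using System.Drawing;
using System.Runtime.InteropServices;
using System.Windows.Forms;

public class FinderTool : PictureBox
{
    [StructLayout(LayoutKind.Sequential)]
    struct POINT { public int X; public int Y; }

    [DllImport("user32.dll")]
    static extern IntPtr WindowFromPoint(POINT pt);

    // Raised whenever the window under the cursor changes during a drag.
    public event Action<IntPtr> WindowHovered;
    // Raised with the last hovered handle when the button is released.
    public event Action<IntPtr> WindowSelected;

    bool dragging;
    IntPtr current = IntPtr.Zero;

    protected override void OnMouseDown(MouseEventArgs e)
    {
        base.OnMouseDown(e);
        if (e.Button != MouseButtons.Left) return;
        dragging = true;
        Capture = true;            // keep receiving mouse moves outside our bounds
        Cursor = Cursors.Cross;
    }

    protected override void OnMouseMove(MouseEventArgs e)
    {
        base.OnMouseMove(e);
        if (!dragging) return;
        Point screen = Control.MousePosition;   // cursor position in screen coordinates
        POINT pt;
        pt.X = screen.X;
        pt.Y = screen.Y;
        IntPtr hWnd = WindowFromPoint(pt);
        if (hWnd == current) return;            // still over the same window
        current = hWnd;
        if (WindowHovered != null) WindowHovered(hWnd);
    }

    protected override void OnMouseUp(MouseEventArgs e)
    {
        base.OnMouseUp(e);
        if (!dragging) return;
        dragging = false;
        Capture = false;
        Cursor = Cursors.Default;
        if (WindowSelected != null) WindowSelected(current);
    }
}
```

A form hosting this control could subscribe to WindowSelected and pass the handle straight to GetWindowCaptureAsBitmap.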

The window highlighting was pretty simple too, just snagged the device context from the window under the cursor, and drew a rectangle around it. When the highlighting is done, it just forces the window to redraw itself with a few other handy APIs.
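Here's roughly what that can look like. Treat it as a sketch with assumptions spelled out: the article doesn't name the "other handy APIs", so the use of RedrawWindow below is my guess at one way to force the repaint, and WindowHighlighter, GetFrame, the red color and the 3-pixel pen are all illustrative choices of mine.

```csharp
using System;
using System.Drawing;
using System.Runtime.InteropServices;

public static class WindowHighlighter
{
    [StructLayout(LayoutKind.Sequential)]
    struct RECT { public int Left, Top, Right, Bottom; }

    const uint RDW_INVALIDATE = 0x0001;
    const uint RDW_ERASE = 0x0004;
    const uint RDW_ALLCHILDREN = 0x0080;
    const uint RDW_UPDATENOW = 0x0100;
    const uint RDW_FRAME = 0x0400;
    const int PenWidth = 3;

    [DllImport("user32.dll")]
    static extern bool GetWindowRect(IntPtr hWnd, out RECT rect);
    [DllImport("user32.dll")]
    static extern IntPtr GetWindowDC(IntPtr hWnd);
    [DllImport("user32.dll")]
    static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);
    [DllImport("user32.dll")]
    static extern bool RedrawWindow(IntPtr hWnd, IntPtr rect, IntPtr region, uint flags);

    // Rectangle to stroke, in window-DC coordinates, so that a pen of the
    // given width stays entirely inside a window of the given size.
    public static Rectangle GetFrame(int width, int height, int penWidth)
    {
        int inset = penWidth / 2;
        return new Rectangle(inset, inset, width - penWidth, height - penWidth);
    }

    // Draws a frame directly onto the window's own device context.
    public static void Highlight(IntPtr hWnd)
    {
        RECT rc;
        if (!GetWindowRect(hWnd, out rc)) return;
        IntPtr hdc = GetWindowDC(hWnd);        // acquired here...
        if (hdc == IntPtr.Zero) return;
        try
        {
            using (Graphics g = Graphics.FromHdc(hdc))
            using (Pen pen = new Pen(Color.Red, PenWidth))
            {
                g.DrawRectangle(pen,
                    GetFrame(rc.Right - rc.Left, rc.Bottom - rc.Top, PenWidth));
            }
        }
        finally
        {
            ReleaseDC(hWnd, hdc);              // ...so released here
        }
    }

    // Asks the window, its frame and its children to repaint, wiping the frame.
    public static void Unhighlight(IntPtr hWnd)
    {
        RedrawWindow(hWnd, IntPtr.Zero, IntPtr.Zero,
            RDW_INVALIDATE | RDW_ERASE | RDW_FRAME | RDW_ALLCHILDREN | RDW_UPDATENOW);
    }
}
```

The window DC's origin is the window's top-left corner, including its title bar and borders, which is why the frame is computed from (0, 0) rather than from the screen coordinates GetWindowRect returns.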

The window highlighting was by far the most interesting bit of the code, as Spy++ has always fascinated me, leaving me with countless hours of my coding life spent "highlighting" windows around on the desktop just to see that stupid rectangle show me where the window was that I was hovering over. Try it out, it's pretty fun for some reason to get a look at how various windows are composed of child windows.

Understanding the Sample

The sample code provides a mixture of GDI and GDI+ methods. All of the Windows APIs have been declared in the Win32.cs class. Stay out of there if you are squeamish. Nothing but declarations and other fun things for the guys that like that sort of thing.
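If you'd rather not open Win32.cs at all, here's a sketch of what declarations along these lines could look like for the calls used in this article. These are my own reconstructions, shaped to match how the article's code calls them, not a copy of the sample's file; the class name Win32Sketch is made up to make that obvious.

```csharp
using System;
using System.Runtime.InteropServices;

public static class Win32Sketch
{
    // Matches the native RECT layout; Width and Height are conveniences.
    [StructLayout(LayoutKind.Sequential)]
    public struct Rect
    {
        public int Left, Top, Right, Bottom;
        public int Width { get { return Right - Left; } }
        public int Height { get { return Bottom - Top; } }
    }

    public enum TernaryRasterOperations : uint
    {
        SRCCOPY = 0x00CC0020   // copy the source straight to the destination
    }

    // Created DCs must be destroyed with DeleteDC.
    [DllImport("gdi32.dll", EntryPoint = "CreateDCW", CharSet = CharSet.Unicode)]
    public static extern IntPtr CreateDC(IntPtr driver, string device,
        IntPtr output, IntPtr initData);

    [DllImport("gdi32.dll")]
    public static extern bool DeleteDC(IntPtr hdc);

    [DllImport("gdi32.dll", SetLastError = true)]
    public static extern bool BitBlt(IntPtr hdcDest, int xDest, int yDest,
        int width, int height, IntPtr hdcSrc, int xSrc, int ySrc, int rop);

    [DllImport("gdi32.dll", SetLastError = true)]
    public static extern bool StretchBlt(IntPtr hdcDest, int xDest, int yDest,
        int widthDest, int heightDest, IntPtr hdcSrc, int xSrc, int ySrc,
        int widthSrc, int heightSrc, int rop);

    [DllImport("user32.dll")]
    public static extern bool GetWindowRect(IntPtr hWnd, ref Rect rect);

    // Acquired DCs must be released with ReleaseDC.
    [DllImport("user32.dll")]
    public static extern IntPtr GetWindowDC(IntPtr hWnd);

    [DllImport("user32.dll")]
    public static extern int ReleaseDC(IntPtr hWnd, IntPtr hdc);
}
```

Note how the names line up with the rule from earlier: CreateDC pairs with DeleteDC, and GetWindowDC pairs with ReleaseDC.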

The real fun code is in the ScreenCapturing class. If you want to play around, try changing the main form's icon and see what is returned. Or try saving the images to a file. The main focus of the article was to help you understand how taking screen captures can be accomplished. For some, this is no big news. For others just starting out, this kind of thing is big fun, but when I was starting out, it always seemed like I couldn't find any good examples to learn from. I hope this will shed some light on the subject if it was dark and mysterious before!
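If you want to try the save-to-file idea, a helper like the following is one way to go about it. CaptureFiles and SaveAsPng are hypothetical names of mine, and PNG is just my pick of format because it's lossless.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

public static class CaptureFiles
{
    // Writes the bitmap as a PNG file and returns the full path of that file.
    public static string SaveAsPng(Bitmap bitmap, string folder, string name)
    {
        if (bitmap == null) throw new ArgumentNullException("bitmap");
        Directory.CreateDirectory(folder);   // no-op if the folder already exists
        string path = Path.Combine(folder, name + ".png");
        bitmap.Save(path, ImageFormat.Png);
        return path;
    }
}

// Possible usage, assuming the capture methods live in the ScreenCapturing class:
//
//   using (Bitmap shot = ScreenCapturing.GetDesktopWindowCaptureAsBitmap())
//   {
//       string file = CaptureFiles.SaveAsPng(shot, Path.GetTempPath(), "desktop");
//   }
```

Remember that the window and icon capture methods can return null, so check the result before handing it to a helper like this.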

Enjoy the code, and give me a shout if you like it! Maybe even a vote or two, a little thanks goes a long way in motivating me to write more articles! Thanks for reading!

History

Sometime in February 2005 I posted the article.
Sometime in March 2005 I updated it.