
Image Capture Whole Web Page using C#

By Douglas M. Weems, 22 Jun 2005
 

Sample Image - capture.gif

Introduction

This article presents a C# routine for capturing an entire web page as an image. Many capture examples show how to grab a screenshot, but not how to capture content that lies below the visible scrolling region of an application. The most common example of a scrolling or “run-over” program is a web page.

This application grabs the page, plus, as a bonus, it demonstrates how to let the client adjust the size of the image and the quality of the JPEG. It shows how to write the name of the webpage onto the image, draw Standard Resolution Guides, save a bitmap as a JPEG and open the directory where the captures are stored.

Background

In a recent application, I wanted to provide our Quality Assurance testers the ability to capture an entire web page. I wanted them to do this by clicking a button from within a BHO (Browser Helper Object) that is used for another testing task. I also wanted to reduce the size of the capture, because the images are e-mailed and can quickly fill up our mailbox quotas.

Using the code

The easiest way to use this code is to download the source and trim out any functions that are not wanted (quality of capture, size of image, URL writing, guides, or the open directory function). Once the code is trimmed down and compiles without errors, copy the source and its dependencies into the desired project.

The first issue to face when copying the source code into a project is the need to reference SHDocVw.dll and MSHTML.dll. In Visual Studio, go to Project, Add Reference, and then select the COM tab. Scroll down to the Microsoft section and look for "Microsoft Internet Controls". Select it, and then find and select "Microsoft HTML Object Library" (see the above image).

After adding the references, add these necessary directives into the project. (A few other directives are needed, if the code is not loaded into a form.)

using System.Text;
using System.Runtime.InteropServices;
using System.Diagnostics;
using System.IO;
using System.Drawing.Imaging;
using SHDocVw;
using mshtml;

Import user32 functions

[DllImport("user32.dll", CharSet=CharSet.Auto)]
public static extern IntPtr FindWindowEx(IntPtr parent /*HWND*/, 
  IntPtr next /*HWND*/, string sClassName, IntPtr sWindowTitle);

[DllImport("user32.dll", ExactSpelling=true, CharSet=CharSet.Auto)] 
public static extern IntPtr GetWindow(IntPtr hWnd, int uCmd); 

[DllImport("user32.Dll")]
public static extern void GetClassName(int h, StringBuilder s, int nMaxCount);

[DllImport("user32.dll")]
private static extern bool PrintWindow(IntPtr hwnd, IntPtr hdcBlt, uint nFlags);

public const int GW_CHILD = 5; 
public const int GW_HWNDNEXT = 2;

Find an open browser and assign a browser document for it.

 SHDocVw.WebBrowser m_browser = null;
 SHDocVw.ShellWindows shellWindows = new SHDocVw.ShellWindowsClass();
 
 //Find the first available browser window.
 //The application can easily be modified to loop through and 
 //capture all open windows.
 string filename;
 foreach (SHDocVw.WebBrowser ie in shellWindows)
 {
     filename = Path.GetFileNameWithoutExtension(ie.FullName).ToLower();
     if (filename.Equals("iexplore"))
     {
         m_browser = ie;
         break;  
     }
 }
 if (m_browser == null)
 {   
     MessageBox.Show("No Browser Open");
     return;
 }

 //Assign Browser Document
 mshtml.IHTMLDocument2 myDoc = (mshtml.IHTMLDocument2)m_browser.Document;

The width and height of the web page must be determined, along with the size of the visible area on the client's screen.

 //Set scrolling on.
 myDoc.body.setAttribute("scroll", "yes", 0);
 
 //Get the full size of the page, including the part that must be scrolled to
 int heightsize = (int)myDoc.body.getAttribute("scrollHeight", 0);
 int widthsize = (int)myDoc.body.getAttribute("scrollWidth", 0);
 
 //Get the size of the area that is visible in the browser window
 int screenHeight = (int)myDoc.body.getAttribute("clientHeight", 0);
 int screenWidth = (int)myDoc.body.getAttribute("clientWidth", 0);

To capture the whole web page, fragments of the page have to be grabbed and stitched together. After the first fragment is captured, the browser is scrolled to the next position for the next capture. As the fragments are captured, they are drawn into a target bitmap. The process is repeated until the whole page is captured. For pages that are wider than the client's screen, the page is scrolled horizontally, and then the above process is repeated.
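The scrolling arithmetic behind this stitching can be looked at on its own. The following is only a sketch, not the article's code: the class name TilePlanner and the method Offsets are hypothetical, and it assumes the browser clamps the last scroll position to the end of the page. The article's routine reads the real position back from scrollTop and scrollLeft instead of computing it. The 5-pixel overlap mirrors the "screenHeight - 5" used in the capture loop.

```csharp
using System;
using System.Collections.Generic;

public static class TilePlanner
{
    // Returns the scroll offsets needed to cover pageSize pixels with a
    // viewport of viewSize pixels, overlapping neighbouring tiles by
    // overlap pixels. All names here are illustrative assumptions.
    public static List<int> Offsets(int pageSize, int viewSize, int overlap)
    {
        if (viewSize <= overlap)
            throw new ArgumentException("viewSize must be larger than overlap");

        var offsets = new List<int>();
        // Assumption: the browser cannot scroll past this position.
        int maxOffset = Math.Max(0, pageSize - viewSize);
        int step = viewSize - overlap;

        for (int offset = 0; offset < maxOffset; offset += step)
            offsets.Add(offset);

        // The final tile is clamped so that it ends at the end of the page.
        offsets.Add(maxOffset);
        return offsets;
    }
}
```

Calling TilePlanner.Offsets once with the page height and once with the page width gives the vertical and horizontal positions; every pair of positions is one fragment to capture and draw at that offset in the target bitmap.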

 //Get bitmap to hold screen fragment.
 Bitmap bm = new Bitmap(screenWidth, screenHeight, 
    System.Drawing.Imaging.PixelFormat.Format16bppRgb555);
 
 //Create a target bitmap to draw into.
 Bitmap bm2 = new Bitmap(widthsize + URLExtraLeft, heightsize + 
    URLExtraHeight - trimHeight, 
         System.Drawing.Imaging.PixelFormat.Format16bppRgb555);
 Graphics g2 = Graphics.FromImage(bm2);
 
 Graphics g = null;
 IntPtr hdc;
 Image screenfrag = null;
 int brwTop = 0;
 int brwLeft = 0;
 int myPage = 0;
 IntPtr myIntptr = (IntPtr)m_browser.HWND;
 
 //Get inner browser window.
 int hwndInt = myIntptr.ToInt32();
 IntPtr hwnd = myIntptr;
 hwnd = GetWindow(hwnd, GW_CHILD); 
 StringBuilder sbc = new StringBuilder(256);
 
 //Get Browser "Document" Handle
 while (hwndInt != 0) 
 { 
     hwndInt = hwnd.ToInt32();
     GetClassName(hwndInt, sbc, 256);
 
     if(sbc.ToString().IndexOf("Shell DocObject View", 0) > -1)
     {
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, 
             "Internet Explorer_Server", IntPtr.Zero);
         break;
     }                
     hwnd = GetWindow(hwnd, GW_HWNDNEXT);
  } 
 
 //Count the vertical pages needed (the page is drawn from the bottom up)
 while ((myPage * screenHeight) < heightsize)
 {
     myDoc.body.setAttribute("scrollTop", (screenHeight - 5) * myPage, 0);
     ++myPage;
 }
 
 //Rollback the page count by one
 --myPage;
 
 int myPageWidth = 0;
 while ((myPageWidth * screenWidth) < widthsize)
 {
     myDoc.body.setAttribute("scrollLeft", (screenWidth - 5) * myPageWidth, 0);
     brwLeft = (int)myDoc.body.getAttribute("scrollLeft", 0);
     for (int i = myPage; i >= 0; --i)
     {
         //Shoot visible window
         g = Graphics.FromImage(bm);
         hdc = g.GetHdc();
         myDoc.body.setAttribute("scrollTop", (screenHeight - 5) * i, 0);
         brwTop = (int)myDoc.body.getAttribute("scrollTop", 0);
         PrintWindow(hwnd, hdc, 0);
         g.ReleaseHdc(hdc);
         g.Flush();
         g.Dispose();
         //Draw the fragment directly. Calling bm.GetHbitmap() here would
         //create a GDI handle on every pass that is never released.
         g2.DrawImage(bm, brwLeft + URLExtraLeft, brwTop + 
            URLExtraHeight);
     }
     ++myPageWidth;
 }

Finally, save the target bitmap to a time-stamped JPEG file.
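The saving step is not listed in the article, so the following is only a sketch of one way to do it with the standard System.Drawing.Imaging encoder classes. The class name CaptureSaver, the "capture_" file name pattern and the directory parameter are assumptions made for illustration and are not taken from the downloadable source.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Globalization;
using System.IO;

public static class CaptureSaver
{
    // Builds a name such as "capture_20050622_143005.jpg".
    // The naming pattern is an assumption, not the article's.
    public static string TimeStampedName(DateTime when)
    {
        return "capture_" +
            when.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture) + ".jpg";
    }

    // Saves the bitmap as a JPEG. For the JPEG encoder, quality runs
    // from 0 (smallest file) to 100 (best image).
    public static string SaveJpeg(Bitmap image, string directory, long quality)
    {
        // Look up the installed JPEG encoder.
        ImageCodecInfo jpegCodec = null;
        foreach (ImageCodecInfo codec in ImageCodecInfo.GetImageEncoders())
        {
            if (codec.FormatID == ImageFormat.Jpeg.Guid)
            {
                jpegCodec = codec;
                break;
            }
        }
        if (jpegCodec == null)
            throw new InvalidOperationException("No JPEG encoder is installed.");

        string path = Path.Combine(directory, TimeStampedName(DateTime.Now));
        using (var parameters = new EncoderParameters(1))
        {
            parameters.Param[0] = new EncoderParameter(
                System.Drawing.Imaging.Encoder.Quality, quality);
            image.Save(path, jpegCodec, parameters);
        }
        return path;
    }
}
```

With the target bitmap from the capture loop, a call could look like CaptureSaver.SaveJpeg(bm2, captureDirectory, 60L), where captureDirectory is whatever folder the captures are kept in. Lower quality values trade image detail for a smaller file, which is the trade-off that matters when the captures are e-mailed.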

Points of Interest

I had a lot of fun and suffered a lot of frustration with this project. The captures are really nice. Try it out on one of the "Code Project" pages.

Not shown in this article, but available in the source is the saving of the file to JPEG. I tried GIF and bitmap, but settled on JPEG for size. The main goal was to be able to e-mail these files without taking up a lot of our mailbox quota.

In the actual application, I have an option to copy the file to the clipboard. I was never able to get the clipboard image into a "device dependent bitmap" state that didn't take up much space. I would copy the image and then paste it into my Outlook e-mail, only to have the e-mail be about 1 MB in size. When I opened the JPEG in Photoshop, selected it, copied it and pasted it into Outlook, the Adobe device dependent bitmap was under 100 KB. The same happened with the simple Windows Paintbrush application.

Because of time constraints, I settled on just copying the JPEG file to Outlook. Any solutions on how to turn a large device independent bitmap into a bitmap with a small memory footprint would be welcomed.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Douglas M. Weems
Web Developer
Canada
Member
My experience with programming began with Turbo Pascal while working on my Physics degree back in 1989.
 
After getting out of school, I used pre-VBA Excel macros to write some really fancy applications to help with the job I was doing. This inspired me to try to write "Windows" programs and to search out Visual Basic 3.0.
 
I wrote a bunch of small applications and ran them against Access and FoxPro. However, this still wasn't my primary job.
 
In 1994, I went on my first contract, a 3-month deal that turned into 3 years. I learned a lot more about development. Development was in VB3, VB4 and ASP. I got a chance to admin NT4 and SQL Server 6 and 6.5.
 
After moving on to another company, I spent another 2 years with VB and then 5 years with Java and JSP.
 
In March of 2004, I installed Visual Studio 2003. I tasted C#, and became hopelessly addicted.
 
My other interests are my 3 sons, my wife :), metal detecting, yard work, travel and learning new things.
 
location: Atlanta, Georgia


Comments and Discussions

 
Question: Not working in IE 8 (member ravindrasinghbhati, 26 Mar '13 - 0:21)
It is not working properly in IE 8.
Question: SEE HERE HOW TO MAKE IT WORK ON LATER VERSIONS OF IE (member shay_e, 26 Oct '12 - 3:53)
Works on IE9:
 
How to capture Internet Explorer ScreenShot using C#?
General: My vote of 5 (member manoj kumar choubey, 26 Feb '12 - 21:28)
Nice
Question: Bad Image in Windows 7 with Visual Studio 2010 (member Youzelin, 15 Feb '12 - 22:00)
Hi,
 
Thank you for sharing! I downloaded the source code and converted it into a Visual Studio 2010 project. When I run the project, click the capture button and go to the directory, I just see a very LARGE BLACK IMAGE.
Answer: Re: Bad Image in Windows 7 with Visual Studio 2010 (member Sabarinathan Arthanari, 21 Jun '12 - 3:03)
The same error happens on Windows Vista 32-bit with Visual Studio 2010.
Sabarinathan Arthanari
As a child of God (Truth/Love), I am greater than anything that can happen to me -Dr APJ Abdul Kalam.

General: My vote of 5 (member nagendrasharma, 28 Nov '11 - 23:32)
This is an excellent post. Can you please let me know whether the same thing is possible for other browsers like Firefox and Chrome?
Question: New Problem Just Started (member Ron Mittelman, 8 Nov '11 - 6:38)
All of a sudden, the code just stopped working properly. The resulting image is the proper height, but instead of giving the entire web page as an image, it just gives the first screen full, with the balance of the image height being black.
 
While stepping through the code, it appears that
myDoc.body.setAttribute("scrollTop", (screenHeight - 5) * i, 0);
just stopped working. The IE window does not visibly scroll around the way it used to.
 
This just started happening after I installed an update to Adobe Flash and/or Reader (I did both of them, so don't know which one caused the problem).
 
Does anybody have an idea why that would stop working?
 
Thanks...
Question: Black Pics Redux (member Ron Mittelman, 25 Oct '11 - 6:20)
Can anyone explain the black picture issue? Has it been solved? I tried various settings of percentages with no luck. Thanks...
Answer: Re: Black Pics Redux (member Ron Mittelman, 8 Nov '11 - 6:32)
Found the solution to the black screen problem after some googling.
 
Replace this:
 //Get Browser "Document" Handle
 while (hwndInt != 0) 
 { 
     hwndInt = hwnd.ToInt32();
     GetClassName(hwndInt, sbc, 256);
 
     if(sbc.ToString().IndexOf("Shell DocObject View", 0) > -1)
     {
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, 
             "Internet Explorer_Server", IntPtr.Zero);
         break;
     }                
     hwnd = GetWindow(hwnd, GW_HWNDNEXT);
  } 
with this:
 //Get Browser "Document" Handle
 while (hwndInt != 0) 
 { 
     hwndInt = hwnd.ToInt32();
     GetClassName(hwndInt, sbc, 256);
 
     if (sbc.ToString().IndexOf("Shell DocObject View", 0) > -1) // pre-IE7
     {
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, "Internet Explorer_Server", IntPtr.Zero);
         break;
     }
 
     if (sbc.ToString().IndexOf("TabWindowClass", 0) > -1) // IE7
     {
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, "Shell DocObject View", IntPtr.Zero);
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, "Internet Explorer_Server", IntPtr.Zero);
         break;
     }                
 
     if (sbc.ToString().IndexOf("Frame Tab", 0) > -1) // IE8
     {
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, "TabWindowClass", IntPtr.Zero);
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, "Shell DocObject View", IntPtr.Zero);
         hwnd = FindWindowEx(hwnd, IntPtr.Zero, "Internet Explorer_Server", IntPtr.Zero);
         break;
     }                
     hwnd = GetWindow(hwnd, GW_HWNDNEXT);
  } 
 
Only tested on IE 7, so may require some tweaking...
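
The three branches above differ only in the chain of window class names that is walked with FindWindowEx. As a hedged sketch, that chain can be expressed as data. The names WindowPath and Descend are hypothetical, and the lookup is passed in as a delegate so the walking logic does not depend on user32.dll; the sketch also stops early when a level is not found, which the code above does not check.

```csharp
using System;

public static class WindowPath
{
    // Follows a chain of window class names, one lookup per level.
    // findChild is expected to behave like FindWindowEx: it returns the
    // child of the given parent with the given class name, or IntPtr.Zero
    // when there is none.
    public static IntPtr Descend(IntPtr start,
        Func<IntPtr, string, IntPtr> findChild, params string[] classPath)
    {
        IntPtr current = start;
        foreach (string className in classPath)
        {
            // Stop as soon as one level of the chain is missing.
            if (current == IntPtr.Zero)
                return IntPtr.Zero;
            current = findChild(current, className);
        }
        return current;
    }
}
```

With the article's FindWindowEx import, the IE8 branch could then be written as WindowPath.Descend(hwnd, (p, c) => FindWindowEx(p, IntPtr.Zero, c, IntPtr.Zero), "TabWindowClass", "Shell DocObject View", "Internet Explorer_Server"). A result of IntPtr.Zero means the expected window hierarchy was not found, which is worth checking before calling PrintWindow.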
Question: Can this be used in an ASP page or only WinForms? (member WhiskeyBusiness, 12 Sep '11 - 11:16)
Hi Douglas,
Great article! One question, though: can this be used in an ASP page or only in WinForms?
Thanks,
General: Black Pics. (member I-wa-n, 27 Mar '11 - 9:43)
Hi guys, when I run this app I only get black pictures, in the size of the web page (e.g. at Google and so on).
IE6
 
iwan
General: Re: Black Pics. (member I-wa-n, 27 Mar '11 - 9:44)
Does someone have a suggestion as to what the reason might be?
General: Re: Black Pics. (member charles henington, 31 May '11 - 14:48)
I'm not sure why it would be giving you black images. First, I have not gone over the documentation closely, and second, I use a different method that uses WebBrowser, not SHDocVw. The method that I downloaded was originally on the MSDN site but had a fixed size of something like 1024 x 768. I worked out the kinks in the sizing by using HtmlDocument to get scrollWidth + 25 as the width and scrollHeight as the height.
 
namespace Chico
{
	using System;
	using System.Drawing;
	using System.Threading;
	using System.Windows.Forms;
    using MSHTML;
    using System.Runtime.InteropServices;
 
    [StructLayout(LayoutKind.Sequential, Pack = 4)]
    public struct Rect
    {
        public int Left;
 
        public int Top;
 
        public int Right;
 
        public int Bottom;
    }
 
    public static class NativeMethods
    {
        private const int SM_CXVSCROLL = 2;
 
        [ComImport]
        [Guid("0000010D-0000-0000-C000-000000000046")]
        [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
        private interface IViewObject
        {
            void Draw([MarshalAs(UnmanagedType.U4)] uint dwAspect, int lindex, IntPtr pvAspect, [In] IntPtr ptd, IntPtr hdcTargetDev, IntPtr hdcDraw, [MarshalAs(UnmanagedType.Struct)] ref Rect lprcBounds, [In] IntPtr lprcWBounds, IntPtr pfnContinue, [MarshalAs(UnmanagedType.U4)] uint dwContinue);
        }
 
        public static int GetSystemMetrics()
        {
            return GetSystemMetrics(SM_CXVSCROLL);
        }
 
        [DllImport("user32.dll")]
        public static extern int GetSystemMetrics(int smIndex);
 
        public static void GetImage(object obj, Image destination, Color backgroundColor)
        {
            using (var graphics = Graphics.FromImage(destination))
            {
                var deviceContextHandle = IntPtr.Zero;
                var rectangle =
                    new Rect
                    {
                        Right = destination.Width,
                        Bottom = destination.Height
                    };
 
                graphics.Clear(backgroundColor);
                try
                {
                    deviceContextHandle = graphics.GetHdc();
 
                    var viewObject = obj as IViewObject;
                    viewObject.Draw(1, -1, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, deviceContextHandle, ref rectangle, IntPtr.Zero, IntPtr.Zero, 0);
                }
                finally
                {
                    if (deviceContextHandle != IntPtr.Zero)
                    {
                        graphics.ReleaseHdc(deviceContextHandle);
                    }
                }
            }
        }
    }
	public class HtmlToBitmapConverter : IDisposable
	{
		private const int SleepTimeMiliseconds = 5000;
        internal static Bitmap Output;
        
		public Bitmap Render(Uri uri, Size size)
		{
			var browser = CreateBrowser(size);            
			NavigateAndWaitForLoad(browser, uri, 0);
            Output = GetBitmapFromControl(browser, size);
            return Output;			
		}
 
		private void NavigateAndWaitForLoad(WebBrowser browser, Uri uri, int waitTime)
		{
			browser.Navigate(uri);
			var count = 0;
 
			while (browser.ReadyState != WebBrowserReadyState.Complete)
			{
				Thread.Sleep(SleepTimeMiliseconds);
				
				Application.DoEvents();
				count++;
				
				if (count > waitTime / SleepTimeMiliseconds)
				{
					break;
				}
			}
 
			// Document itself can still be null here, so check it before Body.
			while (browser.Document == null || browser.Document.Body == null)
			{
				Application.DoEvents();
			}
 
			HideScrollBars(browser);
		}
 
		private void HideScrollBars(WebBrowser browser)
		{
			const string Hidden = "hidden";
			var document = (IHTMLDocument2)browser.Document.DomDocument;
			var style = (IHTMLStyle2)document.body.style;
			style.overflowX = Hidden;
			style.overflowY = Hidden;            
		}
 
		private WebBrowser CreateBrowser(Size size)
		{
			var newBrowser = new WebBrowser
			{
				ScrollBarsEnabled = false,
				ScriptErrorsSuppressed = true,
				Size = size
			};
 
			newBrowser.BringToFront();
 
			return newBrowser;
		}
 
		private Bitmap GetBitmapFromControl(WebBrowser browser, Size size)
		{
			var bitmap = new Bitmap(size.Width, size.Height);
 
			NativeMethods.GetImage(browser.Document.DomDocument, bitmap, Color.White);
			return bitmap;
		}
 
        void IDisposable.Dispose()
        {
            GC.SuppressFinalize(this);
        }
 

    }
}
 
The method can be called from code like this:
 
private void button1_Click(object sender, EventArgs e)
{
   HtmlDocument doc = webBrowser1.Document;
   Size pageSize = new Size(doc.Body.ScrollRectangle.Width + 25, doc.Body.ScrollRectangle.Height);
   new HtmlToBitmapConverter().Render(new Uri(webBrowser1.Url.AbsoluteUri), pageSize);
   HtmlToBitmapConverter.Output.Save(@"C:\image.png");
}

Answer: Re: Black Pics. (member Ravi Sant, 20 Jun '11 - 1:38)
Solution for BlackImages
// ♫ 99 little bugs in the code,
// 99 bugs in the code
// We fix a bug, compile it again
// 101 little bugs in the code ♫

General: Code doesn't recognize scroll bars [modified] (member dsathishkumar, 8 Mar '11 - 22:36)
//Get Browser Window Height
int heightsize = (int)myDoc.body.getAttribute("scrollHeight", 0);
int widthsize = (int)myDoc.body.getAttribute("scrollWidth", 0);

//Get Screen Height
int screenHeight = (int)myDoc.body.getAttribute("clientHeight", 0);
int screenWidth = (int)myDoc.body.getAttribute("clientWidth", 0);
 
In the above code, heightsize equals screenHeight and widthsize equals screenWidth.
I am using IE8. Because of this, the code doesn't recognize the scroll bars, and only a part of the web page is captured as an image.

modified on Wednesday, March 9, 2011 5:10 AM

General: Re: Code doesn't recognize scroll bars (member apex75, 14 Mar '11 - 23:08)
Maybe the page has Frames? Unfortunately the code in this article doesn't know how to handle them.
 
If you really need this, you will need to rewrite the code! But this won't be easy!
General: Re: Code doesn't recognize scroll bars (member charles henington, 31 May '11 - 14:57)
To get the window width and height, use:
 
int heightsize = myDoc.Body.ScrollRectangle.Height;
int widthsize = myDoc.Body.ScrollRectangle.Width +25;

General: My vote of 2 (member tinku5nov, 8 Jan '11 - 2:11)
Outdated.
Rant: Re: My vote of 2 (member charles henington, 31 May '11 - 15:03)
Can you not read that it was last updated in 2005? Of course it's outdated.
General: SHDocVw.ShellWindows shellWindows = new SHDocVw.ShellWindowsClass(); (member foreign, 26 Jan '11 - 0:50)
I use the code in a web page. It runs on my localhost, but on the server I get an error. My server is Windows Server 2003 with IIS 6.0.
I get the following error on this line:
SHDocVw.ShellWindows shellWindows = new SHDocVw.ShellWindowsClass();
 
Error:
Retrieving the COM class factory for component with CLSID {9BA05972-F6A8-11CF-A442-00A0C90A8F39} failed due to the following error: 80070002.
General: IE9 and WatiN (member Daaron, 13 Oct '10 - 6:36)
I'm trying to use the code in WatiN to capture IE9, but I only get black images. Any ideas?
Cheers,
Daaron

General: Trying to use C# with IE9 and MSHTML (member BobFr, 30 Sep '10 - 13:39)
IE9 seems to have broken at least some aspects of the MSHTML API. Is there any documentation or information available about the API?
Question: Blank Image in IE9 beta (member jyotijv, 15 Sep '10 - 23:03)
Hi all,
 
I have used this code in my application to capture the image of a web page. It works fine in IE6, IE7 and IE8, but in the IE9 beta I am getting a blank image. Could anybody kindly help me solve this problem?
 
Thank you,
Jyoti
General: My vote of 5 (member Uma Sankar Achary, 20 Jul '10 - 1:08)
Excellent, it's very useful. Thanks for posting. Uma S
General: Memory leak someplace. (member ChicagoBobT, 30 Jun '10 - 10:22)
I implemented the code in WPF, and after calling it via a timer to refresh the page, I can say that I am leaking memory. If you have any ideas on how to catch this leak, please let me know.
Thanks,


Last Updated 22 Jun 2005
Article Copyright 2005 by Douglas M. Weems
Everything else Copyright © CodeProject, 1999-2013