Capturing Web Type Content to a Single Image

NutSoft, 22 Nov 2006
An article on how to grab Web type content and capture an image to a single file

Sample screenshot (XML_Tool1.gif)

Introduction

This article demonstrates one method of capturing Web type content (in this example, XML) to a single image file. The problem I was originally trying to solve was how to show a directory structure in a document without simply taking a screen shot of Windows Explorer. I stumbled across a small sample application, which I recoded as a .NET console application that iterates through the directories recursively, lists the files in those directories, and writes the result to an XML file. Once I'd done that, I wanted to capture the view of the XML file (with some nodes expanded and others collapsed) as it appears in Internet Explorer. The problem I encountered was that the view was taller than my monitor, so my screen shot had to be composed manually from several pieces. I did try the DrawToBitmap method to capture the output, but this proved problematic and very often left me with completely blank files.

My chosen solution was to determine the screen coordinates of an embedded WebBrowser and then take a series of snapshots which I joined into a single file. This solution also caters to instances where the length of the Web content isn't an exact multiple of the size of the browser. To make things a little more interesting, I did the production of the XML as a BackgroundWorker task, and displayed progress in the status bar. The asynchronous task can be cancelled prior to completion.

Using the Code

As mentioned above, my example generates an XML file which is then displayed in an embedded WebBrowser control. If you want to display other Web content, then you should use a URL in the WebBrowser.Navigate method as below:

webBrowser1.Navigate(urlToDisplay);
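Navigate returns before the page has loaded, so the Document property is only safe to read once the WebBrowser raises DocumentCompleted. The following is a minimal sketch of that wait, not code from the demo application; the CaptureForm class and the "about:blank" address are placeholders chosen for illustration.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form hosting a single WebBrowser control
public class CaptureForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser();

    public CaptureForm(string urlToDisplay)
    {
        webBrowser1.Dock = DockStyle.Fill;
        Controls.Add(webBrowser1);

        // Subscribe before navigating so the event cannot be missed
        webBrowser1.DocumentCompleted += OnDocumentCompleted;
        webBrowser1.Navigate(urlToDisplay);
    }

    private void OnDocumentCompleted(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        // The event can fire once per frame; wait for the whole page
        if (webBrowser1.ReadyState != WebBrowserReadyState.Complete)
            return;

        // Body may be null for content that is not rendered as HTML
        if (webBrowser1.Document == null || webBrowser1.Document.Body == null)
            return;

        // From here on the scroll rectangle describes the loaded content
        Text = "Content size: " + webBrowser1.Document.Body.ScrollRectangle.Size;
    }

    [STAThread]
    private static void Main()
    {
        Application.Run(new CaptureForm("about:blank"));
    }
}
```

In the demo, the Save to Image button plays the same role informally: by the time the user clicks it, the page has normally finished loading.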

The topic of this example is saving an image, so I'll go through the functionality around that in a small section taken from the demo application. The first step is to declare a set of int variables to hold the coordinates. My WebBrowser is in a Panel in a Form, so the real screen X and Y location of the top left pixel is calculated by adding up all of their respective X and Y locations. Note: I added 5 to X and 30 to Y to allow for the form's frame - is there a programmatic way around this?

The next pair of int variables holds the size of the WebBrowser.Document (which may be greater than the size of the screen). These are followed by a pair of variables for the visible size, and finally a pair for offsets.

int realBrowserPositionX = 
	this.Location.X + panel1.Location.X + webBrowser1.Location.X + 5;
int realBrowserPositionY = 
	this.Location.Y + panel1.Location.Y + webBrowser1.Location.Y + 30;
int browserFullWidth = webBrowser1.Document.Body.ScrollRectangle.Width;
int browserFullHeight = webBrowser1.Document.Body.ScrollRectangle.Height;
int browserWindowWidth = webBrowser1.Width;
int browserWindowHeight = webBrowser1.Height;
int browserOffsetX = 0;
int browserOffsetY = 0;
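On the question of avoiding the hard-coded 5 and 30 pixel frame offsets: Windows Forms controls can translate their own client coordinates to screen coordinates, which takes the form's border and title bar into account. The sketch below is one possible approach and not what the demo does; BrowserLocator is a hypothetical helper name, and the result may still need adjusting if the WebBrowser draws its own border inside its client area.

```csharp
using System.Drawing;
using System.Windows.Forms;

// Hypothetical helper, not part of the demo application
public static class BrowserLocator
{
    // Returns the on-screen rectangle of a control's client area,
    // regardless of how deeply the control is nested in its form
    public static Rectangle GetScreenBounds(Control control)
    {
        return control.RectangleToScreen(control.ClientRectangle);
    }
}

// Usage inside the form, replacing the manual sums:
//   Rectangle bounds = BrowserLocator.GetScreenBounds(webBrowser1);
//   int realBrowserPositionX = bounds.X;
//   int realBrowserPositionY = bounds.Y;
```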

I then instantiate a GDI+ Bitmap the size of the full image and create a Graphics drawing surface from it. These will be used to build the full image.

Bitmap fullImage = new Bitmap(browserFullWidth, browserFullHeight);
Graphics fullImageGraphics = Graphics.FromImage(fullImage);

The WebBrowser is scrolled so that the top of the content is in view, and then I create some variables to hold the calculated size of the image segment (in case it doesn't fill the WebBrowser). I then instantiate another GDI+ Bitmap the size of the image segment and a Graphics drawing surface from the new Bitmap. I use the Graphics.CopyFromScreen method to grab the displayed segment, and then the Graphics.DrawImage method to add it to the full-size image. After that I scroll the WebBrowser by the height of the window.

I repeat this process in a loop until I have gathered all of the image.

webBrowser1.Document.Body.ScrollTop = 0;
do
{
	// Height of this strip: a full window, or whatever remains at the end
	int actualImageSegmentHeight = Math.Min(browserWindowHeight,
		browserFullHeight - browserOffsetY);
	// On the last, partial strip the browser cannot scroll a full window,
	// so the new content starts part-way down the visible area
	int actualBrowserWindowOffsetY = Math.Min(browserOffsetY,
		browserWindowHeight - actualImageSegmentHeight);
	// Dispose the per-strip GDI+ objects as soon as the strip is copied
	using (Bitmap sectionOfImage =
		new Bitmap(browserWindowWidth, actualImageSegmentHeight))
	using (Graphics sectionOfImageGraphics = Graphics.FromImage(sectionOfImage))
	{
		sectionOfImageGraphics.CopyFromScreen(realBrowserPositionX,
			realBrowserPositionY + actualBrowserWindowOffsetY,
			0, 0, new Size(browserWindowWidth, actualImageSegmentHeight),
			CopyPixelOperation.SourceCopy);
		fullImageGraphics.DrawImage(sectionOfImage, browserOffsetX, browserOffsetY,
			browserWindowWidth, actualImageSegmentHeight);
	}
	browserOffsetY += browserWindowHeight;
	webBrowser1.Document.Body.ScrollTop += browserWindowHeight;
}
while (browserOffsetY < browserFullHeight);
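The offset arithmetic in the loop above can be checked in isolation, without a browser on screen. The following is a minimal sketch rather than code from the demo: CaptureSegment and CapturePlanner are hypothetical names, and it assumes, as the loop does, that the browser stops scrolling when the bottom of the content reaches the bottom of the window.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: one screen grab in the stitching plan
public struct CaptureSegment
{
    public int DocumentY; // where the strip lands in the full image
    public int WindowY;   // where the strip starts inside the browser window
    public int Height;    // strip height in pixels
}

public static class CapturePlanner
{
    // Lists the strips needed to cover fullHeight pixels of content
    // using a window that shows windowHeight pixels at a time
    public static List<CaptureSegment> Plan(int fullHeight, int windowHeight)
    {
        if (windowHeight <= 0)
            throw new ArgumentOutOfRangeException("windowHeight");

        List<CaptureSegment> segments = new List<CaptureSegment>();
        for (int documentY = 0; documentY < fullHeight; documentY += windowHeight)
        {
            // A full window, or whatever is left at the end
            int height = Math.Min(windowHeight, fullHeight - documentY);
            // Zero for full strips; positive only for a final partial strip
            int windowY = Math.Min(documentY, windowHeight - height);

            CaptureSegment segment = new CaptureSegment();
            segment.DocumentY = documentY;
            segment.WindowY = windowY;
            segment.Height = height;
            segments.Add(segment);
        }
        return segments;
    }
}
```

For 1000 pixels of content in a 400 pixel window, this plans three strips; the last one is 200 pixels tall and is read from 200 pixels down the window, because the browser can only scroll to 600.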

I hope that the example is easy to follow. I had considered splitting it into a number of separate files, but thought that this single module was not excessively large.

Using the Demo Application

Click on the top ComboBox and select "Browse.." and then browse to a directory with some sub-directories in it. Then click on the Start button. The resulting XML file is displayed in the WebBrowser window, and you can interact with it in the usual manner (click on a "-" to close an XmlElement and on a "+" to open). Click on the Save to Image button to save the view in the WebBrowser to a file.

Points of Interest

Whilst tinkering with this application, I decided to have a go at using BackgroundWorker to run my task in the background. Starting and prematurely stopping the task is easy enough as shown in the example below:  

// Start the BackgroundWorker task
this.backgroundWorker1.RunWorkerAsync(); 
// Stop the BackgroundWorker task
this.backgroundWorker1.CancelAsync();

For your application to be able to stop on demand, the BackgroundWorker task needs to check whether it should stop, both as soon as it has started and at regular intervals throughout the task, as below (this should be done in each of your time-consuming functions):

// Check whether cancellation has been requested
if (worker.CancellationPending)
{
	// Set e.Cancel and then drop out without doing any further work
	e.Cancel = true;
}
else
{
	// Do some background work...
}

Reporting progress from the BackgroundWorker task is achieved by calling worker.ReportProgress(int percentProgress) at some suitable point.
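The pieces above (starting, cancelling, polling CancellationPending and reporting progress) fit together roughly as follows. This is a minimal console sketch under stated assumptions, not the demo's code: WorkerSketch, the step count and the Thread.Sleep stand-in for real work are all illustrative. Note that WorkerReportsProgress and WorkerSupportsCancellation must both be set to true, otherwise ReportProgress and CancelAsync throw an InvalidOperationException.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical wrapper used only to illustrate the pattern
public static class WorkerSketch
{
    // Runs a counting job of the given length; returns true if it was cancelled
    public static bool Run(int steps, bool cancelEarly)
    {
        ManualResetEvent done = new ManualResetEvent(false);
        bool cancelled = false;

        BackgroundWorker worker = new BackgroundWorker();
        worker.WorkerReportsProgress = true;      // required for ReportProgress
        worker.WorkerSupportsCancellation = true; // required for CancelAsync

        worker.DoWork += delegate(object sender, DoWorkEventArgs e)
        {
            BackgroundWorker self = (BackgroundWorker)sender;
            for (int i = 1; i <= steps; i++)
            {
                // Poll for cancellation before each unit of work
                if (self.CancellationPending)
                {
                    e.Cancel = true;
                    return;
                }
                Thread.Sleep(10); // stand-in for real work
                self.ReportProgress(i * 100 / steps);
            }
        };
        worker.ProgressChanged += delegate(object sender, ProgressChangedEventArgs e)
        {
            Console.WriteLine("Progress: " + e.ProgressPercentage + "%");
        };
        worker.RunWorkerCompleted += delegate(object sender, RunWorkerCompletedEventArgs e)
        {
            cancelled = e.Cancelled;
            done.Set();
        };

        worker.RunWorkerAsync();
        if (cancelEarly)
            worker.CancelAsync(); // only sets CancellationPending

        done.WaitOne(); // a console app must wait; a Form would not block here
        return cancelled;
    }
}
```

In a Windows Forms application, the ProgressChanged and RunWorkerCompleted handlers run on the UI thread, which is what makes it safe to update a status bar from them; in this console sketch they run on thread-pool threads instead.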

History

  • 22nd November, 2006: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

NutSoft
Web Developer
United Kingdom
15+ years' experience developing software and integration components for: SharePoint, Microsoft Office, SQL Server, Oracle, Meridio, Kofax, Trim Context, Convera Retrievalware, Rightfax Gateway, etc.
Using: C#, .NET Framework (1.0, 1.1, 2.0, 3.0), ASP.NET, VB6, C++/C, Web Services, SOAP, XML, SMTP, MFC, Shell scripting, VB scripting, IBM MQ series, etc.
Platforms: Windows 9x/2000/NT/XP/Vista, Unix (Solaris, HP-UX, Tru64), OpenVMS, VAX/VMS

Comments and Discussions

 
My vote of 5 (jazan, 22-Mar-12 9:52)

scroll bar hidden area are not saved. (vericon, 16-Oct-07 15:00)
My page is too long to be viewed in one window and has a vertical scroll bar.
I expected the image to include the whole page, but it includes only the visible part.

Re: scroll bar hidden area are not saved. (NutSoft, 16-Oct-07 21:56)

No demo to download... (mjesterak, 22-Nov-06 4:41)
Re: No demo to download... (NutSoft, 22-Nov-06 5:09)
Re: No demo to download... (mjesterak, 22-Nov-06 5:49)
Re: No demo to download... (NutSoft, 22-Nov-06 9:24)

Last Updated 22 Nov 2006
Article Copyright 2006 by NutSoft