Capturing Web Type Content to a Single Image

NutSoft

Nov 22, 2006

CPOL

An article on how to grab Web type content and capture an image to a single file

Download demo project - 12.37 KB

Sample screenshot

Introduction

This article demonstrates one method of capturing Web type content (XML, in this example) to a single image file. The problem I originally wanted to solve was showing a directory structure in a document without simply taking a screenshot of Windows Explorer. I came across a small sample application, which I recoded as a .NET console application, that iterates through directories recursively, lists the files in them, and writes the result to an XML file. I then wanted to capture the XML file as Internet Explorer displays it, with some nodes expanded and others collapsed. The view was taller than my monitor, however, so the screenshot had to be composed manually from several pieces. I did try the DrawToBitmap method to capture the output, but this proved problematic and very often left me with completely blank files.

My chosen solution was to determine the screen coordinates of an embedded WebBrowser and then take a series of snapshots, which I joined into a single file. This solution also handles the case where the length of the Web content isn't an exact multiple of the height of the browser. To make things a little more interesting, I generate the XML in a BackgroundWorker task and display its progress in the status bar. The asynchronous task can be cancelled prior to completion.

Using the Code

As mentioned above, my example generates an XML file which is then displayed in an embedded WebBrowser control. If you want to display other Web content, then you should use a URL in the WebBrowser.Navigate method as below:

webBrowser1.Navigate(urlToDisplay);
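
Navigate returns before the page has loaded, so values such as Document.Body.ScrollRectangle are only meaningful once the DocumentCompleted event has fired. The following is a minimal sketch of waiting for that event, assuming a bare form; the BrowserForm class, the OnDocumentCompleted handler and the about:blank URL are placeholders and not part of the demo project.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form that hosts a WebBrowser and waits for the page to load.
class BrowserForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser();

    public BrowserForm(string urlToDisplay)
    {
        webBrowser1.Dock = DockStyle.Fill;
        Controls.Add(webBrowser1);

        // Subscribe before navigating so the event cannot be missed.
        webBrowser1.DocumentCompleted += OnDocumentCompleted;
        webBrowser1.Navigate(urlToDisplay);
    }

    private void OnDocumentCompleted(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        // The document and its body are only reliable after loading finishes.
        if (webBrowser1.Document != null && webBrowser1.Document.Body != null)
        {
            int contentHeight = webBrowser1.Document.Body.ScrollRectangle.Height;
            Text = "Loaded " + e.Url + ", content height " + contentHeight;
        }
    }

    [STAThread]
    static void Main()
    {
        // about:blank is a placeholder; pass the URL you want to capture.
        Application.Run(new BrowserForm("about:blank"));
    }
}
```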

The topic of this example is saving an image, so I'll go through the functionality around that in a small section taken from the demo application. The first step is to declare a set of int variables to hold the coordinates. My WebBrowser is in a Panel in a Form, so the real screen X and Y location of its top-left pixel is calculated by adding up their respective X and Y locations. Note: I added 5 to X and 30 to Y to allow for the form's frame. Is there a programmatic way around this?

The next pair of int variables holds the size of the WebBrowser.Document (which may be greater than the size of the screen). Then come two variables for the visible size of the browser, and finally a pair of offsets.

int realBrowserPositionX = 
	this.Location.X + panel1.Location.X + webBrowser1.Location.X + 5;
int realBrowserPositionY = 
	this.Location.Y + panel1.Location.Y + webBrowser1.Location.Y + 30;
int browserFullWidth = webBrowser1.Document.Body.ScrollRectangle.Width;
int browserFullHeight = webBrowser1.Document.Body.ScrollRectangle.Height;
int browserWindowWidth = webBrowser1.Width;
int browserWindowHeight = webBrowser1.Height;
int browserOffsetX = 0;
int browserOffsetY = 0;
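
On the question of the hard-coded 5 and 30 offsets: Control.PointToScreen converts a point in a control's client coordinates to screen coordinates, so it accounts for the form's border and title bar whatever their size. The following is a minimal sketch, not the demo's own code; the ScreenCoordinates class and the GetBrowserScreenOrigin helper are hypothetical names.

```csharp
using System.Drawing;
using System.Windows.Forms;

static class ScreenCoordinates
{
    // Hypothetical helper: returns the screen position of the top-left
    // pixel of a control's client area, regardless of how deeply the
    // control is nested inside panels and forms.
    public static Point GetBrowserScreenOrigin(Control browser)
    {
        return browser.PointToScreen(Point.Empty);
    }
}
```

Assuming the same webBrowser1 control as above, the X and Y of ScreenCoordinates.GetBrowserScreenOrigin(webBrowser1) could then stand in for realBrowserPositionX and realBrowserPositionY.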

I then instantiate a GDI+ Bitmap the size of the full image, and create a Graphics drawing surface from that Bitmap. These will be used to build the full image.

Bitmap fullImage = new Bitmap(browserFullWidth, browserFullHeight);
Graphics fullImageGraphics = Graphics.FromImage(fullImage);

The WebBrowser is scrolled so that the top of the image is in view, and then I create some variables to hold the calculated size of the image segment (in case it doesn't fill the WebBrowser). I then instantiate another GDI+ Bitmap the size of the image segment, and create a Graphics drawing surface from the new Bitmap. I use the Graphics.CopyFromScreen method to grab the displayed segment of the image, and then the Graphics.DrawImage method to add it to the full-size image. After that I scroll the WebBrowser by the height of the window.

I repeat this process in a loop until I have gathered all of the image.

// Scroll to the top so that the first segment is in view
webBrowser1.Document.Body.ScrollTop = 0;
do
{
	// Height of this segment: a full window, or whatever content remains
	int actualImageSegmentHeight = Math.Min(browserWindowHeight,
		browserFullHeight - browserOffsetY);
	// Row within the browser window at which this segment starts
	int actualBrowserWindowOffsetY = Math.Min(browserOffsetY,
		browserWindowHeight - actualImageSegmentHeight);
	// Dispose of the temporary GDI+ objects at the end of each pass
	using (Bitmap sectionOfImage =
		new Bitmap(browserWindowWidth, actualImageSegmentHeight))
	using (Graphics sectionOfImageGraphics = Graphics.FromImage(sectionOfImage))
	{
		// Grab the visible segment from the screen
		sectionOfImageGraphics.CopyFromScreen(realBrowserPositionX,
			realBrowserPositionY + actualBrowserWindowOffsetY,
			0, 0, new Size(browserWindowWidth, actualImageSegmentHeight),
			CopyPixelOperation.SourceCopy);
		// Paste it into the full image at the current offset
		fullImageGraphics.DrawImage(sectionOfImage, browserOffsetX, browserOffsetY,
			browserWindowWidth, actualImageSegmentHeight);
	}
	// Move on by one window height
	browserOffsetY += browserWindowHeight;
	webBrowser1.Document.Body.ScrollTop += browserWindowHeight;
}
while (browserOffsetY < browserFullHeight);
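
The two Math.Min calls are easier to follow with concrete numbers. The console sketch below repeats only the arithmetic from the loop, with no browser involved; the heights of 1000 and 400 pixels are assumed values chosen for illustration, and the SegmentPlan class is a hypothetical name.

```csharp
using System;

static class SegmentPlan
{
    static void Main()
    {
        int fullHeight = 1000;   // assumed height of the whole document
        int windowHeight = 400;  // assumed visible height of the browser

        for (int offsetY = 0; offsetY < fullHeight; offsetY += windowHeight)
        {
            // Same calculations as in the capture loop
            int segmentHeight = Math.Min(windowHeight, fullHeight - offsetY);
            int windowOffsetY = Math.Min(offsetY, windowHeight - segmentHeight);

            Console.WriteLine("paste at y={0}, height={1}, copy from window row {2}",
                offsetY, segmentHeight, windowOffsetY);
        }
    }
}

// Output:
// paste at y=0, height=400, copy from window row 0
// paste at y=400, height=400, copy from window row 0
// paste at y=800, height=200, copy from window row 200
```

On the last pass only 200 pixels remain. The copy starts at row 200 of the window, presumably because the browser cannot scroll past the end of the document, so the remaining content ends up in the lower part of the window.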

I hope that the example is easy to follow. I had considered splitting it into a number of separate files, but thought that this single module was not excessively large.

Using the Demo Application

Click on the top ComboBox and select "Browse.." and then browse to a directory with some sub-directories in it. Then click on the Start button. The resulting XML file is displayed in the WebBrowser window, and you can interact with it in the usual manner (click on a "-" to close an XmlElement and on a "+" to open). Click on the Save to Image button to save the view in the WebBrowser to a file.
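
The listing in the previous section stops once the full image has been assembled and does not show the file being written. The following is a minimal sketch of that last step, assuming PNG output; the ImageOutput class and the SaveAsPng helper are hypothetical names, and the demo may save in a different format.

```csharp
using System.Drawing;
using System.Drawing.Imaging;

static class ImageOutput
{
    // Hypothetical helper: writes the assembled bitmap to disk as a PNG.
    // The demo's own save routine is not shown in the article.
    public static void SaveAsPng(Bitmap image, string path)
    {
        image.Save(path, ImageFormat.Png);
    }
}
```

It would be called with the fullImage bitmap and a path chosen by the user, for example one returned by a SaveFileDialog.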

Points of Interest

Whilst tinkering with this application, I decided to have a go at using BackgroundWorker to run my task in the background. Starting and prematurely stopping the task is easy enough, as shown in the example below (CancelAsync requires the worker's WorkerSupportsCancellation property to be true):

// Start the BackgroundWorker task
this.backgroundWorker1.RunWorkerAsync(); 
// Stop the BackgroundWorker task
this.backgroundWorker1.CancelAsync();

For the task to stop on demand, it needs to check whether cancellation has been requested as soon as it starts, and at regular intervals throughout the task, as below (this should be done in each of your time-consuming functions):

// Inside the DoWork handler: worker is the BackgroundWorker, e the DoWorkEventArgs
// Check whether cancellation has been requested
if (worker.CancellationPending)
{
	// Set e.Cancel and then drop out without doing any further work
	e.Cancel = true;
}
else
{
	// Do some background work...
}

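Reporting progress from the BackgroundWorker task is achieved by calling worker.ReportProgress(int percentProgress) at some suitable point. The worker's WorkerReportsProgress property must be true, and the value is delivered to the ProgressChanged event handler.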
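
The pieces above can be put together in one place. The console sketch below is an illustration and not the demo's code: the ten-step loop, the Thread.Sleep calls standing in for real work, and the WorkerSketch class are all assumptions.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

static class WorkerSketch
{
    static void Main()
    {
        BackgroundWorker worker = new BackgroundWorker();
        worker.WorkerSupportsCancellation = true; // needed for CancelAsync
        worker.WorkerReportsProgress = true;      // needed for ReportProgress

        worker.DoWork += delegate(object sender, DoWorkEventArgs e)
        {
            BackgroundWorker w = (BackgroundWorker)sender;
            for (int step = 1; step <= 10; step++)
            {
                // Check for cancellation before each unit of work
                if (w.CancellationPending)
                {
                    e.Cancel = true;
                    return;
                }
                Thread.Sleep(100);            // stand-in for real work
                w.ReportProgress(step * 10);  // percentage complete
            }
        };

        // In a Form these handlers run on the UI thread; in a console
        // program they run on thread-pool threads, so ordering may vary.
        worker.ProgressChanged += delegate(object sender, ProgressChangedEventArgs e)
        {
            Console.WriteLine("Progress: {0}%", e.ProgressPercentage);
        };

        ManualResetEvent done = new ManualResetEvent(false);
        worker.RunWorkerCompleted += delegate(object sender, RunWorkerCompletedEventArgs e)
        {
            Console.WriteLine(e.Cancelled ? "Cancelled" : "Completed");
            done.Set();
        };

        worker.RunWorkerAsync();  // start the task
        Thread.Sleep(350);        // let a few steps run
        worker.CancelAsync();     // request cancellation
        done.WaitOne();           // wait for RunWorkerCompleted
    }
}
```

In a Windows Forms application, the ProgressChanged handler is where a status bar would be updated, since that handler is raised on the thread that created the worker.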

History

22nd November, 2006: Initial post