
Webpage thumbnailer

By dooskoobi, 17 Aug 2006
 

Introduction

Once upon a time, I had the idea of attaching a thumbnail of the webpage to each URL on my list: a sort of URL with a "face" :). I began researching how to implement it and found no solution in the open source domain. Everything I found was a commercial product, either a component or a standalone program. I'm just a poor programmer and couldn't spend much on such a component. I was sure there had to be a way (or even several) to achieve this goal without touching my child's pocket money. What I present in this article is what I found.

Using the code

Before you get into the code, you can build it and play around with it. The URL should be in the form http://www.yoursite.com, for example. I didn't write any URL validation code, so you have to be careful about it. The project is very simple, and there are only a couple of points I have to mention. To get the thumbnail image of the webpage, I use the WebBrowser component that comes with Visual Studio 2005 and is part of the .NET Framework 2.0. I placed it on a BrowserForm and set the size of the form to approximately 600 by 800 pixels to get enough visual data. The BrowserForm is then initialized but never actually shown. That is the trick.

private void TestForm_Load(object sender, EventArgs e) {
    // The form hosting the WebBrowser is created here but never shown.
    browserForm = new BrowserForm();
}

What I have to do then is to take a snapshot of the WebBrowser after the page is loaded. That's all!

  // Render the loaded page into an off-screen bitmap (600 x 800 pixels).
  Bitmap docImage = new Bitmap(600, 800);
  webBrowser1.DrawToBitmap(docImage, new Rectangle(webBrowser1.Location.X, 
          webBrowser1.Location.Y, webBrowser1.Width, webBrowser1.Height));

The page takes some time to load, so I've split getting the image into three steps:

  1. I call the method getImageFromUrl(string url) on the BrowserForm, which starts downloading the page from the given URL.
  2. The WebBrowser event DocumentCompleted is handled by the procedure webBrowser1_DocumentCompleted, which sets the image of the current DocPic object.
  3. The setter of the current DocPic object triggers the refreshPicture event, which updates the displayed picture. Some resizing is done in place.
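
The three steps above can be condensed into one small console program. This is only a rough sketch and not the code from the download: the class ThumbnailSketch and the method Capture are made-up names, and it uses a bare WebBrowser control with its own message loop instead of a hidden BrowserForm. Keep in mind that the MSDN documentation marks WebBrowser.DrawToBitmap as not supported, so the result may be blank for some pages.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;

// Hypothetical helper, not part of the article's download.
static class ThumbnailSketch
{
    // Loads a page in an off-screen WebBrowser and returns a snapshot of it.
    public static Bitmap Capture(string url, int width, int height)
    {
        Bitmap result = null;
        using (WebBrowser browser = new WebBrowser())
        {
            browser.ScrollBarsEnabled = false;
            browser.ScriptErrorsSuppressed = true;
            browser.Size = new Size(width, height);

            browser.DocumentCompleted +=
                delegate(object sender, WebBrowserDocumentCompletedEventArgs e)
                {
                    // Frames raise this event too; wait for the whole document.
                    if (browser.ReadyState != WebBrowserReadyState.Complete) return;

                    result = new Bitmap(width, height);
                    browser.DrawToBitmap(result, new Rectangle(0, 0, width, height));
                    Application.ExitThread(); // ends the message loop below
                };

            browser.Navigate(url);  // step 1: start downloading
            Application.Run();      // pump messages until ExitThread is called
        }
        return result;
    }

    [STAThread] // the WebBrowser control requires a single-threaded apartment
    static void Main()
    {
        using (Bitmap bmp = Capture("http://www.yoursite.com", 600, 800))
        {
            bmp.Save("thumbnail.png", ImageFormat.Png);
        }
    }
}
```

The sketch has no timeout, so it waits for as long as the page takes to reach the Complete state.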

All the code provided is for demonstration purposes only, so don't go looking for design issues in it. You'll certainly find some if you try.

History

It's nice to know that on your website I can write history :). Thanks to the admin!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

dooskoobi
Web Developer
Belgium


Comments and Discussions

 
Question: How to fix blank image? (member Matt34229 Dec '06 - 14:16)
Hi! How can I fix the blank image problem? Does anyone have a simple solution?
 
Matt
Question: ASP.Net environment (member Franck Quintana, 26 Oct '06 - 7:24)
Hi,
 
First of all, thank you for this great piece of code ;)
My question is:
Is it possible to make this code work on an ASP.NET website?
I would like to display thumbnails for previewing dynamic themes.
 
Thank you.
Answer: Re: ASP.Net environment (member dooskoobi, 26 Oct '06 - 10:51)
Thanks, but this piece of code is really not so great.
It's just an idea.
I didn't test it in an ASP environment, but I think nothing forbids you from using a hidden Windows form in some other class inside a web application.
The generated thumbnail can then be sent to the client browser.
The flow would be: request > object that encapsulates a hidden Windows form with the browser and the snapshot code > response. Some concurrency problems can arise if there are too many requests, though. This object could be shared, and a queue for incoming URLs could be added. Something like that.
I would like to see if it works, but I have little time to test it myself...
Best regards
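
The shared object with a queue for incoming URLs could look roughly like the sketch below. Everything here is an assumption made for illustration: ThumbnailQueue is a made-up class, and the render callback stands in for whatever code loads the page and takes the snapshot. The only point it makes is that all requests are served one at a time on a single STA thread, which is what the WebBrowser control needs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: serializes thumbnail requests onto one STA thread.
class ThumbnailQueue
{
    private readonly Queue<string> pending = new Queue<string>();
    private readonly Action<string> render;

    public ThumbnailQueue(Action<string> render)
    {
        this.render = render;
        Thread worker = new Thread(Run);
        worker.SetApartmentState(ApartmentState.STA); // must be set before Start
        worker.IsBackground = true;                   // do not keep the process alive
        worker.Start();
    }

    // Called from request threads; returns immediately.
    public void Enqueue(string url)
    {
        lock (pending)
        {
            pending.Enqueue(url);
            Monitor.Pulse(pending); // wake the worker if it is waiting
        }
    }

    private void Run()
    {
        while (true)
        {
            string url;
            lock (pending)
            {
                while (pending.Count == 0) Monitor.Wait(pending);
                url = pending.Dequeue();
            }
            render(url); // one page at a time, in arrival order
        }
    }
}
```

A caller would create one shared instance, for example new ThumbnailQueue(RenderPage) where RenderPage is its own snapshot method, and call Enqueue(url) for each request.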
General: Re: ASP.Net environment (member Franck Quintana, 27 Oct '06 - 15:27)
I tried to encapsulate it inside an ASP.NET page, but I can't find a way to handle the webBrowser1_DocumentCompleted event.
If I call Navigate, the event is not raised.
Rgds.
General: Re: ASP.Net environment (member gedw99, 27 May '07 - 6:36)
You should probably look at running this out of process, as a Windows Service or a console app.
 
Partly because a long-running process like this will be easier to control, and you won't have the problems you are having now with thread context.
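
One way to picture the out-of-process suggestion is a web application that starts a separate console program for each thumbnail and waits for it with a timeout. The sketch below rests on assumptions: the executable path and its two-argument command line (URL, output file) are invented for illustration, and the class ThumbnailProcess does not exist in the article's download.

```csharp
using System;
using System.Diagnostics;

// Hypothetical sketch: runs a separate thumbnailer executable per request.
static class ThumbnailProcess
{
    // Returns true when the helper process exits with code 0 within the timeout.
    public static bool Run(string exePath, string url, string outputPath, int timeoutMs)
    {
        // Refuse input that would break the simple quoting used below.
        if (url.IndexOf('"') >= 0 || outputPath.IndexOf('"') >= 0)
            throw new ArgumentException("Arguments must not contain double quotes.");

        ProcessStartInfo psi = new ProcessStartInfo(
            exePath, "\"" + url + "\" \"" + outputPath + "\"");
        psi.UseShellExecute = false; // start the executable directly
        psi.CreateNoWindow = true;   // no console window on the server

        using (Process p = Process.Start(psi))
        {
            if (!p.WaitForExit(timeoutMs))
            {
                p.Kill();            // give up on pages that never finish loading
                return false;
            }
            return p.ExitCode == 0;
        }
    }
}
```

Keeping the browser in its own process means a hung page only costs one helper process, which can be killed without touching the web application.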
General: Re: ASP.Net environment (member Franck Quintana, 27 May '07 - 23:24)
Ok, in fact that's a good idea.
Thank you for your support.
Regards,
 
Franck Quintana
General: Re: ASP.Net environment (member DmitryKirsanov, 12 Oct '11 - 10:24)
Guys, the idea of loading a web browser (MSIE) from an aspx page on your WEB SERVER with anything your visitor puts into the address line is a trash horror movie from a security point of view. There are better ways. My solution would be a command-line utility that only runs on a virtual machine. If some website infects the operating system, so be it, but that will be the virtual machine, not my web server.
General: Fixed blank image [modified] (member Randall Stephens, 16 Oct '06 - 6:13)
I don't know why it works, but I found a way to fix the blank image issue. It probably isn't the cleanest code, but it's available at the following link:
 
http://www.surpluscode.com/2006/10/16/update-web-page-thumbnail-maker/
 
Basically, I have a form with the web browser control in it. Before I take a screen shot, I activate that form, re-activate the main form, and then perform the screen shot work. I also have another form and another web browser control that take "blank" screen shots. If the screen shot from an actual web site matches the blank screen shot (bad), the code retries up to 5 times to get a real screen shot. Usually it only takes one or two tries to get it.
 
I would give exact details on where to make the changes, but as you can see from the link above, I have made some significant changes to the UI and code structure.
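
The retry idea can be sketched without the second "blank" browser. Note that this is a simplified stand-in and not the code behind the link above: instead of comparing against a reference blank screen shot, it treats an image as blank when all sampled pixels have the same color. BlankCheck, LooksBlank, CaptureFunc and CaptureWithRetry are made-up names.

```csharp
using System;
using System.Drawing;

// Hypothetical sketch of "retry while the screen shot is blank".
static class BlankCheck
{
    // Stand-in for whatever method actually takes the screen shot.
    public delegate Bitmap CaptureFunc();

    // True when every sampled pixel has the same color as the top-left pixel.
    public static bool LooksBlank(Bitmap bmp, int step)
    {
        int first = bmp.GetPixel(0, 0).ToArgb();
        for (int y = 0; y < bmp.Height; y += step)
            for (int x = 0; x < bmp.Width; x += step)
                if (bmp.GetPixel(x, y).ToArgb() != first)
                    return false;
        return true;
    }

    // Calls capture up to maxTries times and returns the first non-blank
    // image, or the last attempt if all of them looked blank.
    public static Bitmap CaptureWithRetry(CaptureFunc capture, int maxTries)
    {
        Bitmap last = null;
        for (int i = 0; i < maxTries; i++)
        {
            if (last != null) last.Dispose(); // drop the previous blank attempt
            last = capture();
            if (!LooksBlank(last, 8)) break;
        }
        return last;
    }
}
```

A page that really is one solid color would be misjudged as blank by this check, which is one reason comparing against a reference blank screen shot may be the better choice.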
General: Re: Fixed blank image [modified] (member NutSoft, 16 Nov '06 - 1:37)
Has anybody got any suggestions as to how I can overcome my problem?
 
I've tried comparing images for blank bitmaps, but my problem is that I'm displaying a web page that is bigger than the visible area of the webBrowser control. The resulting image is black in the area that would normally not be visible. Prior to drawing the bitmap, I tried resizing the form, but to no avail. If I scroll to the bottom of the web page before capturing the image, I get just a blank white image (with black in the areas described above).
 

-- modified at 4:56 Thursday 23rd November, 2006
 
I have found a way to overcome my problems. My article at http://www.codeproject.com/useritems/XML_Tool.asp provides code and a demo to show how I achieved this.
General: Blank Image (member sides_dale, 8 Sep '06 - 16:32)
It seems the blank image problem revolves around pages that have iframes embedded in the HTML. For example, if you run the program against http://www.microsoft.com, you will get a perfect picture.
 
The only iframes in that page are created with JavaScript, in lines 126 thru 128:
<script type="text/javascript">var AdHtml='<iframe frameborder="0" scrolling="no" marginheight="0px" marginwidth="0px" allowtransparency="true" style="background:#F1F1F1" width="120" height="240" src="http://rad.microsoft.com/ADSAdClient31.dll?GetAd=&PG=CMSIE4&SC=F3&AP=1164"><'+'/iframe>';document.writeln(AdHtml);</script>

 
Now look at http://msdn.microsoft.com. You get a blank image and the difference is that there are embedded iframes in the html, that are not created with javascript. Look at lines 717 thru 719

<iframe frameborder="0" scrolling="no" marginheight="0px" marginwidth="0px" width="120" height="240" id="rad_CMSAD2F3" src="http://rad.microsoft.com/ADSAdClient31.dll?GetAd=&PG=CMSAD2&SC=F3&AP=1164"></iframe>

 
and lines 1090 thru 1092
 
<iframe frameborder="0" scrolling="no" marginheight="0px" marginwidth="0px" width="120" height="90" id="rad_CMSAD1F2" src="http://rad.microsoft.com/ADSAdClient31.dll?GetAd=&PG=CMSAD1&SC=F2&AP=1027"></iframe>

 
I'm not sure why this would cause a blank image. When you step through the code and reach the event handler webBrowser1_DocumentCompleted, you can evaluate ?webBrowser1.DocumentText and retrieve the text of the page, which means the complete page has been retrieved and should have been rendered. So your guess is as good as mine as to why it is not rendered to the bitmap. :confused:
General: Re: Blank Image (member dooskoobi, 9 Sep '06 - 12:23)
The funny thing is that if you try to get the page time after time, you eventually get it :confused: . At least that is true for me.
The DocumentCompleted event is fired several times for each target URL, and
e.Url (of the WebBrowserDocumentCompletedEventArgs object) shows that the URLs you mentioned (http://rad.microsoft.com/ADSAdClient31.dll etc., but not only those) are among them. I've created a method getBaseUrl to get the base Uri for a given URL, something like:
// Requires: using System; using System.Net;
Uri baseUri = null;
// Issues a HEAD request and returns the URI that actually answered,
// i.e. the address after any redirects.
private Uri getBaseUrl(Uri request_url) {
    WebRequest wrq = WebRequest.Create(request_url);
    wrq.Method = "HEAD";
    // Dispose the response so the connection is released.
    using (WebResponse wrsp = wrq.GetResponse()) {
        return wrsp.ResponseUri;
    }
}
Then I filter out all DocumentCompleted events for a given URL except the one for the base URL.
Something like: if (e.Url == getBaseUrl(request uri)) WebBrowser.DrawToBitmap.
 

But I still get the page from msdn.microsoft.com very irregularly.
There is a problem, but I think it is not really related to iframes but rather to Microsoft security policies for those URLs. Maybe I'm wrong. I'm still looking for answers.
Best regards
 


General: Re: Blank Image (member Noah Nadeau, 4 Apr '08 - 5:51)
The problem doesn't seem to be solely associated with this project. I'm using a PDF Reader to generate a bitmap of a page, and it appears that calling Control.DrawToBitmap() twice works in some scenarios, provided that the PDF isn't too graphically intensive.
General: Re: re "I can write history:" (member BillWoodruff, 26 Aug '06 - 0:59)
My apologies! No insult to you intended.
 
It's nice to know that on this website you can erase history :)
 
best, Bill
 
"The greater the social and cultural distances between people, the more magical the light that can spring from their contact." Milan Kundera in Testaments Trahis

General: Nice find (member Simone Busoli, 23 Aug '06 - 6:58)
Thanks for the article, this is a very interesting find. I suggest you include the code proposed some posts below into the download, the visual rendering is much better.
 
Simone Busoli

General: Re: Nice find (member dooskoobi, 24 Aug '06 - 8:49)
One day I will. Together with other enhancements. Not yet.
Cheers.
General: Nice, but I've got a problem (member Dario Solera, 18 Aug '06 - 7:47)
Your demo application works just fine.
I tried to implement it myself, but I get an empty bitmap. Moreover, on MSDN the method WebBrowser.DrawToBitmap is marked as not supported.
Could you help me?
 
Thanks.
 
_____________________________________________
Tozzi is right: Gaia is getting rid of us.
Personal Blog [ITA] - Tech Blog [ENG]
Developing ScrewTurn Wiki 1.0 RC...

General: Re: Nice, but I've got a problem (member dooskoobi, 18 Aug '06 - 9:17)
On http://msdn2.microsoft.com/en-us/library/system.windows.forms.control.drawtobitmap.aspx it is written: 'Control.DrawToBitmap Method.
Note: This method is new in the .NET Framework version 2.0.'
So it is supported by .NET Framework version 2.
 
As for the empty bitmap, I had it with www.msn.com, and for now I still have no idea why I get it, but I'm looking for an answer. So it's to be continued :)
 

General: Re: Nice, but I've got a problem (member Dario Solera, 18 Aug '06 - 10:19)
Thanks for your reply.
Control.DrawToBitmap should work fine for other controls, but WebBrowser.DrawToBitmap, inherited from WebBrowserBase.DrawToBitmap, is marked as not supported, as you can see on MSDN. Still, your demo app works.
 
Anyway, I tried the same website with your demo app and with my code: in the first case it works, in the second one it doesn't, so I guess it's my fault.
I'll investigate...
 
:)
 

General: Very good! (member Dan Letecky, 18 Aug '06 - 5:29)
I've seen webpage thumbnailers built with either Gtk# + Gecko or IE, but none that work without drawing the browser window on screen.
 
You get my 5!
 
--
My sites for smart .NET developers:
DayPilot - Open-source Outlook-like calendar control for ASP.NET
DotLucene - The fastest open source fulltext search engine for .NET
Seekafile Server - Flexible open-source search server
DotNetFirebird - Using Firebird SQL in .NET

General: getSizedImage() replacement, with better rendering (member axelriet, 17 Aug '06 - 13:10)
PicDoc.cs, line 37:
 
public Image getSizedImage(Image im)
{
   Bitmap bm = new Bitmap(picSize.getBaseSize.Width, picSize.getBaseSize.Height);
   using (Graphics g = Graphics.FromImage(bm)) {
      g.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic;
      g.DrawImage(im, 0, 0, bm.Width, bm.Height);
   }
   return bm;
}

 
Cheers,
Axel
General: Re: getSizedImage() replacement, with better rendering [modified] (member dooskoobi, 18 Aug '06 - 7:43)
Good idea!
The difference is visible.
 
 

Last Updated 17 Aug 2006
Article Copyright 2006 by dooskoobi
Everything else Copyright © CodeProject, 1999-2013