Introduction

My application allows limited editing of HTML pages using MSHTML. Each HTML page is based on a template file and the range of things the end user can do to that template file is limited. At no time is the user able to create an empty HTML page.

So obviously there has to be a mechanism in my application to allow the user to select which template a new page should be based on. I wanted to present the user with a list of thumbnail images, each representing a template page. In order to do that I had to devise a way of taking an HTML page and converting it to an image. The alternative of presenting the user with a simple listbox with the names of the templates is a tad too early 90s. This article is the result.

A false start

Fortunately for me my application sets a specific size limit on page size. The entire page must fit into an 800 by 600 frame without scrollbars.
My initial approach was to render the page using MSHTML, create a memory bitmap, get a handle to the
MSHTML display window and copy the window's contents into the bitmap.

It worked well but for one minor detail. In order to render an HTML page into an image file this way, the page has to be on the screen. If you want to create images of something that's already on the screen, well and good.
Otherwise, to create an image, you have to present that something on the screen. This makes
for an awful lot of flashing as one renders HTML pages to the screen for just long enough to grab their
bits.

Even so, I was almost happy with the result. The flashing didn't look too awful. I even ran it past a few people, showing them what it looked like as it updated images and they didn't seem to mind it too much. But it irked me. There had to be a better way.

A second approach

Some digging around in MSDN revealed the IHTMLElementRender interface. Sounds hopeful. It
has a member function called DrawToDC() that sounds like a perfect fit. Which it is indeed.
Once you obtain an IHTMLElementRender interface you can supply your own device context and
get MSHTML to render the element to it. And once you've done that it's trivial to scale and save to a
file.
As you've probably guessed, it wasn't quite as simple as that.

I'm going to present the class a little differently this time. We'll start with a simple version of the class (not present in the download) and add complexity to it as we encounter issues. The simple version of CCreateHTMLImage looks like this.
class CCreateHTMLImage
{
public:
    enum eOutputImageFormat
    {
        eBMP = 0,
        eJPG,
        eGIF,
        eTIFF,
        ePNG,
        eImgSize
    };

    CCreateHTMLImage();
    virtual ~CCreateHTMLImage();

    BOOL SetSaveImageFormat(eOutputImageFormat format);
    BOOL CreateImage(
        IHTMLDocument2 *pDoc,
        LPCTSTR szDestFilename,
        CSize srcSize,
        CSize outputSize);

protected:
    int GetEncoderClsid(const WCHAR* format, CLSID* pClsid);

private:
    static LPCTSTR m_ImageFormats[eImgSize];
    CLSID m_encoderClsid;
};
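The m_ImageFormats table and the GetEncoderClsid() signature suggest a lookup from the enum to a format string. The listing doesn't show the table's contents, so the following is only a minimal sketch, assuming the table holds one MIME type per enum value (the form in which GDI+ encoders are usually identified). The names kImageFormats and MimeTypeFor are hypothetical and not part of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mirrors the article's eOutputImageFormat enum.
enum eOutputImageFormat { eBMP = 0, eJPG, eGIF, eTIFF, ePNG, eImgSize };

// ASSUMPTION: a table like m_ImageFormats holds one MIME type per enum value.
static const wchar_t *const kImageFormats[eImgSize] = {
    L"image/bmp", L"image/jpeg", L"image/gif", L"image/tiff", L"image/png"
};

// Hypothetical helper: returns the MIME type for a format value,
// or nullptr if the value is outside the enum's range.
const wchar_t *MimeTypeFor(int format)
{
    if (format < 0 || format >= eImgSize)
        return nullptr;

    return kImageFormats[format];
}
```

Under that assumption, a function like SetSaveImageFormat() would pass the looked-up string to GetEncoderClsid() and report failure for an out-of-range value.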
This version of the class creates an image from an existing HTML document. The constructor initialises the saved image
format as a jpeg file (you can override this by calling SetSaveImageFormat() passing one of the
eOutputImageFormat constants). The guts of the work is done in the CreateImage() member
function which looks like this.
BOOL CCreateHTMLImage::CreateImage(
    IHTMLDocument2 *pDoc,
    LPCTSTR szDestFilename,
    CSize srcSize,
    CSize outputSize)
{
    USES_CONVERSION;
    ASSERT(szDestFilename);
    ASSERT(AfxIsValidString(szDestFilename));
    ASSERT(pDoc);

    // Get our interfaces before we create anything else
    IHTMLElement       *pElement = (IHTMLElement *) NULL;
    IHTMLElementRender *pRender = (IHTMLElementRender *) NULL;

    // Let's be paranoid...
    if (pDoc == (IHTMLDocument2 *) NULL)
        return FALSE;

    pDoc->get_body(&pElement);

    if (pElement == (IHTMLElement *) NULL)
        return FALSE;

    pElement->QueryInterface(IID_IHTMLElementRender, (void **) &pRender);
    pElement->Release();    // finished with the body element

    if (pRender == (IHTMLElementRender *) NULL)
        return FALSE;

    CFileSpec fsDest(szDestFilename);
    CBitmapDC destDC(srcSize.cx, srcSize.cy);

    // Ask MSHTML to paint the element into our own device context
    pRender->DrawToDC(destDC);
    pRender->Release();

    CBitmap *pBM = destDC.Close();

    // Wrap the bitmap for GDI+, scale it and save it in the selected format
    Bitmap *gdiBMP = Bitmap::FromHBITMAP(HBITMAP(pBM->GetSafeHandle()), NULL);
    Image  *gdiThumb = gdiBMP->GetThumbnailImage(outputSize.cx, outputSize.cy);

    gdiThumb->Save(T2W(fsDest.GetFullSpec()), &m_encoderClsid);
    delete gdiBMP;
    delete gdiThumb;
    delete pBM;
    return TRUE;
}
This takes a pointer to an IHTMLDocument2 interface, an output filename and a couple of CSize objects.
You'd have obtained the IHTMLDocument2 interface from a loaded HTML document in an instance of MSHTML somewhere in
your program. For example, if you wanted to create an image of the document in an app that used CHtmlView you'd
obtain the interface by calling GetHTMLDocument() on that view.
We do my usual bunch of ASSERT checks on the input parameters and then get the interfaces we need: the document's body element and, from that, its IHTMLElementRender interface.

If we got this far without encountering an error it's time to create the device context we want to paint the document into.
For this I used Anneke Sicherer-Roetman's excellent CBitmapDC class. Once MSHTML has rendered our document into that device context we close it to get the bitmap, hand the bitmap to GDI+, scale it to the output size and save it to the destination file.

The other members of this class take care of the details of the saved image format.

Hang on a moment!

Surely this class presents exactly the same problems as the false start approach discussed above? It can only create an image from an existing HTML document already on the screen. That's right. But this is the simple version of the class.

If we want to create an image from a document stored somewhere else (hard disk or intranet or internet) we have to do a
little more work. We have to load the document using MSHTML, get an IHTMLDocument2 interface from it and then create the image as before.

The full version of CCreateHTMLImage, which is included in the download, looks like this.
class CCreateHTMLImage : public CWnd
{
protected:
    DECLARE_DYNCREATE(CCreateHTMLImage)
    DECLARE_EVENTSINK_MAP()

    enum eEnums
    {
        CHILDBROWSER = 100,
    };

public:
    enum eOutputImageFormat
    {
        eBMP = 0,
        eJPG,
        eGIF,
        eTIFF,
        ePNG,
        eImgSize
    };

    CCreateHTMLImage();
    virtual ~CCreateHTMLImage();

    BOOL Create(CWnd *pParent);
    BOOL SetSaveImageFormat(eOutputImageFormat format);
    BOOL CreateImage(
        IHTMLDocument2 *pDoc,
        LPCTSTR szDestFilename,
        CSize srcSize,
        CSize outputSize);
    BOOL CreateImage(
        LPCTSTR szSrcFilename,
        LPCTSTR szDestFilename,
        CSize srcSize,
        CSize outputSize);

protected:
CComPtr
A few changes should jump out at you. The first is that the full version of the class is derived from CWnd whereas
the simple version wasn't. This indicates that at least some of the changes I made to allow the conversion of an HTML document
to an image somehow involve the creation of a window. You don't yet know the half of it!
All functions that were present in the simple version of the class are unchanged in the full version. You'll see that I
added another CreateImage() overload, one that takes a source filename instead of a document pointer.

This new function is the reason I added all the new stuff to the full version of the class, so let's start with it and work outwards.

Loading an external document

Initially I started out trying to use the IHTMLDocument2 interface directly. Something like this.
IHTMLDocument2 *pDoc = (IHTMLDocument2 *) NULL;

if (CoCreateInstance(
        CLSID_HTMLDocument,
        NULL,
        CLSCTX_INPROC_SERVER,
        IID_IHTMLDocument2,
        (void**) &pDoc) == S_OK)
{
    if (pDoc != (IHTMLDocument2 *) NULL)
    {
        // Do stuff
    }
}
This works and we get a document interface we can work with. There's one small problem. There's no way to load a document
directly. We can call IHTMLDocument2::write() to render a string containing HTML but that means we have to load
our document contents into a string. That'll work just fine with local files but what if you want to image a website on the
net? All I want is to create images - not write a full blown http: protocol handler.
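For the local-file half of that idea, loading the document contents into a string is straightforward. A minimal sketch in portable C++, using a hypothetical helper named LoadFileToString (not part of the class); handing the result to IHTMLDocument2::write() is a separate step not shown here.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: read a whole local file into a string.
// Returns false if the file can't be opened.
bool LoadFileToString(const std::string &path, std::string &out)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);

    if (!in)
        return false;

    // Copy the file's stream buffer into a string stream in one go
    std::ostringstream buffer;
    buffer << in.rdbuf();
    out = buffer.str();
    return true;
}
```

This only covers files the program can open directly. A page on a website would still need something to fetch it, which is exactly the work the article wants to avoid.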
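Ok, scratch that approach. What about using an IWebBrowser2 interface and letting the Web Browser do the navigation? So I coded it up and tested. It didn't work.

Repeating the test on a dummy application based on CHtmlView did work.

Hmm, so what are we doing differently? Well the first and most obvious difference is that we're instantiating an instance
of the browser object directly, whereas CHtmlView creates it as a child control of a window.

It was time to investigate how CHtmlView does it. CHtmlView is an MFC class. Fortunately we have the source code to MFC. That means we can go look at a working example of something and figure out what we're doing wrong or not doing at all.

The first thing we find in the MFC source is CHtmlView::Create().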
BOOL CHtmlView::Create(LPCTSTR lpszClassName, LPCTSTR lpszWindowName,
                       DWORD dwStyle, const RECT& rect, CWnd* pParentWnd,
                       UINT nID, CCreateContext* pContext)
{
    // create the view window itself
    m_pCreateContext = pContext;
    if (!CView::Create(lpszClassName, lpszWindowName,
                       dwStyle, rect, pParentWnd, nID, pContext))
    {
        return FALSE;
    }

    // assure that control containment is on
    AfxEnableControlContainer();

    RECT rectClient;
    GetClientRect(&rectClient);

    // create the control window
    // AFX_IDW_PANE_FIRST is a safe but arbitrary ID
    if (!m_wndBrowser.CreateControl(CLSID_WebBrowser, lpszWindowName,
                                    WS_VISIBLE | WS_CHILD, rectClient, this, AFX_IDW_PANE_FIRST))
    {
        DestroyWindow();
        return FALSE;
    }

    // cache the dispinterface
    LPUNKNOWN lpUnk = m_wndBrowser.GetControlUnknown();
    HRESULT hr = lpUnk->QueryInterface(IID_IWebBrowser2, (void**) &m_pBrowserApp);
    if (!SUCCEEDED(hr))
    {
        m_pBrowserApp = NULL;
        m_wndBrowser.DestroyWindow();
        DestroyWindow();
        return FALSE;
    }

    return TRUE;
}
The view window creates itself and then creates a child control as an ActiveX object using the CLSID_WebBrowser
identifier. If that succeeds it queries the child for the Web Browser's IUnknown interface and uses that interface
to get an IWebBrowser2 interface which it caches away for later use.
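Ok, things are starting to fall into place. Instead of blindly creating an instance of the browser object on its own, we need to create it as a child control of a window, just as CHtmlView does.

First lesson from CHtmlView

Let's duplicate what CHtmlView does. Our creation sequence (if we want to create images for pages we haven't already got loaded in some instance of MSHTML somewhere) is to create our own window, create the Web Browser control as a child of it and cache its IWebBrowser2 interface. After that we can call
Navigate2() on the Web Browser child window and expect the document to load. Which
it does. Let's have a look at the function.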
BOOL CCreateHTMLImage::CreateImage(
    LPCTSTR szSrcFilename,
    LPCTSTR szDestFilename,
    CSize srcSize,
    CSize outputSize)
{
    ASSERT(GetSafeHwnd());
    ASSERT(IsWindow(GetSafeHwnd()));
    ASSERT(szSrcFilename);
    ASSERT(AfxIsValidString(szSrcFilename));
    ASSERT(szDestFilename);
    ASSERT(AfxIsValidString(szDestFilename));

    CRect rect(CPoint(0, 0), srcSize);

    // The WebBrowser window size must be set to our srcSize
    // else it won't render everything
    MoveWindow(&rect);
    m_pBrowserWnd.MoveWindow(&rect);

    COleVariant vUrl(szSrcFilename, VT_BSTR),
                vFlags(long(navNoHistory |
                            navNoReadFromCache |
                            navNoWriteToCache), VT_I4),
                vNull(LPCTSTR(NULL), VT_BSTR);
    COleSafeArray vPostData;

    if (m_pBrowser->Navigate2(&vUrl, &vFlags, &vNull, &vPostData, &vNull) == S_OK)
        // We have to pump messages to ensure the event handler (DocumentComplete)
        // is called.
        RunModalLoop();
    else
        return FALSE;

    // We only get here when DocumentComplete has been called, which calls
    // EndModalLoop and causes RunModalLoop to exit.
    IDispatch *pDisp = (IDispatch *) NULL;
    HRESULT hr = m_pBrowser->get_Document(&pDisp);

    if (FAILED(hr) || pDisp == (IDispatch *) NULL)
        return FALSE;

    // Ask for the document interface rather than casting the IDispatch pointer
    IHTMLDocument2 *pDoc = (IHTMLDocument2 *) NULL;

    hr = pDisp->QueryInterface(IID_IHTMLDocument2, (void **) &pDoc);
    pDisp->Release();

    if (FAILED(hr) || pDoc == (IHTMLDocument2 *) NULL)
        return FALSE;

    BOOL bResult = CreateImage(pDoc, szDestFilename, srcSize, outputSize);

    pDoc->Release();
    return bResult;
}
If we get through the gauntlet of my usual ASSERT checks on the input parameters we create a rectangle
with the dimensions implied by the srcSize parameter and set the Web Browser to those dimensions. If we don't
set the Web Browser size correctly we won't get an image that accurately reflects the contents of the HTML document.
Then we set up a bunch of COleVariant objects with our source document name, some flags and call the
Navigate2() method. If that method succeeds we fall into a call to the CWnd::RunModalLoop() function.
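The start-then-wait pattern described above can be illustrated without MFC. The following is a toy simulation only: ToyPump and NavigateAndWait are hypothetical names, the queue of lambdas stands in for the Windows message queue, and unlike a real message loop this one returns when the queue runs dry instead of blocking.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Toy message pump, a stand-in for CWnd::RunModalLoop() / EndModalLoop().
class ToyPump
{
public:
    void Post(std::function<void()> msg) { m_queue.push_back(std::move(msg)); }
    void EndLoop() { m_running = false; }
    std::size_t Pending() const { return m_queue.size(); }

    // Dispatch queued messages until a handler calls EndLoop()
    // or there is nothing left. Returns the number dispatched.
    int RunLoop()
    {
        int dispatched = 0;

        m_running = true;
        while (m_running && !m_queue.empty())
        {
            std::function<void()> msg = std::move(m_queue.front());

            m_queue.pop_front();
            msg();
            ++dispatched;
        }

        m_running = false;
        return dispatched;
    }

private:
    std::deque<std::function<void()>> m_queue;
    bool m_running = false;
};

// Simulated "navigate, then wait": returns true once the
// completion handler has run and ended the loop.
bool NavigateAndWait(ToyPump &pump)
{
    bool documentComplete = false;

    // Stand-ins for the messages a real navigation would generate
    pump.Post([] { /* download progress, ignored */ });
    pump.Post([&] { documentComplete = true; pump.EndLoop(); });  // "DocumentComplete"
    pump.Post([] { /* later traffic the caller doesn't wait for */ });

    pump.RunLoop();   // returns as soon as the handler has called EndLoop()
    return documentComplete;
}
```

In the real class the blocking call is CWnd::RunModalLoop() and it's the DocumentComplete event handler that calls EndModalLoop(), so the code after the loop only runs once the document has finished loading.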
This is very important. My first stabs at this solution used a combination of [...]

So we start the navigation and then drop into a modal loop, which pumps messages until the DocumentComplete event handler calls EndModalLoop().

So why do we have an embedded Browser Window instead of being a Web Browser ourselves?

If you've made it this far you might be wondering why our class mimics CHtmlView and embeds a Web Browser child window instead of being the Web Browser itself.

It wasn't obvious to me as I wrote the code that this would be necessary. I wrote the class so that it, itself, became an instance of the Web Browser and was able to create images of HTML documents without those documents ever flashing up on the screen. It all looked good.

Trouble in paradise

But on closer examination of the images I suddenly realised there were a couple of artifacts that shouldn't have been there. Scrollbars! If you've used [...]

It turns out that the reason [...]

I'll concede that you probably do have control over the parent object and could implement a [...]

Without going into an extended discussion of how [...]

Unfortunately we can't use [...]

Phew! That's a lot of work just to get rid of some scrollbars on a tiny thumbnail image. But remember that the class can be used to capture full sized images simply by specifying the output image size. On a full sized image the scrollbars are probably undesirable.

Using the class

Is almost trivial. Include the header file, declare an instance of CCreateHTMLImage and use it. Remember that there
are two ways to use it. The first is when you want to capture an image of an existing page already rendered somewhere in your
application.
CCreateHTMLImage cht;
cht.CreateImage(m_pDoc, csOutputFile, CSize(800, 600), CSize(80, 60));
which assumes that m_pDoc is a pointer to an IHTMLDocument2 interface. This example captures the image
at 800 by 600 but saves a thumbnail of 80 by 60.
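The two sizes in that example share the same 4:3 shape. If the output size had a different shape, and assuming the scaling step simply stretches to whatever size it is given, the thumbnail would come out distorted. One way around that is to work out a fitted size first. A minimal sketch, in which the Size struct is a stand-in for CSize and FitWithin is a hypothetical helper (neither is part of the class):

```cpp
#include <cassert>

// Hypothetical stand-in for MFC's CSize so the sketch compiles anywhere.
struct Size { int cx; int cy; };

// Hypothetical helper: the largest size with src's aspect ratio that
// fits inside box. Integer division rounds down.
Size FitWithin(Size src, Size box)
{
    if (src.cx <= 0 || src.cy <= 0 || box.cx <= 0 || box.cy <= 0)
        return Size{0, 0};

    // Compare box.cx / src.cx with box.cy / src.cy by cross-multiplying,
    // which avoids floating point.
    long long byWidth  = 1LL * box.cx * src.cy;
    long long byHeight = 1LL * box.cy * src.cx;

    if (byWidth <= byHeight)
    {
        // Width is the limiting dimension
        return Size{box.cx, static_cast<int>(1LL * src.cy * box.cx / src.cx)};
    }

    // Height is the limiting dimension
    return Size{static_cast<int>(1LL * src.cx * box.cy / src.cy), box.cy};
}
```

With an 800 by 600 source and an 80 by 60 box this returns 80 by 60 unchanged, while a 100 by 100 box gives 100 by 75. The result could then be passed as the last argument to CreateImage().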
The second way to use the class is when all you have is a filename or URL to the page you want to capture.
CCreateHTMLImage cht;
cht.Create(this);
cht.CreateImage(csSourceFile, csOutputFile, CSize(800, 600), CSize(80, 60));
which does the same except that it takes care of loading the source file (or URL) and then captures the image output to a
file. The Create() function needs a pointer to a CWnd derived object which must be a top-level window.
I use my CMainFrame window.
Oh, don't forget to initialise GDI+ - you can find out how in this excellent article.

Dependencies

The class uses some other code found on CodeProject.

History

4 April 2004 - Initial Version.
4 April 2004 - Updated the download to include the required header files.
9 April 2004 - Added a demo project written by Jubjub.
19 May 2004 - Updated the demo project.